Simulated Annealing
Motivation
Hill climbing never moves to worse states → gets stuck at local maxima. Simulated annealing escapes local maxima by occasionally accepting downhill moves, with acceptance probability decreasing over time.
Physical analogy: Annealing in metallurgy — heat metal (high energy = random), then cool slowly (energy decreases toward ground state). High temperature = explore; low temperature = exploit.
Algorithm
function SIMULATED-ANNEALING(problem, schedule) returns a state
    current ← INITIAL-STATE(problem)
    for t = 1, 2, ... do
        T ← schedule(t)        -- temperature at time t
        if T = 0 then return current
        next ← a randomly selected successor of current
        ΔE ← VALUE(next) − VALUE(current)
        if ΔE > 0 then
            current ← next     -- always accept improvement
        else
            current ← next with probability e^(ΔE/T)
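A minimal Python sketch of the pseudocode above, for a maximization problem. The function names (`value`, `random_successor`, `schedule`) are hypothetical hooks the caller supplies, not part of any library:

```python
import math
import random

def simulated_annealing(initial, value, random_successor, schedule,
                        max_steps=100_000):
    """Maximize value(state), following the pseudocode above.

    value(state)            -> float, higher is better
    random_successor(state) -> a random neighboring state
    schedule(t)             -> temperature at step t; <= 0 terminates
    """
    current = initial
    for t in range(1, max_steps + 1):
        T = schedule(t)
        if T <= 0:
            return current
        nxt = random_successor(current)
        delta_e = value(nxt) - value(current)
        # Always accept an improvement; accept a downhill move with
        # probability e^(ΔE/T). The short-circuit `or` ensures exp()
        # is only evaluated for ΔE <= 0, where it cannot overflow.
        if delta_e > 0 or random.random() < math.exp(delta_e / T):
            current = nxt
    return current
```

For example, maximizing −(x − 3)² over the integers with ±1 neighbors and geometric cooling converges to x = 3 once T is small and the search turns greedy.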
Key Mechanics
Temperature Schedule
schedule(t) maps time step → temperature T:
- T starts high (lots of random moves accepted)
- T decreases (fewer bad moves accepted)
- T → 0: pure hill climbing

Common schedules: logarithmic cooling T(t) = T₀ / log(1 + t), or geometric cooling T(t) = αᵗ · T₀ where α ≈ 0.95.
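A quick numeric comparison of the two schedules (T₀ = 10 and α = 0.95 are illustrative constants, not prescribed values):

```python
import math

T0 = 10.0
ALPHA = 0.95

def log_schedule(t):
    # Logarithmic cooling: T(t) = T0 / log(1 + t) — decreases very slowly
    return T0 / math.log(1 + t)

def geometric_schedule(t):
    # Geometric cooling: T(t) = alpha^t * T0 — decreases quickly
    return ALPHA ** t * T0

for t in (1, 10, 100, 1000):
    print(t, round(log_schedule(t), 3), round(geometric_schedule(t), 3))
```

By t = 1000 the logarithmic schedule is still above 1 while the geometric one is effectively zero, which previews why the convergence guarantee below requires the slow schedule.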
Acceptance Probability
For a downhill move with ΔE < 0:
P(accept) = e^(ΔE/T)
- Large |ΔE| (steep downhill) → lower acceptance probability
- High T → acceptance probability near 1 (nearly random walk)
- Low T → acceptance probability near 0 (nearly greedy)
At fixed T, this acceptance rule yields a stationary distribution over states that is exactly the Boltzmann distribution from statistical mechanics.
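The three bullets above, sketched numerically (the ΔE and T values are chosen purely for illustration):

```python
import math

def p_accept(delta_e, T):
    # Metropolis acceptance rule: uphill moves (ΔE > 0) always accepted,
    # downhill moves accepted with probability e^(ΔE/T)
    return 1.0 if delta_e > 0 else math.exp(delta_e / T)

for T in (100.0, 1.0, 0.01):
    # the same downhill move ΔE = -1 at high, medium, and low temperature
    print(f"T={T:6}: P(accept) = {p_accept(-1.0, T):.3g}")
```

At T = 100 the move is accepted ~99% of the time (near random walk); at T = 0.01 the probability is astronomically small (near greedy).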
Convergence Guarantee
Theorem: If the temperature is decreased slowly enough (specifically, if T decreases no faster than T₀ / log(1 + t)), simulated annealing finds the global optimum with probability approaching 1.
Practical caveat: “slowly enough” often means astronomically long schedules. In practice, use faster cooling and accept near-optimal solutions.
Properties
| Property | Value |
|---|---|
| Complete | Yes (with slow enough schedule) |
| Optimal | Yes (with slow enough schedule) |
| Practical | Use fast schedule → near-optimal quickly |
| Space | O(1) |
Connection to DPLL / WalkSAT (Ch.7)
WalkSAT’s mixed greedy+random strategy is essentially simulated annealing with a fixed (not decreasing) temperature — the random walk component prevents local optima, the greedy component provides direction.
Applications
- VLSI chip design
- Airline crew scheduling
- Protein folding
- Neural hyperparameter optimization
- DynamICCL: fine-tuning NCCL parameters where the objective (throughput) landscape is non-convex and hardware interactions create unexpected local optima