DPLL and WalkSAT

The SAT Problem

Input: Propositional sentence in CNF. Output: A satisfying assignment (model) if one exists, or UNSATISFIABLE.

Connection to inference: KB ⊨ α iff KB ∧ ¬α is UNSAT. Any complete SAT algorithm gives complete inference.


DPLL Algorithm

DPLL is a complete, backtracking SAT algorithm with three key heuristics over naive enumeration.

function DPLL-SATISFIABLE?(s) returns true or false
    clauses ← CNF-clauses(s)
    symbols ← propositional symbols in s
    return DPLL(clauses, symbols, {})

function DPLL(clauses, symbols, model) returns true or false
    -- Early termination
    if every clause in clauses is true in model then return true
    if some clause in clauses is false in model then return false

    -- Unit propagation
    P, value ← FIND-UNIT-CLAUSE(clauses, model)
    if P is non-null then
        return DPLL(clauses, symbols - P, EXTEND(model, P, value))

    -- Pure symbol heuristic
    P, value ← FIND-PURE-SYMBOL(symbols, clauses, model)
    if P is non-null then
        return DPLL(clauses, symbols - P, EXTEND(model, P, value))

    -- Branching
    P ← FIRST(symbols); rest ← REST(symbols)
    return DPLL(clauses, rest, EXTEND(model, P, true))
        OR DPLL(clauses, rest, EXTEND(model, P, false))

Three Key Optimizations

1. Early Termination

2. Unit Propagation

A unit clause has exactly one unassigned literal (all others false in current model). That literal must be true — no branching needed.

Example: (¬A ∨ ¬B ∨ C) where A=T, B=T → unit clause (C) → C must be true.

Unit propagation cascades: setting one variable may create new unit clauses, which set more variables. This is the single most powerful heuristic in DPLL.

3. Pure Symbol Heuristic

A symbol P is pure if it appears with only one polarity in all unresolved clauses. - Pure positive (only P, never ¬P): set P=true - Pure negative (only ¬P, never P): set P=false

Setting a pure symbol greedily satisfies all its clauses and harms none.

DPLL Properties

Property Value
Sound Yes
Complete Yes
Terminates Yes
Complexity Exponential worst case; much better in practice

Modern Extensions (CDCL)

Modern SAT solvers (MiniSAT, Glucose, zChaff) extend DPLL with: - Conflict-driven clause learning (CDCL): when a conflict occurs, analyze the cause, add a learned clause that prevents the same conflict in all future branches - Non-chronological backtracking: jump to the actual cause of conflict, not just one level up - VSIDS heuristic: prefer variables involved in recent conflicts for branching

CDCL solvers handle industrial instances with millions of variables.


WalkSAT Algorithm

WalkSAT is an incomplete, randomized local search algorithm. Fast for satisfiable instances.

function WALKSAT(clauses, p, max_flips) returns model or failure
    model ← random truth-value assignment to all symbols
    for i = 1 to max_flips do
        if model satisfies all clauses then return model
        clause ← a randomly chosen UNSATISFIED clause
        with probability p:
            flip a randomly chosen symbol in clause       -- random walk
        else:
            flip the symbol in clause that maximizes      -- greedy
            the number of satisfied clauses
    return failure

Why the Mixed Strategy?

WalkSAT Properties

Property Value
Sound Yes — only returns a valid model
Complete No — cannot prove UNSAT
Best for Large satisfiable instances; random 3-SAT near phase transition
Worst case Unsatisfiable instances (exhausts max_flips)

DPLL vs. WalkSAT

Property DPLL/CDCL WalkSAT
Complete? Yes No
Handles UNSAT? Yes No
Search type Systematic backtracking Random local search
Best for UNSAT proofs, structured instances Large satisfiable instances
Modern descendant MiniSAT, Glucose SATenstein, WalkSAT variants

Rule of thumb: DPLL/CDCL for UNSAT proofs and formal verification. WalkSAT for large satisfiable problems where you just need a solution quickly.


Connection to Inference

To check KB ⊨ α: 1. Convert KB ∧ ¬α to CNF. 2. Run DPLL on the clause set. - DPLL returns UNSAT → KB ⊨ α (proved by refutation) - DPLL returns a model → KB ⊭ α (the model is a counterexample)

WalkSAT cannot prove entailment — it cannot prove UNSAT.


Phase Transition (3-SAT)

For random 3-SAT (each clause has exactly 3 literals), instances transition from easy-satisfiable to easy-unsatisfiable as the clause-to-variable ratio increases. The hardest instances cluster near the critical ratio ≈ 4.27. Both DPLL and WalkSAT slow dramatically near this transition.