Planning Algorithms and Heuristics
State-Space Search for Planning
The most direct approach: treat planning as state-space search (Ch.3) but exploit the explicit action representation.
Forward search (progression): - Start from initial state I - Apply actions whose preconditions are met - Search until goal is satisfied
Backward search (regression): - Start from goal G - Find actions whose effects include goal literals - Work backward to initial state
Regression
For a goal g and action a with effects that include literals in g:
Regress(g, a) = (g - ADD(a)) ∪ PRECOND(a)
Remove what a achieves, add a’s preconditions. This gives the subgoal that must hold before a.
Planning Heuristics (Ignore-Delete-List)
The key to practical planning: admissible heuristics derived from a relaxed problem.
Relaxed Problem (h+)
Ignore all delete effects — actions only add things, never remove them.
The relaxed plan length (h+) is an admissible heuristic: the relaxed problem is easier (monotone), so the relaxed plan cost ≤ real plan cost.
Computing h+ exactly is NP-hard, so approximations are used:
FF Heuristic (FastForward)
- Build a relaxed planning graph (ignoring deletes)
- Extract a relaxed plan using greedy backward extraction
- Return the number of actions in the relaxed plan as h
FastForward planner uses h_FF for greedy best-first search → very effective in practice.
HSP Heuristic
Propagate costs forward through the relaxed planning graph: -
h(p) = estimated cost to achieve proposition p -
h(a) = max or sum of h-values of preconditions
Max: h_max — ignores interaction
between preconditions; admissible Sum:
h_add — ignores interaction; inadmissible but effective
Planning Graphs (GraphPlan)
A planning graph is a directed graph alternating between: - Proposition layers P₀, P₁, P₂, … - Action layers A₀, A₁, A₂, …
Where Pₙ = all propositions reachable in n steps (relaxed — ignoring deletes).
Mutex Relations
Actions or propositions at the same level can be mutex (cannot both hold in any plan at this level): - Inconsistent effects: one action negates the other’s effect - Interference: one action deletes a precondition of the other - Competing needs: preconditions are mutex - Inconsistent support: all achievers of two propositions are mutex
GraphPlan Algorithm
- Expand planning graph level by level
- Search backward from goal for a conflict-free plan
- If plan found → done; if no-good → expand further
- If graph levels off (stabilizes) → problem is unsolvable
GraphPlan’s graph level = a lower bound on plan length → admissible heuristic.
SATPLAN
Encode planning as SAT (Ch.7) and use DPLL/CDCL:
For each time step t = 1..T: - ActionVar(a, t): boolean
— action a taken at time t - FluentVar(f, t): boolean —
fluent f holds at time t - Frame axioms: if f changes from t to t+1,
some action caused it - Precondition axioms: if a taken at t,
preconditions hold at t - Effect axioms: if a taken at t, effects hold
at t+1
Incrementally increase T until SAT finds a plan.
Used by: SATPLAN (Kautz & Selman, 1996) — competitive with state-of-the-art planners for optimal planning.
Partial-Order Planning (POP)
Instead of total-order plan, build a partial order — only specify ordering constraints that are required.
Advantages: - More flexible — separates what must be ordered from what can be interleaved - Natural for parallel execution
Causal link (aᵢ → p → aⱼ): action aᵢ
achieves precondition p for action aⱼ.
Threat: action aₖ deletes p, threatening the causal link. Resolved by: - Demotion: put aₖ before aᵢ - Promotion: put aₖ after aⱼ
POP was the dominant planning paradigm in the 1980s-90s; largely replaced by heuristic forward search for satisficing planning.
Hierarchical Task Networks (HTN)
Real-world planning often has hierarchical structure: - High-level task: “Travel to Paris” - Decomposes into: “Book flight” → “Go to airport” → “Board plane” → …
HTN planning: actions can be primitive (executable) or methods (decomposed into subtasks).
Method(Travel(a, b)):
IF Near(a, b) THEN [Drive(a, b)]
IF Far(a, b) THEN [Drive(a, Airport), Fly(Airport, b_Airport), Drive(b_Airport, b)]
HTN planners are very effective for real-world domains (logistics, robotics, game AI) where human knowledge encodes good decompositions.
Algorithm Comparison
| Algorithm | Optimal | Completeness | Best for |
|---|---|---|---|
| Forward search + h_FF | Satisficing | Complete | General domains |
| GraphPlan | Parallel-optimal | Complete | Structured domains |
| SATPLAN | Optimal | Complete | Finding shortest plan |
| HTN planning | Depends | Complete (if methods cover domain) | Hierarchical domains |
| POP | Satisficing | Complete | Parallel execution |