Planning Algorithms and Heuristics

State-Space Search for Planning

The most direct approach: treat planning as state-space search (Ch.3) but exploit the explicit action representation.

Forward search (progression): - Start from initial state I - Apply actions whose preconditions are met - Search until goal is satisfied

Backward search (regression): - Start from goal G - Find actions whose effects include goal literals - Work backward to initial state

Regression

For a goal g and action a with effects that include literals in g:

Regress(g, a) = (g - ADD(a)) ∪ PRECOND(a)

Remove what a achieves, add a’s preconditions. This gives the subgoal that must hold before a.

Planning Heuristics (Ignore-Delete-List)

The key to practical planning: admissible heuristics derived from a relaxed problem.

Relaxed Problem (h+)

Ignore all delete effects — actions only add things, never remove them.

The relaxed plan length (h+) is an admissible heuristic: the relaxed problem is easier (monotone), so the relaxed plan cost ≤ real plan cost.

Computing h+ exactly is NP-hard, so approximations are used:

FF Heuristic (FastForward)

Build a relaxed planning graph (ignoring deletes)
Extract a relaxed plan using greedy backward extraction
Return the number of actions in the relaxed plan as h

FastForward planner uses h_FF for greedy best-first search → very effective in practice.

HSP Heuristic

Propagate costs forward through the relaxed planning graph: - h(p) = estimated cost to achieve proposition p - h(a) = max or sum of h-values of preconditions

Max: h_max — ignores interaction between preconditions; admissible Sum: h_add — ignores interaction; inadmissible but effective

Planning Graphs (GraphPlan)

A planning graph is a directed graph alternating between: - Proposition layers P₀, P₁, P₂, … - Action layers A₀, A₁, A₂, …

Where Pₙ = all propositions reachable in n steps (relaxed — ignoring deletes).

Mutex Relations

Actions or propositions at the same level can be mutex (cannot both hold in any plan at this level): - Inconsistent effects: one action negates the other’s effect - Interference: one action deletes a precondition of the other - Competing needs: preconditions are mutex - Inconsistent support: all achievers of two propositions are mutex

GraphPlan Algorithm

Expand planning graph level by level
Search backward from goal for a conflict-free plan
If plan found → done; if no-good → expand further
If graph levels off (stabilizes) → problem is unsolvable

GraphPlan’s graph level = a lower bound on plan length → admissible heuristic.

SATPLAN

Encode planning as SAT (Ch.7) and use DPLL/CDCL:

For each time step t = 1..T: - ActionVar(a, t): boolean — action a taken at time t - FluentVar(f, t): boolean — fluent f holds at time t - Frame axioms: if f changes from t to t+1, some action caused it - Precondition axioms: if a taken at t, preconditions hold at t - Effect axioms: if a taken at t, effects hold at t+1

Incrementally increase T until SAT finds a plan.

Used by: SATPLAN (Kautz & Selman, 1996) — competitive with state-of-the-art planners for optimal planning.

Partial-Order Planning (POP)

Instead of total-order plan, build a partial order — only specify ordering constraints that are required.

Advantages: - More flexible — separates what must be ordered from what can be interleaved - Natural for parallel execution

Causal link (aᵢ → p → aⱼ): action aᵢ achieves precondition p for action aⱼ.

Threat: action aₖ deletes p, threatening the causal link. Resolved by: - Demotion: put aₖ before aᵢ - Promotion: put aₖ after aⱼ

POP was the dominant planning paradigm in the 1980s-90s; largely replaced by heuristic forward search for satisficing planning.

Hierarchical Task Networks (HTN)

Real-world planning often has hierarchical structure: - High-level task: “Travel to Paris” - Decomposes into: “Book flight” → “Go to airport” → “Board plane” → …

HTN planning: actions can be primitive (executable) or methods (decomposed into subtasks).

Method(Travel(a, b)):
  IF Near(a, b) THEN [Drive(a, b)]
  IF Far(a, b)  THEN [Drive(a, Airport), Fly(Airport, b_Airport), Drive(b_Airport, b)]

HTN planners are very effective for real-world domains (logistics, robotics, game AI) where human knowledge encodes good decompositions.

Algorithm Comparison

Algorithm	Optimal	Completeness	Best for
Forward search + h_FF	Satisficing	Complete	General domains
GraphPlan	Parallel-optimal	Complete	Structured domains
SATPLAN	Optimal	Complete	Finding shortest plan
HTN planning	Depends	Complete (if methods cover domain)	Hierarchical domains
POP	Satisficing	Complete	Parallel execution