3. Search Framework and Data Structures
Source: AIMA 4th Ed., Chapter 3 (Section 3.3) Status: Complete
Overview
All search algorithms share a common framework: maintain a frontier of nodes to explore, expand the most promising node according to some evaluation function, and track which states have been reached to avoid redundant work. Understanding this framework makes every specific algorithm a special case.
The BEST-FIRST-SEARCH Framework
All algorithms in this chapter are instances of best-first search,
differing only in the evaluation function f(n).
function BEST-FIRST-SEARCH(problem, f) returns solution node or failure
node <- NODE(STATE=problem.INITIAL)
frontier <- priority queue ordered by f, with node as element
reached <- lookup table, key=problem.INITIAL, value=node
while not IS-EMPTY(frontier) do
node <- POP(frontier)
if problem.IS-GOAL(node.STATE) then return node
for each child in EXPAND(problem, node) do
s <- child.STATE
if s not in reached or child.PATH-COST < reached[s].PATH-COST then
reached[s] <- child
add child to frontier
return failure
The EXPAND function generates all children:
function EXPAND(problem, node) yields nodes
s <- node.STATE
for each action in problem.ACTIONS(s) do
s' <- problem.RESULT(s, action)
cost <- node.PATH-COST + problem.ACTION-COST(s, action, s')
yield NODE(STATE=s', PARENT=node, ACTION=action, PATH-COST=cost)
Node Data Structure
node.STATE -- world state this node represents
node.PARENT -- parent node (null for root)
node.ACTION -- action taken from parent to reach this node
node.PATH-COST -- g(n): cumulative cost from root to this node
Depth is not stored explicitly but can be computed:
depth(node) = depth(node.PARENT) + 1.
The path from root to a node is reconstructed by following PARENT pointers backward:
function PATH-ACTIONS(node) returns list of actions
if node.PARENT is null then return []
return PATH-ACTIONS(node.PARENT) + [node.ACTION]
The Frontier
The frontier (also called “open list”) holds nodes that have been generated but not yet expanded. Its data structure determines which node is popped next:
| Frontier Type | Pop Order | Used By |
|---|---|---|
| Priority queue (min-heap) | Minimum f(n) | Best-first, UCS, A* |
| FIFO queue | Oldest first (FIFO) | BFS |
| LIFO queue (stack) | Most recently added | DFS |
The Reached Table
The reached table (also called “closed list” or “explored set”) maps each visited state to the best node found for it so far. Its purpose:
- Avoid re-expansion: if we reach a state we have already fully explored, skip it.
- Detect better paths: if we find a cheaper path to an already-reached state, update and re-add to frontier.
reached <- hash table: state -> node
The condition for adding to frontier:
if s not in reached OR child.PATH-COST < reached[s].PATH-COST:
reached[s] <- child
add child to frontier
Redundant Paths and Cycles
Redundant path: two different paths reaching the same state. The more expensive one can be discarded.
Cycle: a path that visits a node already on its own path. Cycles make tree-like search loop forever.
Three strategies for handling redundant paths:
Remember all reached states (graph search): uses O(b^d) memory but prevents all redundant expansion. Optimal for most settings.
Check only for cycles (partial): check if the new state appears anywhere on the current path (i.e., is an ancestor). Uses O(m) memory per path. Used by DFS.
Ignore redundant paths (tree-like search): cheapest in memory but dangerous in graphs with cycles. Acceptable only when the state space is a tree (no cycles possible).
Graph Search vs. Tree-Like Search
| Property | Graph Search | Tree-Like Search |
|---|---|---|
| Tracks all reached states | Yes | No |
| Memory | O(b^d) for frontier + reached | O(b*m) for frontier only |
| Handles cycles | Yes | No (may loop infinitely) |
| May re-expand states | No | Yes |
| When to prefer | State space has many redundant paths | State space is a tree; memory is constrained |
IS-CYCLE Check
An efficient cycle check for depth-first style search: walk up the ancestor chain.
function IS-CYCLE(node) returns boolean
ancestor <- node.PARENT
while ancestor is not null do
if ancestor.STATE = node.STATE then return true
ancestor <- ancestor.PARENT
return false
Cost: O(depth) per check. For DFS, this is O(m) per node, which is acceptable given DFS’s O(m) depth limit.
Key Design Decisions in Any Search Algorithm
- What goes in f(n)? — This determines which node is expanded next.
- How is the frontier structured? — Priority queue, FIFO, LIFO.
- Do we track reached states? — Graph search vs. tree-like search.
- When do we check for goal? — At generation time (“early goal test”, used in BFS) or at expansion time (“late goal test”, used in UCS/A* for optimality guarantee).
The late goal test is critical for cost optimality: with early goal test, the first path to a goal state is returned immediately, but it may not be the cheapest path. Cost-optimal algorithms (UCS, A*) must check for goal only when the node is popped from the frontier (i.e., when it is guaranteed to be the cheapest path to that state).