Population-Based Methods: Beam Search and Evolutionary Algorithms

Instead of tracking one current state, track k states in parallel.

function LOCAL-BEAM-SEARCH(problem, k) returns a state
    states ← k randomly generated states
    loop do
        if any state in states is a goal then return that state
        successors ← all successors of all k states
        states ← top k states from successors (by VALUE)
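The pseudocode above can be sketched in Python. The toy problem (finding the all-ones bit string, with VALUE = number of 1 bits) and the helper names are illustrative choices, not part of the original algorithm:

```python
import random

def local_beam_search(value, successors, is_goal, start_states, k, max_iters=100):
    """Greedy local beam search: keep the k best successors each round."""
    states = list(start_states)
    for _ in range(max_iters):
        for s in states:
            if is_goal(s):
                return s
        # Pool all successors of all current states, then keep the top k.
        pool = [t for s in states for t in successors(s)]
        if not pool:
            break
        states = sorted(pool, key=value, reverse=True)[:k]
    return max(states, key=value)

# Toy problem (illustrative): find the all-ones bit string of length 8.
N = 8
value = lambda s: sum(s)                       # number of 1 bits
is_goal = lambda s: sum(s) == N
def successors(s):                             # flip each bit once
    return [s[:i] + (1 - s[i],) + s[i+1:] for i in range(N)]

random.seed(0)
starts = [tuple(random.randint(0, 1) for _ in range(N)) for _ in range(4)]
best = local_beam_search(value, successors, is_goal, starts, k=4)
print(best)  # -> (1, 1, 1, 1, 1, 1, 1, 1)
```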

Key Distinction from k Independent Restarts

In k independent random-restart searches, each search has no information about the others.

In local beam search, the k states share information:
- If one state generates many good successors, its successors may crowd out those of the other states
- Information about good regions propagates across the population

Problem: Lack of Diversity

All k states tend to cluster in the same region of the state space (they all descended from the best early states).

Fix: Stochastic beam search — select k successors probabilistically, weighted by value, rather than always taking the top k. Maintains diversity.
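A minimal sketch of the stochastic selection step, assuming strictly positive values (the function name and toy pool are mine). This version samples with replacement via `random.choices`; a without-replacement variant is also common:

```python
import random

def stochastic_beam_select(pool, value, k):
    """Pick k successors at random, weighted by value, instead of
    deterministically keeping the top k."""
    weights = [value(s) for s in pool]
    return random.choices(pool, weights=weights, k=k)

pool = [(0, 1), (1, 1), (0, 0), (1, 0)]
value = lambda s: 1 + sum(s)             # keeps every weight strictly positive
random.seed(1)
chosen = stochastic_beam_select(pool, value, k=2)
print(chosen)
```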


Evolutionary Algorithms

Evolutionary algorithms maintain a population of candidate states and apply biologically-inspired operators to produce the next generation.

Genetic Algorithm (GA) — Core Variant

function GENETIC-ALGORITHM(population, FITNESS) returns an individual
    repeat
        new_population ← {}
        for i = 1 to SIZE(population) do
            x ← RANDOM-SELECTION(population, FITNESS)
            y ← RANDOM-SELECTION(population, FITNESS)
            child ← REPRODUCE(x, y)
            if small random probability then child ← MUTATE(child)
            add child to new_population
        population ← new_population
    until some individual is fit enough or time elapsed
    return best individual in population

REPRODUCE (Crossover)

function REPRODUCE(x, y) returns an individual
    n ← LENGTH(x)
    c ← random number from 1 to n    -- crossover point
    return APPEND(x[1..c], y[c+1..n])

Single-point crossover: take prefix from parent x, suffix from parent y.
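Putting GENETIC-ALGORITHM and REPRODUCE together, here is a sketch on the classic one-max toy problem (maximize the number of 1 bits). The problem, population size, and per-gene mutation rate are illustrative choices; mutation flips each gene independently with small probability, a common variant of the pseudocode's single MUTATE step:

```python
import random

N = 12                                    # bit-string length (toy problem)

def fitness(ind):
    """One-max (illustrative): count of 1 bits."""
    return sum(ind)

def reproduce(x, y):
    """Single-point crossover: prefix of x, suffix of y."""
    c = random.randint(1, len(x) - 1)     # 1..n-1 so both parents contribute
    return x[:c] + y[c:]

def mutate(child, p=0.05):
    """Flip each gene independently with small probability p."""
    return tuple(1 - g if random.random() < p else g for g in child)

def genetic_algorithm(population, n_generations=200):
    for _ in range(n_generations):
        # Fitness-proportionate selection; +1 keeps every weight positive.
        weights = [fitness(ind) + 1 for ind in population]
        population = [
            mutate(reproduce(*random.choices(population, weights=weights, k=2)))
            for _ in range(len(population))
        ]
        best = max(population, key=fitness)
        if fitness(best) == N:            # "fit enough" for this toy problem
            return best
    return max(population, key=fitness)

random.seed(0)
pop = [tuple(random.randint(0, 1) for _ in range(N)) for _ in range(20)]
best = genetic_algorithm(pop)
print(best, fitness(best))
```

Drawing the crossover point from 1 to n−1 (rather than 1 to n, as in the pseudocode) avoids the degenerate case where the child is an exact copy of one parent.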

Operators

Operator    Description                            Purpose
--------    -----------                            -------
Selection   Pick parents proportional to fitness   Exploitation — favor good states
Crossover   Combine two parents at a cut point     Recombination of good partial solutions
Mutation    Randomly flip a bit/gene               Exploration — maintain diversity

Fitness-Proportionate Selection

P(select individual i) = fitness(i) / Σ fitness(j)
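The formula maps directly onto `random.choices` with fitness values as weights (the function name and two-item population are mine, for illustration):

```python
import random

def fitness_proportionate_select(population, fitness):
    """P(i) = fitness(i) / sum over j of fitness(j), via weighted sampling."""
    weights = [fitness(ind) for ind in population]
    return random.choices(population, weights=weights, k=1)[0]

population = ["a", "b"]
fitness = {"a": 3.0, "b": 1.0}.get        # "a" should win ~75% of the time
random.seed(0)
counts = sum(fitness_proportionate_select(population, fitness) == "a"
             for _ in range(10_000))
print(counts / 10_000)                    # close to 0.75
```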


Comparison

Method              Population   Information sharing    Diversity mechanism
------              ----------   -------------------    -------------------
Hill climbing       1            None                   Restart
Local beam search   k            Yes (best k survive)   Stochastic beam
Genetic algorithm   k            Yes (crossover)        Mutation + diverse parents

Why GAs Work (Schema Theory)

A schema is a template over the bit string (e.g., 1**0*1, where * matches either bit). A population of n individuals implicitly processes O(n³) schemas per generation while explicitly evaluating only n individuals — the implicit parallelism of genetic algorithms.

Short, high-fitness schemas (building blocks) are preserved and combined by crossover. This is the intuition for why crossover can find good solutions faster than mutation alone.
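Schema matching itself is easy to make concrete; this small check (helper name and sample population are mine) tests the example schema from above against a population:

```python
def matches(schema, individual):
    """True if the individual agrees with the schema at every fixed
    position; '*' positions are wildcards."""
    return all(s == '*' or s == b for s, b in zip(schema, individual))

pop = ["110011", "100001", "010011", "111111"]
schema = "1**0*1"
print([ind for ind in pop if matches(schema, ind)])  # -> ['110011', '100001']
```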


Limitations

- Performance depends heavily on the encoding: crossover only helps when the representation groups related genes into meaningful building blocks.
- Fitness-proportionate selection can cause premature convergence when one highly fit individual dominates the population.
- Each generation requires many fitness evaluations, which is costly when evaluating a candidate is expensive.


Connection to RL

Evolutionary algorithms correspond to policy search in RL: they search the policy space directly rather than learning a value function. They are useful when gradient-based methods struggle (non-differentiable policies, discrete action spaces). A well-known modern variant is Neuroevolution of Augmenting Topologies (NEAT), which evolves both the weights and the structure of neural network policies.
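A minimal sketch of gradient-free policy search, using a (1+λ) evolution strategy on a one-parameter "policy". The return function is a stand-in assumed purely for illustration; a real setup would run the policy in an environment and measure episode return:

```python
import random

def episode_return(w):
    """Stand-in for rolling out a one-parameter policy in an environment
    (assumed for illustration; true optimum at w = 3.0)."""
    return -(w - 3.0) ** 2

def policy_search(n_generations=100, sigma=0.5, lam=10):
    """(1 + lambda) evolution strategy: perturb the current best policy
    parameter with Gaussian noise, keep the best of parent and offspring.
    No gradients are used anywhere."""
    w = 0.0
    for _ in range(n_generations):
        candidates = [w] + [w + random.gauss(0, sigma) for _ in range(lam)]
        w = max(candidates, key=episode_return)
    return w

random.seed(0)
w_best = policy_search()
print(w_best)  # converges near w = 3.0
```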