4. Properties of Task Environments

Source: AIMA 4th Ed, Chapter 2 (Section 2.3.2), physical PDF pp. 111–118


Introduction

Different task environments impose fundamentally different challenges on agent design. Section 2.3.2 introduces a taxonomy of environment properties — six binary (or near-binary) dimensions along which any environment can be characterized. These dimensions determine which agent architectures and algorithms are appropriate.

This taxonomy is one of the most practically important frameworks in introductory AI — it is the bridge between problem description (PEAS) and algorithm selection.


The Six Dimensions

1. Fully Observable vs. Partially Observable

Fully observable: The agent’s sensors give it access to the complete state of the environment at each point in time — or at least all aspects of the state that are relevant to the choice of action (relevance depends on the performance measure).

Partially observable: The agent cannot see the complete relevant state, either because:

  - Sensors are noisy or inaccurate, or
  - Parts of the state are simply missing from the sensor data

Example: a vacuum agent with only a local dirt sensor cannot tell whether other squares are dirty.

Unobservable: The agent has no sensors at all. Even then, goals may still be achievable — see Chapter 4.

Design implication: Fully observable environments allow the agent to make decisions based solely on the current percept, with no need to maintain internal memory about the world. Partially observable environments require the agent to maintain an internal belief state about unobserved parts of the world.
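The belief-state idea can be sketched concretely. This is my own minimal illustration (not from AIMA): a two-square vacuum world where the agent senses only dirt in its current square, so it tracks the set of world states consistent with its percepts.

```python
# Sketch (illustrative, not from the text): a belief state for a
# two-square vacuum world where the agent senses only local dirt.
from itertools import product

SQUARES = ("A", "B")

def initial_belief():
    """Total ignorance: every assignment of dirty/clean to each square
    is a possible world. A world is the frozenset of its dirty squares."""
    return {frozenset(s for s, dirty in zip(SQUARES, dirt) if dirty)
            for dirt in product([True, False], repeat=2)}

def update_belief(belief, location, local_dirt_percept):
    """Discard worlds inconsistent with the percept from the current square."""
    return {world for world in belief
            if (location in world) == local_dirt_percept}

belief = initial_belief()                  # 4 possible worlds
belief = update_belief(belief, "A", True)  # "A is dirty" leaves 2 worlds
```

The agent still does not know whether B is dirty; that residual uncertainty is exactly what the belief state represents.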

RL/DynamICCL connection: NCCL parameter tuning is partially observable — the agent cannot directly observe all the factors affecting collective communication performance (network contention, PCIe bottlenecks, OS scheduling jitter).


2. Single-Agent vs. Multiagent

Single-agent: The agent operates alone. Example: a crossword puzzle solver — the grid is not “trying” to obstruct the agent.

Multiagent: Multiple agents whose actions affect each other’s performance measures.

Key criterion for multiagent: Entity B should be modeled as an agent (not just a physical object) if B’s behavior is best described as maximizing a performance measure that depends on A’s behavior.

Competitive multiagent: One agent’s gain is another’s loss. Example: chess — the opponent is trying to minimize your performance measure. In competitive environments, randomized behavior can be rational because it avoids being predictable.

Cooperative multiagent: The agents’ performance measures are aligned, so actions that help one agent tend to help the others. Example: multi-lane traffic, where avoiding collisions maximizes every agent’s performance measure.

Mixed: Many real environments are partially cooperative and partially competitive. Example: taxi driving, where avoiding accidents is cooperative but competing for a parking space is competitive.

Design implication: Multiagent problems require reasoning about the behavior of other agents, which introduces game theory, communication protocols, and emergent behavior not present in single-agent design.
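The claim that randomized behavior can be rational in competitive settings can be made concrete with a toy game. This sketch (my own example) uses matching pennies: any deterministic policy is exploitable by a best-responding opponent, while the uniform mixed strategy guarantees expected payoff 0.

```python
# Sketch: why randomization can be rational in a competitive environment.
# Matching pennies: the row player wins +1 if the choices match, -1 otherwise.
def payoff(row_choice, col_choice):
    return 1 if row_choice == col_choice else -1

def expected_payoff(p_heads, opponent_choice):
    """Row player's expected payoff when playing heads with prob p_heads."""
    return (p_heads * payoff("H", opponent_choice)
            + (1 - p_heads) * payoff("T", opponent_choice))

# A deterministic policy (always heads) loses to a best-responding opponent,
# but the 50/50 mix yields expected payoff 0 against either opponent action:
exploited  = expected_payoff(1.0, "T")   # opponent best-responds with tails
mixed_vs_h = expected_payoff(0.5, "H")
mixed_vs_t = expected_payoff(0.5, "T")
```

Unpredictability itself is the asset: no opponent strategy can push the mixed player below the game’s value.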


3. Deterministic vs. Nondeterministic (and Stochastic)

Deterministic: The next state of the environment is completely determined by the current state and the action executed by the agent(s). No uncertainty about outcomes.

Nondeterministic: The next state is not fully determined — multiple outcomes are possible. The possibilities are listed without probabilities.

Stochastic: A refinement of nondeterministic — outcome probabilities are explicitly quantified (e.g., “there’s a 25% chance of rain tomorrow” vs. “there might be rain tomorrow”).

Distinction:

  - Nondeterministic: “the action might fail” (no probability assigned)
  - Stochastic: “the action fails with probability 0.1” (probability assigned)

Design implication: Deterministic environments require no uncertainty handling. Nondeterministic environments require reasoning about contingencies. Stochastic environments require probabilistic reasoning and expected-value optimization.
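Expected-value optimization, which stochastic environments enable, can be shown in a few lines. The utilities below are hypothetical, chosen only to illustrate the computation; note that this comparison is impossible in a merely nondeterministic model, where no probabilities are given.

```python
# Sketch: action selection by expected value in a stochastic environment.
# Utilities and probabilities are hypothetical.
def expected_value(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

risky = [(0.9, 10), (0.1, -50)]   # succeeds with prob 0.9, fails with prob 0.1
safe  = [(1.0, 5)]                # guaranteed modest payoff

best = max(("risky", risky), ("safe", safe),
           key=lambda action: expected_value(action[1]))
# expected_value(risky) = 4.0, expected_value(safe) = 5.0
```

Even though the risky action usually pays more, its expected value is lower once the failure case is weighted in.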

Real-world note: Even if an environment is technically deterministic, if it is partially observable it will appear nondeterministic to the agent (because unobserved state differences lead to apparently unpredictable outcomes). Taxi driving is effectively stochastic.


4. Episodic vs. Sequential

Episodic: The agent’s experience is divided into atomic episodes. In each episode, the agent receives a percept and performs a single action. Crucially, the current episode does not depend on actions taken in previous episodes, and the current decision does not affect future episodes.

Examples of episodic tasks:

  - Defective-part detection on an assembly line: each part is judged independently
  - Image classification: each image is classified independently

Sequential: The current decision could affect all future decisions. Short-term actions have long-term consequences.

Examples of sequential tasks:

  - Chess: each move affects future legal moves and the game outcome
  - Taxi driving: every steering decision affects subsequent safety and efficiency

Design implication: Episodic environments are simpler — the agent does not need to plan ahead or reason about future consequences. Sequential environments require planning, search, or value function estimation over time.

RL connection: Virtually all RL formulations are sequential. The MDP framework (Ch. 17) is the canonical model for sequential decision-making.
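The difference planning makes in a sequential task can be shown with the standard discounted return from the MDP framework. The reward sequences below are hypothetical.

```python
# Sketch: in a sequential task a decision is scored by the discounted sum
# of future rewards (the MDP return G = r_0 + g*r_1 + g^2*r_2 + ...),
# not by its immediate reward alone. Reward values are hypothetical.
def discounted_return(rewards, gamma=0.9):
    return sum(gamma**t * r for t, r in enumerate(rewards))

# A plan with no immediate payoff can dominate a greedy one once its
# long-term consequences are counted:
greedy_now   = discounted_return([10, 0, 0, 0])   # 10.0
patient_plan = discounted_return([0, 0, 0, 20])   # 20 * 0.9**3 ≈ 14.58
```

In an episodic task the comparison collapses to the first reward only, which is why episodic agents need no lookahead.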


5. Static vs. Dynamic

Static: The environment does not change while the agent is deliberating (deciding what action to take). The agent can take as long as it needs to think without the world moving on.

Examples: crossword puzzles, static planning problems.

Dynamic: The environment changes while the agent deliberates. The world keeps moving; if the agent has not acted by a deadline, that inaction itself counts as a decision (doing nothing).

Semidynamic: The environment itself does not change over time, but the agent’s performance score does (typically due to time penalties). Example: chess with a clock — the board doesn’t change while you think, but your time runs out.

Environment                       Static/Dynamic
Crossword puzzle                  Static
Chess with a clock                Semidynamic
Taxi driving                      Dynamic
Medical diagnosis (single visit)  Static (arguably)

Design implication: Dynamic environments require real-time or anytime algorithms — the agent cannot afford unbounded deliberation time. Static environments allow deep offline search.
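An anytime algorithm, of the kind dynamic environments demand, can be sketched as a refinement loop with a hard deadline. The `improve` function and the 10 ms budget below are illustrative placeholders.

```python
# Sketch: an anytime decision loop for a dynamic environment. The agent
# refines its candidate answer until a real-time deadline, then acts with
# the best result so far rather than deliberating indefinitely.
import time

def anytime_decide(improve, initial, budget_s=0.01):
    """Repeatedly improve a candidate decision until time runs out."""
    best = initial
    deadline = time.monotonic() + budget_s
    while time.monotonic() < deadline:
        best = improve(best)
    return best

# Toy refinement step: each call halves the distance to the optimum (100).
result = anytime_decide(lambda x: x + (100 - x) / 2, initial=0.0)
```

The key property is that interrupting the loop at any point still yields a usable (if suboptimal) decision, which is exactly what a moving world requires.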


6. Discrete vs. Continuous

This distinction applies to three separate aspects of the environment:

  1. State: Is the environment state drawn from a finite set, or does it vary continuously?
  2. Time: Does the problem proceed in discrete time steps, or continuously?
  3. Percepts and actions: Are the percepts and actions drawn from finite sets, or are they continuous-valued?

Environment     State                            Time                   Actions
Chess           Discrete (finite positions)      Discrete               Discrete (finite legal moves)
Taxi driving    Continuous (position, velocity)  Continuous             Continuous (steering angle, brake pressure)
Image analysis  Continuous                       Discrete (frame rate)  Discrete (output category)

Design implication: Discrete environments allow exact enumeration and classical search/planning methods. Continuous environments require approximation methods, function approximators, or discretization.
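Discretization, the last option mentioned above, is simple to illustrate. This sketch (my own, with arbitrary bounds and bin counts) maps a continuous taxi-like state onto a finite grid so that classical discrete methods can be applied.

```python
# Sketch: uniform-grid discretization of a continuous state variable.
# Bounds and bin counts are arbitrary illustration values.
def discretize(value, low, high, bins):
    """Map a continuous value in [low, high] to a bin index in 0..bins-1."""
    clipped = min(max(value, low), high)
    idx = int((clipped - low) / (high - low) * bins)
    return min(idx, bins - 1)   # the exact upper bound falls in the last bin

# Continuous (position, velocity) -> discrete (cell, speed-band) pair:
position_bin = discretize(3.7, low=0.0, high=10.0, bins=5)
velocity_bin = discretize(-1.2, low=-5.0, high=5.0, bins=4)
```

The trade-off is resolution versus state-space size: finer grids approximate the continuous dynamics better but blow up the number of discrete states.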


Bonus: Known vs. Unknown

This is listed separately because it is not strictly a property of the environment but of the agent’s knowledge about the environment:

Known: The agent (or designer) knows the “laws of physics” of the environment — the outcomes (or outcome probabilities) of all actions are given.

Unknown: The agent does not know how the environment works and must discover this through interaction.

Critical nuance: Known/unknown is orthogonal to fully/partially observable:

  - A known environment can still be partially observable (e.g., solitaire: you know the rules but can’t see the undealt cards)
  - An unknown environment can still be fully observable (e.g., a new video game: you can see the whole screen, but you don’t know what the buttons do)

Design implication: Unknown environments require the agent to learn or explore before it can act optimally. This is the essence of model-based RL.
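Learning the “laws of physics” from interaction can be sketched as maximum-likelihood transition counting, the simplest form of model learning in model-based RL. The states, actions, and outcome counts below are synthetic.

```python
# Sketch: estimating outcome probabilities P(s' | s, a) from experience
# by counting observed transitions. All data here is synthetic.
from collections import Counter, defaultdict

class TransitionModel:
    """Maximum-likelihood transition model learned from interaction."""
    def __init__(self):
        self.counts = defaultdict(Counter)

    def observe(self, state, action, next_state):
        self.counts[(state, action)][next_state] += 1

    def prob(self, state, action, next_state):
        outcomes = self.counts[(state, action)]
        total = sum(outcomes.values())
        return outcomes[next_state] / total if total else 0.0

model = TransitionModel()
for outcome in ["moved", "moved", "moved", "stuck"]:
    model.observe("s0", "forward", outcome)
# After 4 trials: P(moved | s0, forward) is estimated as 0.75
```

Once such a model is learned, the unknown environment can be treated as a known (stochastic) one and planned over.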


Environment Classification Table (Figure 2.6)

Task Environment     Observable  Agents  Deterministic  Episodic    Static   Discrete
Crossword puzzle     Fully       Single  Deterministic  Sequential  Static   Discrete
Chess with a clock   Fully       Multi   Deterministic  Sequential  Semi     Discrete
Poker                Partially   Multi   Stochastic     Sequential  Static   Discrete
Backgammon           Fully       Multi   Stochastic     Sequential  Static   Discrete
Taxi driving         Partially   Multi   Stochastic     Sequential  Dynamic  Continuous
Medical diagnosis    Partially   Single  Stochastic     Sequential  Dynamic  Continuous
Image analysis       Fully       Single  Deterministic  Episodic    Semi     Continuous
Part-picking robot   Partially   Single  Stochastic     Episodic    Dynamic  Continuous
Refinery controller  Partially   Single  Stochastic     Sequential  Dynamic  Continuous
English tutor        Partially   Multi   Stochastic     Sequential  Dynamic  Discrete

The hardest case: partially observable, multiagent, nondeterministic, sequential, dynamic, continuous, and unknown. Taxi driving is hard in all these senses, except that the driver’s environment is mostly known.

Important caveat: These classifications are not always cut and dried. Medical diagnosis could be episodic (diagnose given symptoms) or sequential (run tests over time, manage multiple patients). Context determines the appropriate classification.
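For programmatic use, a few rows of the Figure 2.6 classification can be encoded as a lookup table that drives design decisions. The data simply restates the table above; the `needs_belief_state` helper is my own illustrative name.

```python
# The Figure 2.6 classifications (a subset), encoded as a lookup table.
# Tuple order matches the table columns: observable, agents, deterministic,
# episodic, static, discrete.
ENVIRONMENTS = {
    "Crossword puzzle": ("Fully", "Single", "Deterministic", "Sequential", "Static", "Discrete"),
    "Poker":            ("Partially", "Multi", "Stochastic", "Sequential", "Static", "Discrete"),
    "Taxi driving":     ("Partially", "Multi", "Stochastic", "Sequential", "Dynamic", "Continuous"),
}

def needs_belief_state(env):
    """Partially observable environments demand internal state estimation."""
    return ENVIRONMENTS[env][0] == "Partially"
```

This is the bridge the introduction describes: read off the environment’s properties, then select the matching agent machinery.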


Design Implications Summary

Property       Simpler end       Harder end            What it demands
Observability  Fully observable  Partially observable  Internal belief state; state estimation
Agents         Single            Multi                 Game theory; communication; adversarial reasoning
Determinism    Deterministic     Stochastic            Probabilistic reasoning; expected-value optimization
Temporality    Episodic          Sequential            Planning; lookahead; value functions
Dynamics       Static            Dynamic               Real-time algorithms; anytime computation
State space    Discrete          Continuous            Function approximation; continuous optimization
Knowledge      Known             Unknown               Learning; exploration; model building

Cross-References