Robotics
Chapter 26 — Robotics
Book: Artificial Intelligence: A Modern Approach (Russell & Norvig, 4th ed.), pp. 1004–1055
Robotics and AI
Robotics applies AI to physical agents operating in the real world. Challenges:
- Continuous state/action spaces: position, velocity, force
- Noisy sensors: cameras, LIDAR, IMUs
- Partial observability: unknown environment
- Real-time constraints: must act in milliseconds
- Physical consequences: mistakes are costly
Robot Hardware
Manipulators: robotic arms with degrees of freedom (DOF). Mobile robots: wheels, legs, aerial, underwater. Effectors: grippers, tools, force sensors.
Sensors:
- Proprioceptive: encoders (joint angles), IMU (acceleration/rotation)
- Exteroceptive: cameras, LIDAR, sonar, touch
Kinematics
Forward kinematics: given joint angles θ, compute the end-effector pose (position x,y,z and orientation).
For a serial chain:
T_end = T₁(θ₁) · T₂(θ₂) · ... · Tₙ(θₙ)
Using homogeneous transformation matrices.
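As a minimal sketch of the chain product above, here is forward kinematics for a planar 2-link arm using 3×3 homogeneous transforms (link lengths and joint angles are assumed example values, not from the text):

```python
import numpy as np

def link_transform(theta, length):
    """Homogeneous transform: rotate by theta, then translate along the link."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, length * c],
                     [s,  c, length * s],
                     [0,  0, 1.0]])

def forward_kinematics(thetas, lengths):
    """Chain the per-joint transforms: T_end = T1(θ1) · T2(θ2) · ... · Tn(θn)."""
    T = np.eye(3)
    for theta, length in zip(thetas, lengths):
        T = T @ link_transform(theta, length)
    return T  # T[0, 2], T[1, 2] hold the end-effector (x, y)

# Two unit links, both joints at 90°: first link points up,
# the second turns another 90° and points along -x.
T = forward_kinematics([np.pi / 2, np.pi / 2], [1.0, 1.0])
x, y = T[0, 2], T[1, 2]  # (-1, 1)
```

The same pattern extends to 3-D with 4×4 transforms; the rotation block then encodes the full orientation.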
Inverse kinematics (IK): given a desired end-effector pose, find joint angles θ.
- Closed-form: possible for simple chains (e.g., 6-DOF manipulators with a spherical wrist)
- Numerical: Jacobian pseudoinverse, Newton-Raphson
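A sketch of the Jacobian-pseudoinverse method, for an assumed planar 2-link arm with unit link lengths (the helper names `fk_position` and `jacobian` are illustrative, not from any library):

```python
import numpy as np

L1, L2 = 1.0, 1.0  # assumed link lengths

def fk_position(q):
    """End-effector (x, y) for joint angles q = [q1, q2]."""
    return np.array([L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1]),
                     L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])])

def jacobian(q):
    """Analytic 2x2 Jacobian d(x,y)/dq."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-L1 * s1 - L2 * s12, -L2 * s12],
                     [ L1 * c1 + L2 * c12,  L2 * c12]])

def ik(target, q0, iters=200, step=0.5):
    """Iterate q += step · J⁺ · (target - fk(q)) until the error is small."""
    q = np.array(q0, dtype=float)
    for _ in range(iters):
        err = target - fk_position(q)
        if np.linalg.norm(err) < 1e-6:
            break
        q += step * np.linalg.pinv(jacobian(q)) @ err
    return q

q = ik(target=np.array([1.0, 1.0]), q0=[0.2, 0.2])
```

The damping factor `step < 1` trades convergence speed for stability near singular configurations (here, a fully stretched arm where sin(q2) = 0).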
Configuration Space (C-space)
The configuration space C represents the robot's complete pose as a single point, with one dimension per degree of freedom.
C-space obstacle: set of configurations that cause collision.
Planning in C-space reduces motion planning for a robot with complex geometry to planning a path for a single point.
Motion Planning
Sampling-Based Methods
PRM (Probabilistic Roadmap):
1. Sample random configurations
2. Connect nearby configurations (if the connecting edge is collision-free)
3. Query: connect start and goal to the roadmap, then search for the shortest path
RRT (Rapidly-exploring Random Tree):
repeat until goal reached:
    q_rand ← random configuration
    q_near ← nearest node in tree
    q_new ← extend q_near toward q_rand (step size δ)
    if edge (q_near, q_new) is collision-free: add q_new to tree
RRT-Connect: grow two trees from start and goal; connect when close.
RRT* (asymptotically optimal): rewire tree to minimize path cost.
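The RRT loop above can be sketched as runnable Python for a 2-D C-space with a single circular obstacle (the obstacle, step size, and goal-bias probability are assumed toy values):

```python
import numpy as np

rng = np.random.default_rng(0)
OBSTACLE, RADIUS = np.array([0.5, 0.5]), 0.2          # one circular C-space obstacle
START, GOAL, STEP = np.array([0.1, 0.1]), np.array([0.9, 0.9]), 0.05

def collision_free(a, b, n=10):
    """Check sample points along the segment a→b against the obstacle."""
    for t in np.linspace(0.0, 1.0, n):
        if np.linalg.norm(a + t * (b - a) - OBSTACLE) < RADIUS:
            return False
    return True

def rrt(max_iters=5000):
    nodes, parents = [START], {0: None}
    for _ in range(max_iters):
        q_rand = GOAL if rng.random() < 0.1 else rng.random(2)  # 10% goal bias
        near = min(range(len(nodes)), key=lambda i: np.linalg.norm(nodes[i] - q_rand))
        d = q_rand - nodes[near]
        q_new = nodes[near] + STEP * d / (np.linalg.norm(d) + 1e-12)
        if collision_free(nodes[near], q_new):
            parents[len(nodes)] = near
            nodes.append(q_new)
            if np.linalg.norm(q_new - GOAL) < STEP:   # close enough: backtrack
                path, i = [], len(nodes) - 1
                while i is not None:
                    path.append(nodes[i])
                    i = parents[i]
                return path[::-1]
    return None

path = rrt()
```

The goal bias is a common practical addition that pulls the tree toward the goal; a real planner would also check joint limits and use a k-d tree for the nearest-neighbor query.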
Optimization-Based
Trajectory optimization: minimize cost functional (energy, time, jerk) subject to collision constraints.
CHOMP (Covariant Hamiltonian Optimization for Motion Planning), STOMP.
Localization and Mapping
Localization: given a map, find the robot's position.
- Kalman filter / particle filter (Ch. 14) with sensor model P(e|s)
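A minimal particle-filter localization sketch on a 1-D track, assuming Gaussian motion and measurement noise and a single range landmark at x = 0 (all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1000                                # number of particles
true_x = 2.0                            # true (hidden) robot position
particles = rng.uniform(0.0, 10.0, N)   # uniform prior over the track

for step in range(20):
    # Motion update: robot moves +0.5 per step; particles get motion noise.
    true_x += 0.5
    particles += 0.5 + rng.normal(0.0, 0.05, N)
    # Measurement update: noisy range to the landmark, sensor model P(e|s).
    z = true_x + rng.normal(0.0, 0.1)
    weights = np.exp(-0.5 * ((z - particles) / 0.1) ** 2)
    weights /= weights.sum()
    # Resample particles in proportion to their weights.
    particles = particles[rng.choice(N, N, p=weights)]

estimate = particles.mean()
```

The posterior collapses quickly because every step's measurement rules out most of the prior; with a multimodal map (e.g., several identical doors), the particle cloud would stay multimodal until a disambiguating observation arrives.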
SLAM (Simultaneous Localization and Mapping): build the map while localizing.
- EKF-SLAM: O(n²) per step; landmark-based
- Particle filter SLAM (FastSLAM): O(n log n) per step
- Graph SLAM: optimize trajectory + map jointly (g2o, GTSAM)
- Visual and dense SLAM: feature-based camera tracking (ORB-SLAM3) or dense 3D volumetric maps (KinectFusion)
Control
PID controller: proportional-integral-derivative:
u(t) = Kp·e(t) + Ki·∫e dt + Kd·de/dt
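A discrete-time version of the PID law above, applied to an assumed first-order plant x' = u (the gains and setpoint are arbitrary example values):

```python
class PID:
    """Discrete PID: u = Kp·e + Ki·Σe·dt + Kd·(e - e_prev)/dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral, self.prev_error = 0.0, None

    def update(self, error):
        self.integral += error * self.dt                 # Ki term: accumulated error
        deriv = (0.0 if self.prev_error is None
                 else (error - self.prev_error) / self.dt)  # Kd term: error slope
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv

# Drive a first-order plant x' = u toward setpoint 1.0.
dt = 0.01
pid, x = PID(kp=2.0, ki=0.5, kd=0.1, dt=dt), 0.0
for _ in range(2000):
    x += pid.update(1.0 - x) * dt
```

The integral term removes steady-state error; the derivative term damps overshoot. Real controllers add anti-windup clamping on the integral.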
Model Predictive Control (MPC): at each step, solve finite-horizon optimization; execute first step only; replan.
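A random-shooting sketch of the MPC loop, for an assumed 1-D double integrator: sample candidate action sequences over a short horizon, execute only the first action of the cheapest sequence, then replan (horizon, sample count, and costs are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
dt, H, K = 0.1, 10, 200   # timestep, horizon length, sampled action sequences

def rollout_cost(state, actions):
    """Roll out double-integrator dynamics x'' = u; accumulate quadratic cost."""
    x, v = state
    cost = 0.0
    for u in actions:
        v += u * dt
        x += v * dt
        cost += x * x + 0.1 * v * v + 0.01 * u * u
    return cost

def mpc_step(state):
    """Solve the finite-horizon problem approximately by random shooting;
    return only the first action of the best sequence."""
    candidates = rng.uniform(-1.0, 1.0, (K, H))
    costs = [rollout_cost(state, seq) for seq in candidates]
    return candidates[int(np.argmin(costs))][0]

# Closed loop: replan from the true state at every step.
x, v = 1.0, 0.0
for _ in range(100):
    u = mpc_step((x, v))
    v += u * dt
    x += v * dt
```

Replanning every step is what makes MPC robust to model error: only the first action is ever trusted. Practical MPC replaces random shooting with a structured solver (QP for linear MPC, CEM or iLQR otherwise).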
LQR (Linear Quadratic Regulator): optimal controller for linear dynamics + quadratic cost. Solved by Riccati equation.
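A sketch of discrete-time LQR for an assumed double-integrator model, iterating the Riccati equation to a fixed point rather than solving it in closed form:

```python
import numpy as np

# Discrete double integrator: x_{k+1} = A x_k + B u_k (toy dynamics).
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.eye(2)           # state cost
R = np.array([[0.1]])   # control cost

def lqr_gain(A, B, Q, R, iters=500):
    """Iterate the discrete Riccati recursion P ← Q + Aᵀ P (A - B K)
    with K = (R + Bᵀ P B)⁻¹ Bᵀ P A until P converges."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

K = lqr_gain(A, B, Q, R)

# Closed-loop u = -K x drives the state to the origin.
x = np.array([[1.0], [0.0]])
for _ in range(200):
    x = A @ x - B @ (K @ x)
```

For continuous-time systems, `scipy.linalg.solve_continuous_are` solves the algebraic Riccati equation directly; the fixed-point iteration here makes the structure of the solution explicit.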
Deep Learning for Robotics
End-to-end learning: raw sensor input → actions (e.g., image → steering angle).
Imitation learning: learn from demonstrations (behavioral cloning, DAgger).
Robot RL: PPO/SAC for continuous control (Mujoco benchmarks); sim-to-real transfer.
Foundation models for robotics: RT-2 (Robotics Transformer), PaLM-E — generalize across tasks.
Connection to DynamICCL
Robotics is the closest domain to DynamICCL in terms of methodology:
- Both use continuous control under uncertainty
- Both use RL with continuous action spaces (SAC, PPO)
- Both deal with partial observability (belief state / POMDP)
- Sim-to-real transfer parallels DynamICCL’s challenge of transferring NCCL policies learned in simulation to real GPU clusters with different network characteristics
- MPC: plan NCCL parameter adjustments over a short horizon using a learned model, an exact analog of robotic MPC