Robotics
Chapter 26 — Robotics
Book: Artificial Intelligence: A Modern Approach (Russell & Norvig, 4th ed.), pp. 1004–1055
Robotics and AI
Robotics applies AI to physical agents operating in the real world. Challenges:
- Continuous state/action spaces: position, velocity, force
- Noisy sensors: cameras, LIDAR, IMUs
- Partial observability: unknown environment
- Real-time constraints: must act in milliseconds
- Physical consequences: mistakes are costly
Robot Hardware
Manipulators: robotic arms with degrees of freedom (DOF). Mobile robots: wheels, legs, aerial, underwater. Effectors: grippers, tools, force sensors.
Sensors:
- Proprioceptive: encoders (joint angles), IMU (acceleration/rotation)
- Exteroceptive: cameras, LIDAR, sonar, touch
Kinematics
Forward kinematics: given joint angles θ, compute the end-effector pose (position x,y,z and orientation).
For a serial chain:
T_end = T₁(θ₁) · T₂(θ₂) · ... · Tₙ(θₙ)
Using homogeneous transformation matrices.
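As a minimal sketch of the chain product above, here is forward kinematics for a planar 2-link arm using 3×3 homogeneous transforms (link lengths and joint angles are assumed example values, not from the text):

```python
import numpy as np

def link_transform(theta, length):
    """Homogeneous transform: rotate by theta, then translate along the link."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, length * c],
                     [s,  c, length * s],
                     [0,  0, 1.0]])

def forward_kinematics(thetas, lengths):
    """Chain the per-joint transforms: T_end = T1(θ1) · T2(θ2) · ... · Tn(θn)."""
    T = np.eye(3)
    for theta, length in zip(thetas, lengths):
        T = T @ link_transform(theta, length)
    return T  # T[0, 2], T[1, 2] hold the end-effector (x, y)

# Two unit links, both joints at 90°: first link points up,
# the second turns another 90° and points along -x.
T = forward_kinematics([np.pi / 2, np.pi / 2], [1.0, 1.0])
x, y = T[0, 2], T[1, 2]  # (-1, 1)
```

The same pattern extends to 3-D with 4×4 transforms; the rotation block then encodes the full orientation.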
Inverse kinematics (IK): given a desired end-effector pose, find joint angles θ.
- Closed-form: possible for simple chains (e.g., 6-DOF manipulators with a spherical wrist)
- Numerical: Jacobian pseudoinverse, Newton-Raphson
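A sketch of the Jacobian-pseudoinverse method, for an assumed planar 2-link arm with unit link lengths (the helper names `fk_position` and `jacobian` are illustrative, not from any library):

```python
import numpy as np

L1, L2 = 1.0, 1.0  # assumed link lengths

def fk_position(q):
    """End-effector (x, y) for joint angles q = [q1, q2]."""
    return np.array([L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1]),
                     L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])])

def jacobian(q):
    """Analytic 2x2 Jacobian d(x,y)/dq."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-L1 * s1 - L2 * s12, -L2 * s12],
                     [ L1 * c1 + L2 * c12,  L2 * c12]])

def ik(target, q0, iters=200, step=0.5):
    """Iterate q += step · J⁺ · (target - fk(q)) until the error is small."""
    q = np.array(q0, dtype=float)
    for _ in range(iters):
        err = target - fk_position(q)
        if np.linalg.norm(err) < 1e-6:
            break
        q += step * np.linalg.pinv(jacobian(q)) @ err
    return q

q = ik(target=np.array([1.0, 1.0]), q0=[0.2, 0.2])
```

The damping factor `step < 1` trades convergence speed for stability near singular configurations (here, a fully stretched arm where sin(q2) = 0).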
Configuration Space (C-space)
The configuration space C represents the robot's complete pose as a single point, with one dimension per degree of freedom.
C-space obstacle: set of configurations that cause collision.
Planning in C-space reduces motion planning for a robot with complex geometry to planning a path for a single point.
Motion Planning
Sampling-Based Methods
PRM (Probabilistic Roadmap):
1. Sample random configurations
2. Connect nearby configurations (if the connecting edge is collision-free)
3. Query: connect start and goal to the roadmap, then search for the shortest path
RRT (Rapidly-exploring Random Tree):
repeat until goal reached:
    q_rand ← random configuration
    q_near ← nearest node in tree
    q_new ← extend q_near toward q_rand (step size δ)
    if edge (q_near, q_new) is collision-free: add q_new to tree
RRT-Connect: grow two trees from start and goal; connect when close.
RRT* (asymptotically optimal): rewire tree to minimize path cost.
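The RRT loop above can be sketched as runnable Python for a 2-D C-space with a single circular obstacle (the obstacle, step size, and goal-bias probability are assumed toy values):

```python
import numpy as np

rng = np.random.default_rng(0)
OBSTACLE, RADIUS = np.array([0.5, 0.5]), 0.2          # one circular C-space obstacle
START, GOAL, STEP = np.array([0.1, 0.1]), np.array([0.9, 0.9]), 0.05

def collision_free(a, b, n=10):
    """Check sample points along the segment a→b against the obstacle."""
    for t in np.linspace(0.0, 1.0, n):
        if np.linalg.norm(a + t * (b - a) - OBSTACLE) < RADIUS:
            return False
    return True

def rrt(max_iters=5000):
    nodes, parents = [START], {0: None}
    for _ in range(max_iters):
        q_rand = GOAL if rng.random() < 0.1 else rng.random(2)  # 10% goal bias
        near = min(range(len(nodes)), key=lambda i: np.linalg.norm(nodes[i] - q_rand))
        d = q_rand - nodes[near]
        q_new = nodes[near] + STEP * d / (np.linalg.norm(d) + 1e-12)
        if collision_free(nodes[near], q_new):
            parents[len(nodes)] = near
            nodes.append(q_new)
            if np.linalg.norm(q_new - GOAL) < STEP:   # close enough: backtrack
                path, i = [], len(nodes) - 1
                while i is not None:
                    path.append(nodes[i])
                    i = parents[i]
                return path[::-1]
    return None

path = rrt()
```

The goal bias is a common practical addition that pulls the tree toward the goal; a real planner would also check joint limits and use a k-d tree for the nearest-neighbor query.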
Optimization-Based
Trajectory optimization: minimize cost functional (energy, time, jerk) subject to collision constraints.
CHOMP (Covariant Hamiltonian Optimization for Motion Planning), STOMP.
Localization and Mapping
Localization: given a map, find the robot's position.
- Kalman filter / particle filter (Ch. 14) with sensor model P(e|s)
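A minimal particle-filter localization sketch on a 1-D track, assuming Gaussian motion and measurement noise and a single range landmark at x = 0 (all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1000                                # number of particles
true_x = 2.0                            # true (hidden) robot position
particles = rng.uniform(0.0, 10.0, N)   # uniform prior over the track

for step in range(20):
    # Motion update: robot moves +0.5 per step; particles get motion noise.
    true_x += 0.5
    particles += 0.5 + rng.normal(0.0, 0.05, N)
    # Measurement update: noisy range to the landmark, sensor model P(e|s).
    z = true_x + rng.normal(0.0, 0.1)
    weights = np.exp(-0.5 * ((z - particles) / 0.1) ** 2)
    weights /= weights.sum()
    # Resample particles in proportion to their weights.
    particles = particles[rng.choice(N, N, p=weights)]

estimate = particles.mean()
```

The posterior collapses quickly because every step's measurement rules out most of the prior; with a multimodal map (e.g., several identical doors), the particle cloud would stay multimodal until a disambiguating observation arrives.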
SLAM (Simultaneous Localization and Mapping): build the map while localizing.
- EKF-SLAM: O(n²) per step; landmark-based
- Particle filter SLAM (FastSLAM): O(n log n) per step
- Graph SLAM: optimize trajectory + map jointly (g2o, GTSAM)
- Visual and dense SLAM: feature-based camera tracking (ORB-SLAM3) or dense 3D volumetric maps (KinectFusion)
Control
PID controller: proportional-integral-derivative:
u(t) = Kp·e(t) + Ki·∫e dt + Kd·de/dt
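A discrete-time version of the PID law above, applied to an assumed first-order plant x' = u (the gains and setpoint are arbitrary example values):

```python
class PID:
    """Discrete PID: u = Kp·e + Ki·Σe·dt + Kd·(e - e_prev)/dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral, self.prev_error = 0.0, None

    def update(self, error):
        self.integral += error * self.dt                 # Ki term: accumulated error
        deriv = (0.0 if self.prev_error is None
                 else (error - self.prev_error) / self.dt)  # Kd term: error slope
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv

# Drive a first-order plant x' = u toward setpoint 1.0.
dt = 0.01
pid, x = PID(kp=2.0, ki=0.5, kd=0.1, dt=dt), 0.0
for _ in range(2000):
    x += pid.update(1.0 - x) * dt
```

The integral term removes steady-state error; the derivative term damps overshoot. Real controllers add anti-windup clamping on the integral.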
Model Predictive Control (MPC): at each step, solve finite-horizon optimization; execute first step only; replan.
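A random-shooting sketch of the MPC loop, for an assumed 1-D double integrator: sample candidate action sequences over a short horizon, execute only the first action of the cheapest sequence, then replan (horizon, sample count, and costs are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
dt, H, K = 0.1, 10, 200   # timestep, horizon length, sampled action sequences

def rollout_cost(state, actions):
    """Roll out double-integrator dynamics x'' = u; accumulate quadratic cost."""
    x, v = state
    cost = 0.0
    for u in actions:
        v += u * dt
        x += v * dt
        cost += x * x + 0.1 * v * v + 0.01 * u * u
    return cost

def mpc_step(state):
    """Solve the finite-horizon problem approximately by random shooting;
    return only the first action of the best sequence."""
    candidates = rng.uniform(-1.0, 1.0, (K, H))
    costs = [rollout_cost(state, seq) for seq in candidates]
    return candidates[int(np.argmin(costs))][0]

# Closed loop: replan from the true state at every step.
x, v = 1.0, 0.0
for _ in range(100):
    u = mpc_step((x, v))
    v += u * dt
    x += v * dt
```

Replanning every step is what makes MPC robust to model error: only the first action is ever trusted. Practical MPC replaces random shooting with a structured solver (QP for linear MPC, CEM or iLQR otherwise).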
LQR (Linear Quadratic Regulator): optimal controller for linear dynamics + quadratic cost. Solved by Riccati equation.
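A sketch of discrete-time LQR for an assumed double-integrator model, iterating the Riccati equation to a fixed point rather than solving it in closed form:

```python
import numpy as np

# Discrete double integrator: x_{k+1} = A x_k + B u_k (toy dynamics).
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.eye(2)           # state cost
R = np.array([[0.1]])   # control cost

def lqr_gain(A, B, Q, R, iters=500):
    """Iterate the discrete Riccati recursion P ← Q + Aᵀ P (A - B K)
    with K = (R + Bᵀ P B)⁻¹ Bᵀ P A until P converges."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

K = lqr_gain(A, B, Q, R)

# Closed-loop u = -K x drives the state to the origin.
x = np.array([[1.0], [0.0]])
for _ in range(200):
    x = A @ x - B @ (K @ x)
```

For continuous-time systems, `scipy.linalg.solve_continuous_are` solves the algebraic Riccati equation directly; the fixed-point iteration here makes the structure of the solution explicit.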
Deep Learning for Robotics
End-to-end learning: raw sensor input → actions (e.g., image → steering angle).
Imitation learning: learn from demonstrations (behavioral cloning, DAgger).
Robot RL: PPO/SAC for continuous control (Mujoco benchmarks); sim-to-real transfer.
Foundation models for robotics: RT-2 (Robotics Transformer), PaLM-E — generalize across tasks.
Connection to DynamICCL
Robotics is the closest domain to DynamICCL in terms of methodology:
- Both use continuous control under uncertainty
- Both use RL with continuous action spaces (SAC, PPO)
- Both deal with partial observability (belief state / POMDP)
- Sim-to-real transfer parallels DynamICCL’s challenge of transferring NCCL policies learned in simulation to real GPU clusters with different network characteristics
- MPC: plan NCCL parameter adjustments over a short horizon using a learned model, an exact analog of robotic MPC