11. Risks and Benefits of AI

Source: AIMA 4th Ed, §1.5


The Core Framing

Francis Bacon (1609): “Mechanical arts are of ambiguous use, serving as well for hurt as for remedy.”

AI is no different. Its benefits can be enormous — but so can its harms. Understanding both is essential for anyone building AI systems.


Benefits

At the largest scale: our civilization is the product of human intelligence. Substantially greater machine intelligence raises the ceiling on everything we can accomplish.

Potential benefits:
- Free humanity from menial, repetitive work → increase production of goods and services
- Dramatically accelerate scientific research → cures for disease, solutions to climate change
- Expand access to expertise (medical, legal, educational) globally
- Enable new forms of art, discovery, and creativity

Demis Hassabis (Google DeepMind): “First solve AI, then use AI to solve everything else.”


Near-Term Risks (Already Apparent)

1. Lethal Autonomous Weapons (LAWs)
Weapons that can locate, select, and engage human targets without human supervision. The core concern is scalability: with no human in the loop, a small group could deploy such weapons against very large numbers of people.

2. Surveillance and Persuasion
AI makes mass monitoring of individuals cheap, and makes personalized persuasion and disinformation scalable, threatening both privacy and democratic processes.

3. Biased Decision Making
Systems trained on biased historical data can reproduce or amplify discrimination in lending, hiring, sentencing, and parole decisions (a minimal auditing sketch follows this list).

4. Impact on Employment
Automation eliminates some jobs while creating others; the transition can be painful, and the economic gains may be unevenly distributed.

5. Safety-Critical Applications
As AI moves into domains such as driving and medical diagnosis, errors can cause direct physical harm, which demands far more rigorous engineering and verification standards.

6. Cybersecurity
AI strengthens both attack (automated phishing, malware, large-scale disinformation) and defense, fueling an ongoing arms race.
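
As one concrete illustration of the auditing mentioned under biased decision making, the sketch below compares a model’s positive-decision rates across demographic groups. It is a minimal example on made-up data; the group labels, decisions, and the idea of flagging a large gap are purely hypothetical.

```python
# Minimal bias-audit sketch: compare approval rates across groups (hypothetical data).

decisions = [
    # (group, approved)
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", False), ("B", False), ("B", True), ("B", False),
]

def approval_rate(group):
    """Fraction of decisions in this group that were positive."""
    group_decisions = [approved for g, approved in decisions if g == group]
    return sum(group_decisions) / len(group_decisions)

rate_a, rate_b = approval_rate("A"), approval_rate("B")
print(f"Group A approval rate: {rate_a:.2f}")              # 0.75
print(f"Group B approval rate: {rate_b:.2f}")              # 0.25
print(f"Demographic-parity gap: {abs(rate_a - rate_b):.2f}")  # 0.50 -> flag for review
```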


Long-Term Risks: AGI and Superintelligence

AGI (Artificial General Intelligence)

AI with human-level competence across the full range of tasks: able to learn and adapt in new environments rather than excelling only at one narrow task.

ASI (Artificial Superintelligence)

Intelligence that substantially exceeds human performance across virtually all domains.

The Gorilla Problem

Humans and gorillas diverged ~7 million years ago. Today, gorillas have essentially no control over their future — it’s entirely determined by what humans choose to do.

If ASI is created, humans may be in the same position: our future determined by the AI’s goals, not ours.

Turing (1951): “It seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers… we should have to expect the machines to take control.”

The King Midas Problem

An AI given a fixed objective will pursue it — exactly as specified — without regard for unintended consequences.

Midas asked that everything he touch turn to gold. He got exactly what he asked for. His food, drink, and family became gold.

This is the value alignment problem in practice: objectives that are even slightly misspecified can lead to catastrophic outcomes when pursued literally.
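
The same failure mode fits in a few lines of code. The sketch below is purely illustrative (the action names and payoff numbers are invented): an optimizer maximizes the stated objective literally, even though the result is disastrous under the designer’s true preference.

```python
# Toy illustration of a misspecified objective (hypothetical actions and numbers).
# The "stated" objective captures only part of what the designer actually cares about.

actions = {
    # action: (stated_reward, unstated_side_effect_cost)
    "careful_plan":  (8.0, 0.0),
    "reckless_plan": (10.0, 100.0),  # scores highest on the stated objective
}

def stated_objective(action):
    """What the designer wrote down: reward only."""
    return actions[action][0]

def true_preference(action):
    """What the designer actually wants: reward minus the unstated side-effect cost."""
    reward, cost = actions[action]
    return reward - cost

chosen = max(actions, key=stated_objective)
print("Optimizer picks:", chosen)                              # reckless_plan
print("Stated objective value:", stated_objective(chosen))     # 10.0
print("True value to the designer:", true_preference(chosen))  # -90.0
```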

The Solution: Machines Uncertain About Objectives

Rather than giving machines fixed objectives, design machines that:
- Are uncertain about what humans want
- Observe human behavior to learn preferences
- Are deferential: willing to be switched off precisely because they’re uncertain
- Cannot be fully aligned until they’ve adequately learned human values

This framework connects to:
- Assistance games (AIMA Ch. 18)
- Inverse reinforcement learning (AIMA Ch. 22)
- AI safety research more broadly
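
A toy calculation in the spirit of the assistance-game / off-switch analysis shows why uncertainty about the objective makes deference rational: a robot unsure whether its action helps or harms prefers to let the human decide. All utilities and probabilities below are invented, and the human is idealized as blocking only genuinely harmful actions.

```python
# Off-switch-style sketch (hypothetical numbers): a robot uncertain about the human's
# true utility for an action compares acting immediately with deferring to the human.

# Robot's belief: possible utilities of the action to the human, with probabilities.
possible_utilities = [(+10.0, 0.6), (-50.0, 0.4)]

def expected_value_act_now():
    """Act without asking: the robot gets whatever utility the action turns out to have."""
    return sum(u * p for u, p in possible_utilities)

def expected_value_defer():
    """Defer to the human, who permits the action only when it is actually beneficial
    (and switches the robot off otherwise, yielding utility 0)."""
    return sum(max(u, 0.0) * p for u, p in possible_utilities)

print("Act now:", expected_value_act_now())  # 10*0.6 - 50*0.4 = -14.0
print("Defer:  ", expected_value_defer())    # 10*0.6 +  0*0.4 =  +6.0
# Because the robot is uncertain about what the human wants, it strictly prefers to
# defer: it has a positive incentive to leave the off switch in human hands.
```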


Summary Table

| Risk | Nature | Mitigation |
|------|--------|------------|
| Lethal autonomous weapons | Existential, scalability | International regulation, treaties |
| Surveillance | Democracy, privacy | Legal frameworks, privacy-by-design |
| Algorithmic bias | Fairness, discrimination | Auditing, diverse datasets, fairness constraints |
| Employment disruption | Economic inequality | Policy, retraining programs |
| Safety-critical failures | Physical harm | Formal verification, conservative deployment |
| Cybersecurity | Arms race | Defensive AI, detection systems |
| AGI/ASI misalignment | Civilizational | Value alignment research, beneficial AI design |