# AlphaZero
AlphaZero (2017) is the [[Google DeepMind]] system that generalized the [[AlphaGo]] Zero approach to play Go, chess, and shogi at superhuman level using a single algorithm, learning each game entirely from self-play with no prior human knowledge beyond the rules.
## What was new
- **Game-agnostic architecture**: the same neural network and search algorithm, unchanged, learned three different games
- **No human data**: trained purely via self-play reinforcement learning starting from random play
- **Fast convergence**: in chess, AlphaZero reached superhuman strength in hours; defeated Stockfish (the strongest open-source chess engine at the time) in a 100-game match
- **Qualitatively different play**: in chess especially, commentators described AlphaZero's style as more "positional" and sacrifice-prone than the calculation-heavy style of traditional engines
## Why it matters
- Showed that a single self-play + deep-RL recipe generalizes across distinct games, not just Go
- Established the "self-play bootstrap to superhuman" pattern later extended by [[MuZero]] to domains where the rules are unknown
- Strengthened the case for RL + search as a route to strong narrow-AI systems, complementary to the later LLM wave
## References
- Science paper (2018): https://www.science.org/doi/10.1126/science.aar6404
- Preprint (2017): https://arxiv.org/abs/1712.01815
- DeepMind blog: https://deepmind.google/discover/blog/alphazero-shedding-new-light-on-chess-shogi-and-go/
## Related
- [[AlphaGo]]
- [[MuZero]]
- [[Google DeepMind]]
- [[Demis Hassabis]]
- [[Artificial Intelligence (AI)]]