
Sep 5, 2025 | Reinforcement Learning | 28 min read

Reinforcement Learning (8): AlphaGo and Monte Carlo Tree Search

Traces the path from Monte Carlo Tree Search (MCTS) to AlphaGo, AlphaGo Zero, AlphaZero, and MuZero. Covers how UCT balances exploration and exploitation, how self-play training works, and how MuZero plans with a learned model. Includes a complete AlphaZero implementation for Gomoku.