Reinforcement Learning (9): Multi-Agent Reinforcement Learning

Wed, 10 Sep 2025 09:00:00 +0000

Single-agent RL rests on one quiet but enormous assumption: the environment is stationary. The transition kernel does not change while the agent learns. The moment a second learner shares the world, that assumption collapses. Each agent now sees an environment whose dynamics shift as its peers update, rewards become entangled across agents, and the joint action space explodes combinatorially. These are not engineering nuisances. They are the reason multi-agent RL needs its own algorithms instead of just running DQN n times in parallel.
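To make two of these claims concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not code from any particular MARL library: the helper names `joint_action_count` and `expected_reward`, and the 2x2 coordination payoff, are hypothetical and chosen only for this example. The first function counts joint actions when every agent has the same number of individual actions. The second shows that the value of a fixed action for one agent changes as soon as its peer's policy moves, which is what non-stationarity looks like from a single learner's seat.

```python
def joint_action_count(num_actions: int, num_agents: int) -> int:
    """Size of the joint action space when each agent has num_actions choices."""
    return num_actions ** num_agents


def expected_reward(payoff, my_action, peer_policy):
    """Agent 0's expected reward for my_action, averaged over the peer's policy.

    payoff[i][j] is agent 0's reward when it plays i and the peer plays j.
    peer_policy[j] is the probability that the peer plays action j.
    """
    return sum(p * payoff[my_action][j] for j, p in enumerate(peer_policy))


# Hypothetical 2x2 coordination game (an assumption for illustration):
# agent 0 earns 1 when both agents choose the same action, else 0.
PAYOFF = [
    [1.0, 0.0],
    [0.0, 1.0],
]

if __name__ == "__main__":
    # Joint action space grows exponentially in the number of agents.
    for n in (1, 2, 5, 10):
        print(f"{n} agents, 5 actions each: {joint_action_count(5, n)} joint actions")

    # Same action for agent 0, but its value shifts as the peer keeps learning.
    print(expected_reward(PAYOFF, 0, [0.9, 0.1]))  # peer mostly plays action 0
    print(expected_reward(PAYOFF, 0, [0.1, 0.9]))  # peer has drifted to action 1
```

From agent 0's point of view nothing about its own action changed between the last two calls, yet the return it can expect did. An independent learner that treats the peer as part of the environment is therefore chasing a moving target.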

QMIX on Chen Kai Blog
