Reinforcement Learning (11): Hierarchical RL and Meta-Learning

Sat, 20 Sep 2025 09:00:00 +0000

Standard RL treats every problem as a flat sequence of atomic decisions: observe state, pick an action, receive a reward, repeat. That works when the horizon is short and rewards are dense, but it breaks down on long-horizon, sparse-reward tasks of the kind humans solve effortlessly. “Make breakfast” is not one decision; it is a tree of subtasks (brew coffee, fry eggs, toast bread, plate it up), each of which can be handled by its own small policy. Hierarchical RL (HRL) lets agents reason and act at multiple timescales by treating macro-actions as first-class citizens.

[Figure: Options framework]
