Tagged

Vanishing Gradients

Feb 7, 2026 ML Math Derivations 12 min read

ML Math Derivations (19): Neural Networks and Backpropagation

How does a neural network learn? This article derives forward propagation, the chain rule mechanics of backpropagation, vanishing/exploding gradients, and initialization strategies (Xavier, He).