NLP (6): GPT and Generative Language Models

Sun, 26 Oct 2025 09:00:00 +0000

When you ask ChatGPT a question and a fluent multi-paragraph answer streams back token by token, you are watching a single deceptively simple loop: feed everything so far into a Transformer decoder, look at the probability distribution it produces over the vocabulary, pick one token, append it, and repeat. That is all an autoregressive language model does at inference time. The miracle is not the loop; it is what happens when you scale the network behind the loop to hundreds of billions of parameters and train it on a large fraction of the public internet.
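To make the loop concrete, here is a minimal sketch in plain Python. Everything named in it is an assumption for illustration: `VOCAB` is a five-word toy vocabulary, and `toy_logits` is a hypothetical stand-in for the Transformer decoder that only looks at the last token (a real decoder attends to the whole context). The point is the shape of the loop, not the model behind it.

```python
import math
import random

# Toy vocabulary (assumption for illustration); id 0 marks end of sequence.
VOCAB = ["<eos>", "the", "cat", "sat", "down"]


def toy_logits(context):
    """Hypothetical stand-in for a Transformer decoder.

    Returns one score per vocabulary entry. It simply favours the token
    that follows the last one in VOCAB, wrapping around to <eos>.
    """
    preferred = (context[-1] + 1) % len(VOCAB)
    return [4.0 if i == preferred else 0.0 for i in range(len(VOCAB))]


def softmax(logits):
    """Turn raw scores into a probability distribution over the vocabulary."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generate(prompt_ids, max_new_tokens=10, greedy=True, seed=0):
    """Autoregressive loop: score, pick one token, append, repeat."""
    rng = random.Random(seed)
    tokens = list(prompt_ids)
    for _ in range(max_new_tokens):
        probs = softmax(toy_logits(tokens))  # distribution over the next token
        if greedy:
            # Greedy decoding: take the most probable token.
            next_id = max(range(len(probs)), key=probs.__getitem__)
        else:
            # Sampling: draw a token in proportion to its probability.
            next_id = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        tokens.append(next_id)  # the new token becomes part of the context
        if next_id == 0:  # stop once <eos> is produced
            break
    return tokens
```

With greedy decoding, `generate([1])` starts from "the" and returns the ids for "the cat sat down <eos>". Passing `greedy=False` samples from the distribution instead, so less likely tokens can appear; that choice of how to pick the token is the only part of the loop that varies between decoding strategies.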

Language Models on Chen Kai Blog
