NLP

Modern NLP: language models, embeddings, transformers, and beyond.

12 articles

  1. NLP (1): Introduction and Text Preprocessing

    A first-principles introduction to NLP and text preprocessing. We trace the four eras of the field, build the cleaning …

    34 min
  2. NLP (2): Word Embeddings and Language Models

    Understand how Word2Vec, GloVe, and FastText turn words into vectors that capture meaning. Learn the math, train your …

    16 min
  3. NLP (3): RNN and Sequence Modeling

    How RNNs, LSTMs, and GRUs process sequences with memory. We derive vanishing gradients from first principles, build a …

    30 min
  4. NLP (4): Attention Mechanism and Transformer

    From the bottleneck of Seq2Seq to Attention Is All You Need. Build intuition for scaled dot-product attention, …

    34 min
  5. NLP (5): BERT and Pretrained Models

    How BERT made bidirectional pretraining the default in NLP. We unpack the architecture, the 80/10/10 masking rule, …

    32 min
  6. NLP (6): GPT and Generative Language Models

    From GPT-1 to GPT-4: understand autoregressive language modeling, decoding strategies (greedy, beam search, top-k, …

    32 min
  7. NLP (7): Prompt Engineering and In-Context Learning

    From prompt anatomy to chain-of-thought, self-consistency and ReAct: a working theory of in-context learning, the …

    36 min
  8. NLP (8): Model Fine-tuning and PEFT

    A deep dive into Parameter-Efficient Fine-Tuning. Why LoRA's low-rank update works, the math and memory accounting …

    18 min
  9. NLP (9): Deep Dive into LLM Architecture

    Inside modern LLMs: pre-norm + RMSNorm + SwiGLU + RoPE + GQA, KV cache mechanics, FlashAttention's IO-aware schedule, …

    32 min
  10. NLP (10): RAG and Knowledge Enhancement Systems

    Build production-grade RAG systems from first principles: the retrieve-then-generate decomposition, vector indexes …

    34 min
  11. NLP (11): Multimodal Large Language Models

    A deep dive into multimodal LLMs: contrastive vision-language pre-training with CLIP, parameter-efficient bridging with …

    32 min
  12. NLP (12): Frontiers and Practical Applications

    Series finale: agents and tool use (Function Calling, ReAct), code generation (Code Llama, Codex), long-context …

    36 min