Tagged: NLP

Nov 25, 2025 NLP 18 min read

NLP (12): Frontiers and Practical Applications

Series finale: agents and tool use (Function Calling, ReAct), code generation (Code Llama, Codex), long-context attention (Longformer, Infini-attention), reasoning models (o1, R1), safety and alignment, evaluation, and …

Nov 20, 2025 NLP 17 min read

NLP (11): Multimodal Large Language Models

A deep dive into multimodal LLMs: contrastive vision-language pre-training with CLIP, parameter-efficient bridging with BLIP-2's Q-Former, visual instruction tuning with LLaVA, robust speech recognition with Whisper, …

Nov 15, 2025 NLP 16 min read

NLP (10): RAG and Knowledge Enhancement Systems

Build production-grade RAG systems from first principles: the retrieve-then-generate decomposition, vector indexes (FAISS / Milvus / Chroma / Weaviate / Pinecone), dense+sparse hybrid retrieval with RRF, cross-encoder …

Nov 10, 2025 NLP 17 min read

NLP (9): Deep Dive into LLM Architecture

Inside modern LLMs: pre-norm + RMSNorm + SwiGLU + RoPE + GQA, KV cache mechanics, FlashAttention's IO-aware schedule, sparse Mixture-of-Experts, and INT8 / INT4 quantization.

Nov 5, 2025 NLP 15 min read

NLP (8): Model Fine-tuning and PEFT

A deep dive into Parameter-Efficient Fine-Tuning. Why LoRA's low-rank update works, the math and memory accounting behind QLoRA, how Adapters and Prefix-Tuning differ, and how to choose between them in production.

Oct 31, 2025 NLP 18 min read

NLP (7): Prompt Engineering and In-Context Learning

From prompt anatomy to chain-of-thought, self-consistency and ReAct: a working theory of in-context learning, the variance you have to fight, and the patterns that scale to real systems.

Oct 26, 2025 NLP 6 min read

NLP Part 6: GPT and Generative Language Models

From GPT-1 to GPT-4: understand autoregressive language modeling, decoding strategies (greedy, beam search, top-k, top-p), in-context learning, and build a chatbot with HuggingFace.

Oct 21, 2025 NLP 16 min read

NLP Part 5: BERT and Pretrained Models

How BERT made bidirectional pretraining the default in NLP. We unpack the architecture, the 80/10/10 masking rule, fine-tuning recipes, and the RoBERTa/ALBERT/ELECTRA family with HuggingFace code.

Oct 16, 2025 NLP 18 min read

NLP Part 4: Attention Mechanism and Transformer

From the bottleneck of Seq2Seq to Attention Is All You Need. Build intuition for scaled dot-product attention, multi-head attention, positional encoding, masking, and assemble a complete Transformer in PyTorch.

Oct 11, 2025 NLP 8 min read

NLP Part 3: RNN and Sequence Modeling

How RNNs, LSTMs, and GRUs process sequences with memory. We derive vanishing gradients from first principles, build a character-level text generator, and implement a Seq2Seq translator in PyTorch.

Oct 6, 2025 NLP 16 min read

NLP Part 2: Word Embeddings and Language Models

Understand how Word2Vec, GloVe, and FastText turn words into vectors that capture meaning. Learn the math, train your own embeddings with Gensim, and connect embeddings to language models.

Oct 1, 2025 NLP 18 min read

NLP Part 1: Introduction and Text Preprocessing

A first-principles introduction to NLP and text preprocessing. We trace the four eras of the field, build the cleaning to vectorization pipeline by hand, and unpack the math behind tokenization, TF-IDF, n-grams, and …

Sep 20, 2023 Standalone 6 min read

Position Encoding Brief: From Sinusoidal to RoPE and ALiBi

A practitioner's tour of Transformer position encoding: why attention needs it at all, how sinusoidal/learned/relative/RoPE/ALiBi schemes differ, and which one to pick when long-context extrapolation matters.

Aug 5, 2022 Standalone 21 min read

Multimodal LLMs and Downstream Tasks: A Practitioner's Guide

End-to-end map of multimodal LLMs: vision-language alignment, cross-modal fusion, the CLIP/BLIP/LLaVA families, downstream tasks (VQA, captioning, grounding, OCR), fine-tuning trade-offs, benchmarks, and what it takes to …