Tagged: NLP

Nov 25, 2025 NLP 18 min read

NLP (12): Frontiers and Practical Applications

Series finale: agents and tool use (Function Calling, ReAct), code generation (Code Llama, Codex), long-context attention (Longformer, Infini-attention), reasoning models (o1, R1), safety and alignment, evaluation, and …

Nov 20, 2025 NLP 17 min read

NLP (11): Multimodal Large Language Models

A deep dive into multimodal LLMs: contrastive vision-language pre-training with CLIP, parameter-efficient bridging with BLIP-2's Q-Former, visual instruction tuning with LLaVA, robust speech recognition with Whisper, …

Nov 15, 2025 NLP 16 min read

NLP (10): RAG and Knowledge Enhancement Systems

Build production-grade RAG systems from first principles: the retrieve-then-generate decomposition, vector indexes (FAISS / Milvus / Chroma / Weaviate / Pinecone), dense+sparse hybrid retrieval with RRF, cross-encoder …

Nov 10, 2025 NLP 17 min read

NLP (9): Deep Dive into LLM Architecture

Inside modern LLMs: pre-norm + RMSNorm + SwiGLU + RoPE + GQA, KV cache mechanics, FlashAttention's IO-aware schedule, sparse Mixture-of-Experts, and INT8 / INT4 quantization.

Nov 5, 2025 NLP 15 min read

NLP (8): Model Fine-tuning and PEFT

A deep dive into Parameter-Efficient Fine-Tuning. Why LoRA's low-rank update works, the math and memory accounting behind QLoRA, how Adapters and Prefix-Tuning differ, and how to choose between them in production.

Oct 31, 2025 NLP 18 min read

NLP (7): Prompt Engineering and In-Context Learning

From prompt anatomy to chain-of-thought, self-consistency and ReAct: a working theory of in-context learning, the variance you have to fight, and the patterns that scale to real systems.

Oct 26, 2025 NLP 6 min read

NLP Part 6: GPT and Generative Language Models

From GPT-1 to GPT-4: understand autoregressive language modeling, decoding strategies (greedy, beam search, top-k, top-p), in-context learning, and build a chatbot with HuggingFace.

Oct 21, 2025 NLP 16 min read

NLP Part 5: BERT and Pretrained Models

How BERT made bidirectional pretraining the default in NLP. We unpack the architecture, the 80/10/10 masking rule, fine-tuning recipes, and the RoBERTa/ALBERT/ELECTRA family with HuggingFace code.

Oct 16, 2025 NLP 18 min read

NLP Part 4: Attention Mechanism and Transformer

From the bottleneck of Seq2Seq to Attention Is All You Need. Build intuition for scaled dot-product attention, multi-head attention, positional encoding, masking, and assemble a complete Transformer in PyTorch.

Oct 11, 2025 NLP 8 min read

NLP Part 3: RNN and Sequence Modeling

How RNNs, LSTMs, and GRUs process sequences with memory. We derive vanishing gradients from first principles, build a character-level text generator, and implement a Seq2Seq translator in PyTorch.

Oct 6, 2025 NLP 16 min read

NLP Part 2: Word Embeddings and Language Models

Understand how Word2Vec, GloVe, and FastText turn words into vectors that capture meaning. Learn the math, train your own embeddings with Gensim, and connect embeddings to language models.

Oct 1, 2025 NLP 18 min read

NLP Part 1: Introduction and Text Preprocessing

A first-principles introduction to NLP and text preprocessing. We trace the four eras of the field, build the cleaning to vectorization pipeline by hand, and unpack the math behind tokenization, TF-IDF, n-grams, and …

Sep 20, 2023 Standalone 6 min read

Position Encoding Brief: From Sinusoidal to RoPE and ALiBi

A practitioner's tour of Transformer position encoding: why attention needs it at all, how sinusoidal/learned/relative/RoPE/ALiBi schemes differ, and which one to pick when long-context extrapolation matters.

Aug 5, 2022 Standalone 21 min read

Multimodal LLMs and Downstream Tasks: A Practitioner's Guide

End-to-end map of multimodal LLMs: vision-language alignment, cross-modal fusion, the CLIP/BLIP/LLaVA families, downstream tasks (VQA, captioning, grounding, OCR), fine-tuning trade-offs, benchmarks, and what it takes to …