LoRA

Mar 30, 2026 LLM Engineering 52 min read

LLM Engineering (4): Post-training — SFT, DPO, RLHF, RLAIF

What SFT, DPO, RLHF, and RLAIF each actually optimize, when reward models fail, KL constraints, the LoRA-vs-full-FT debate, and the production post-training recipes that ship in 2026.

Nov 5, 2025 NLP 18 min read

NLP (8): Model Fine-tuning and PEFT

A deep dive into Parameter-Efficient Fine-Tuning. Why LoRA's low-rank update works, the math and memory accounting behind QLoRA, how Adapters and Prefix-Tuning differ, and how to choose between them in production.

Jun 18, 2025 Transfer Learning 42 min read

Transfer Learning (9): Parameter-Efficient Fine-Tuning

Derive LoRA's low-rank adaptation, the Adapter bottleneck, Prefix-Tuning, Prompt-Tuning, BitFit and QLoRA. Includes a from-scratch LoRA implementation with weight merging and a method-selection guide.

May 7, 2025 Transfer Learning 54 min read

Transfer Learning (2): Pre-training and Fine-tuning

Why pre-training learns a powerful prior from unlabeled data and how fine-tuning adapts it to your task. Covers contrastive learning, masked language models, discriminative learning rates, layer freezing, catastrophic …