PEFT

Nov 5, 2025 NLP 18 min read

NLP (8): Model Fine-tuning and PEFT

A deep dive into Parameter-Efficient Fine-Tuning. Why LoRA's low-rank update works, the math and memory accounting behind QLoRA, how Adapters and Prefix-Tuning differ, and how to choose between them in production.

Jul 29, 2025 Standalone 22 min read

Prefix-Tuning: Optimizing Continuous Prompts for Generation

Prefix-Tuning adapts frozen LLMs by learning continuous key/value vectors injected into attention. Covers the method, reparameterization, KV-cache mechanics, and comparisons with prompt tuning, adapters, and LoRA.

Jun 18, 2025 Transfer Learning 42 min read

Transfer Learning (9): Parameter-Efficient Fine-Tuning

Derive LoRA's low-rank adaptation, the Adapter bottleneck, Prefix-Tuning, Prompt-Tuning, BitFit and QLoRA. Includes a from-scratch LoRA implementation with weight merging and a method-selection guide.

Sep 1, 2024 Standalone 26 min read

MoSLoRA: Mixture-of-Subspaces in Low-Rank Adaptation

MoSLoRA boosts LoRA expressivity by mixing multiple low-rank subspaces with a lightweight mixer. Covers when vanilla LoRA fails, mixer design choices, and tuning tips.