PEFT
NLP (8): Model Fine-tuning and PEFT
A deep dive into Parameter-Efficient Fine-Tuning. Why LoRA's low-rank update works, the math and memory accounting behind QLoRA, how Adapters and Prefix-Tuning differ, and how to choose between them in production.
Transfer Learning (9): Parameter-Efficient Fine-Tuning
Derive LoRA's low-rank adaptation, the Adapter bottleneck, Prefix-Tuning, Prompt-Tuning, BitFit and QLoRA. Includes a from-scratch LoRA implementation with weight merging and a method-selection guide.
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Prefix-Tuning adapts frozen LLMs by learning continuous key/value vectors injected into attention. Covers the method, reparameterization, KV-cache mechanics, and comparisons with prompt tuning, adapters, and LoRA.
MoSLoRA: Mixture-of-Subspaces in Low-Rank Adaptation
MoSLoRA boosts LoRA expressivity by mixing multiple low-rank subspaces with a lightweight mixer. Covers when vanilla LoRA fails, mixer design choices, and tuning tips.