Series

Transfer Learning

Jul 6, 2025 Transfer Learning 13 min read

Transfer Learning (12): Industrial Applications and Best Practices

Series finale. A field guide to shipping transfer learning to production: when to use it, the end-to-end pipeline, compute and dollar economics, four landmark case studies, A/B testing, distribution-shift monitoring, and …

Jun 30, 2025 Transfer Learning 12 min read

Transfer Learning (11): Cross-Lingual Transfer

Derive cross-lingual transfer from bilingual word-embedding alignment to multilingual pretraining (mBERT, XLM-R). Covers zero-shot transfer, translate-train vs. translate-test, pivot strategies, subword anchors, the …

Jun 24, 2025 Transfer Learning 2 min read

Transfer Learning (10): Continual Learning

Derive catastrophic forgetting from gradient interference and the Fisher information matrix. Covers EWC, MAS, LwF, replay (ER/A-GEM), dynamic architectures, the three CL scenarios, FWT/BWT metrics, and a from-scratch EWC …

Jun 18, 2025 Transfer Learning 11 min read

Transfer Learning (9): Parameter-Efficient Fine-Tuning

Derive LoRA's low-rank adaptation, the Adapter bottleneck, Prefix-Tuning, Prompt-Tuning, BitFit, and QLoRA. Includes a from-scratch LoRA implementation with weight merging and a method-selection guide.

Jun 12, 2025 Transfer Learning 12 min read

Transfer Learning (8): Multimodal Transfer

Derive contrastive learning (InfoNCE), CLIP's vision-language pretraining, BLIP-2's Q-Former bridge to LLMs, cross-modal alignment, and multimodal fusion strategies. Includes a from-scratch CLIP implementation in PyTorch.

Jun 6, 2025 Transfer Learning 13 min read

Transfer Learning (7): Zero-Shot Learning

A first-principles tour of zero-shot learning: attribute prototypes (DAP), compatibility functions, DeViSE, generative ZSL with f-CLSWGAN, the GZSL bias problem and calibration, and CLIP-style vision-language …

May 31, 2025 Transfer Learning 20 min read

Transfer Learning (6): Multi-Task Learning

Train one model on multiple tasks simultaneously. Covers hard vs. soft parameter sharing, gradient conflicts (PCGrad, GradNorm, CAGrad), auxiliary task design, and a complete multi-task framework with dynamic weight …

May 25, 2025 Transfer Learning 15 min read

Transfer Learning (5): Knowledge Distillation

Compress large teacher models into small student models without losing much accuracy. Covers dark knowledge, temperature scaling, response-based / feature-based / relation-based distillation, self-distillation, and a …

May 19, 2025 Transfer Learning 15 min read

Transfer Learning (4): Few-Shot Learning

Learn new concepts from a handful of examples. Covers the N-way K-shot protocol, metric learning (Siamese, Prototypical, Matching, Relation networks), meta-learning (MAML, Reptile), episodic training, miniImageNet …

May 13, 2025 Transfer Learning 17 min read

Transfer Learning (3): Domain Adaptation

A practical guide to domain adaptation: covariate shift, label shift, DANN with gradient reversal, MMD alignment, CORAL, self-training, AdaBN, and a complete DANN implementation.

May 7, 2025 Transfer Learning 18 min read

Transfer Learning (2): Pre-training and Fine-tuning

Why pre-training learns a powerful prior from unlabeled data and how fine-tuning adapts it to your task. Covers contrastive learning, masked language models, discriminative learning rates, layer freezing, catastrophic …

May 1, 2025 Transfer Learning 17 min read

Transfer Learning (1): Fundamentals and Core Concepts

A beginner-friendly guide to transfer learning fundamentals: why it works, formal definitions, taxonomy, negative transfer, and a complete feature-transfer implementation with MMD domain adaptation.