Tagged
Pre-Training
Transfer Learning (2): Pre-training and Fine-tuning
Why pre-training learns a powerful prior from unlabeled data and how fine-tuning adapts it to your task. Covers contrastive learning, masked language models, discriminative learning rates, layer freezing, catastrophic …