Tagged: Model Compression
Transfer Learning (5): Knowledge Distillation
Compress large teacher models into small student models without losing much accuracy. Covers dark knowledge, temperature scaling, response-based / feature-based / relation-based distillation, self-distillation, and a …
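For orientation, the classic response-based recipe (Hinton et al., 2015) trains the student on a blend of two signals: a KL-divergence loss against the teacher's temperature-softened logits and an ordinary cross-entropy loss on the hard labels. A minimal PyTorch sketch follows; the function name, `T`, and `alpha` are illustrative choices, not taken from the post itself.

```python
# Sketch of response-based knowledge distillation with temperature scaling.
# Hypothetical helper; hyperparameters T and alpha are illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend soft-target KL loss (temperature T) with hard-label cross-entropy."""
    # Soften both distributions with temperature T. The T*T factor rescales
    # gradients so the soft-loss magnitude stays comparable across temperatures.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Standard supervised loss on the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

A higher temperature flattens the teacher's output distribution, exposing the "dark knowledge" in the relative probabilities of the wrong classes, which is exactly what the student learns from.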