Transfer Learning (11): Cross-Lingual Transfer

Mon, 30 Jun 2025 09:00:00 +0000

English has the labels. The world has 7,000+ languages. Cross-lingual transfer is what lets a sentiment classifier trained only on English IMDB reviews score Spanish tweets, what makes a question-answering model fine-tuned on SQuAD answer Hindi questions, and what allows a model that has never seen a single labeled Swahili sentence to do passable Swahili NER.

This post derives why that is even possible. We start from the bilingual-embedding alignment that motivated the field, walk through the multilingual pretraining recipe (mBERT, XLM-R) that made parallel data optional, and end with the practical playbook — zero-shot vs translate-train vs translate-test, when to pick which, and where the wheels come off.

XLM-R on Chen Kai Blog

Transfer Learning (11): Cross-Lingual Transfer