You can improve cross-lingual knowledge transfer by strategically replacing words in high-resource training data with their translations, without parallel data, translation models, or extra training.
This paper proposes LINK, a simple method for improving multilingual language models on low-resource languages by swapping English words with their translations during pretraining. The approach requires only a bilingual dictionary and no extra training, yet it achieves significant performance gains on downstream tasks across eight languages.
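
The dictionary-based swapping idea can be illustrated with a minimal sketch. The summary does not say how LINK chooses which words to replace, so this example assumes plain random replacement at a fixed probability; the function name `swap_words`, the parameter `swap_prob`, and the toy English-to-Swahili entries are all hypothetical and are not taken from the paper.

```python
import random
import re


def swap_words(text, bilingual_dict, swap_prob=0.3, seed=None):
    """Replace dictionary words in `text` with their translations.

    Assumption: each matching word is swapped independently with
    probability `swap_prob`. The paper's actual selection strategy
    may differ.
    """
    rng = random.Random(seed)
    # Split into alternating word and non-word runs so that
    # whitespace and punctuation are preserved exactly.
    tokens = re.findall(r"\w+|\W+", text)
    out = []
    for tok in tokens:
        # Lookup is case-insensitive; the translation is inserted as
        # stored in the dictionary, so original casing is not kept.
        translation = bilingual_dict.get(tok.lower())
        if translation is not None and rng.random() < swap_prob:
            out.append(translation)
        else:
            out.append(tok)
    return "".join(out)


# Toy dictionary for illustration only (hypothetical entries).
toy_dict = {"cat": "paka", "water": "maji"}
mixed = swap_words("The cat drinks water.", toy_dict, swap_prob=1.0)
```

With `swap_prob=1.0` every word found in the dictionary is replaced, and with `swap_prob=0.0` the text is returned unchanged. A real pipeline would apply this kind of function to the English portion of the pretraining corpus before tokenization.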