XLMR Across Time

This page hosts the intermediate pretraining checkpoints from the EMNLP 2022 paper Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models.

There are 39 Fairseq model checkpoints (3GB each) saved at different training steps of the XLMR-replica pretraining run, which replicates how XLMR-base was trained; the differences between the two pretraining schemes are detailed in our paper. We also provide a Python script to convert the checkpoints into the Huggingface format. If you use these resources in your own work, please cite the corresponding paper.
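Once a checkpoint has been converted, it can be used like any other Huggingface model. The sketch below is a minimal, assumption-laden example: the directory name `xlmr-replica-step-5000` is a hypothetical placeholder for the conversion script's output, and it assumes the XLMR-replica checkpoints use the standard XLM-R sentencepiece vocabulary (so the `xlm-roberta-base` tokenizer can be reused).

```python
# Minimal sketch: loading a converted XLMR-replica checkpoint with Huggingface Transformers.
# "xlmr-replica-step-5000" is a hypothetical path; point it at the output of the
# provided conversion script.
import torch
from transformers import XLMRobertaForMaskedLM, XLMRobertaTokenizer

checkpoint_dir = "xlmr-replica-step-5000"  # hypothetical converted-checkpoint directory

# Assumes the replica shares XLM-R's tokenizer/vocabulary.
tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
model = XLMRobertaForMaskedLM.from_pretrained(checkpoint_dir)
model.eval()

# Example: masked-token prediction with an intermediate checkpoint.
text = "The capital of France is <mask>."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Find the masked position and print the top-5 predicted tokens.
mask_positions = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = logits[0, mask_positions[0]].topk(5).indices.tolist()
print(tokenizer.convert_ids_to_tokens(top_ids))
```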

Models

Contact

For any questions or comments about the checkpoints or other aspects of this project, please contact Terra Blevins at blvns@cs.washington.edu.