When training on decentralized, non-uniform (heterogeneous) data, prefer Masked Image Modeling over Contrastive Learning: the paper's theory shows it is more robust to heterogeneity. Better network connectivity improves robustness, so federated learning is a viable alternative to fully decentralized systems.
This paper analyzes how distributed self-supervised learning systems cope when data is non-uniformly distributed across devices. The authors prove that Masked Image Modeling is more robust to this data heterogeneity than Contrastive Learning, and that federated learning performs as well as fully decentralized approaches. They also introduce the MAR loss, a practical improvement that aligns each device's local representations with the global representations.
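
To make the idea of aligning local and global representations concrete, here is a minimal sketch. It is not the paper's MAR loss, whose exact form is not given in this summary. It assumes, purely for illustration, that alignment is a mean squared distance between a device's representation and the global model's representation of the same input, added to the local self-supervised loss with a weight. The names `alignment_penalty`, `aligned_loss`, and `weight` are hypothetical.

```python
from typing import Sequence


def alignment_penalty(local_repr: Sequence[float], global_repr: Sequence[float]) -> float:
    """Mean squared distance between a local and a global representation.

    Assumption: squared L2 distance stands in for whatever alignment
    measure the paper actually uses.
    """
    if len(local_repr) != len(global_repr):
        raise ValueError("representations must have the same dimension")
    if len(local_repr) == 0:
        raise ValueError("representations must be non-empty")
    squared_diffs = ((l - g) ** 2 for l, g in zip(local_repr, global_repr))
    return sum(squared_diffs) / len(local_repr)


def aligned_loss(
    local_ssl_loss: float,
    local_repr: Sequence[float],
    global_repr: Sequence[float],
    weight: float = 0.1,  # hypothetical trade-off coefficient
) -> float:
    """Local self-supervised loss plus a weighted alignment penalty."""
    return local_ssl_loss + weight * alignment_penalty(local_repr, global_repr)


if __name__ == "__main__":
    # A device whose representation has drifted from the global one
    # pays a larger penalty than a device that matches it exactly.
    print(aligned_loss(0.5, [1.0, 3.0], [1.0, 1.0], weight=0.5))
    print(aligned_loss(0.5, [1.0, 1.0], [1.0, 1.0], weight=0.5))
```

In this sketch the penalty is zero when the local and global representations coincide and grows as a device's representation drifts, which is the intuition behind using an alignment term to counter heterogeneity across devices.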