Error feedback algorithms can recover optimal convergence rates in distributed learning with compressed gradients when step sizes are chosen carefully, as shown by an analysis tailored to each algorithm. This gives practitioners principled guidance on when and how to use compression.
This paper analyzes how to compress gradient information efficiently in distributed machine learning by studying error feedback mechanisms, techniques that track and correct compression errors. The authors prove tight convergence guarantees for two popular error feedback algorithms, showing that they can match the optimal convergence rate even when gradients are heavily compressed.
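To make "track and correct compression errors" concrete, here is a minimal sketch of the classic error feedback loop. It rests on several assumptions: the summary does not name the two algorithms the paper analyzes, so this may not match either of them exactly; the top-k compressor, the function names `top_k` and `ef_sgd`, the step size, and the quadratic test objective are all illustrative choices rather than anything taken from the paper.

```python
def top_k(vec, k):
    """Illustrative compressor: keep the k largest-magnitude entries, zero the rest."""
    keep = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)[:k]
    out = [0.0] * len(vec)
    for i in keep:
        out[i] = vec[i]
    return out


def ef_sgd(grad_fns, x0, step_size, k, num_steps):
    """Classic error feedback (hypothetical sketch), one gradient function per worker.

    Each worker stores the part of its update that compression dropped and
    adds it back before compressing in the next round.
    """
    n, d = len(grad_fns), len(x0)
    x = list(x0)
    errors = [[0.0] * d for _ in range(n)]  # per-worker compression residuals
    for _ in range(num_steps):
        total = [0.0] * d
        for w, grad in enumerate(grad_fns):
            g = grad(x)
            # Correct: re-inject the residual left over from earlier rounds.
            corrected = [errors[w][j] + step_size * g[j] for j in range(d)]
            msg = top_k(corrected, k)  # only this sparse message is sent
            # Track: remember what the compressor threw away this round.
            errors[w] = [corrected[j] - msg[j] for j in range(d)]
            for j in range(d):
                total[j] += msg[j]
        # Server averages the compressed messages and takes the step.
        x = [x[j] - total[j] / n for j in range(d)]
    return x, errors


if __name__ == "__main__":
    # Assumed toy objective: f(x) = 0.5 * ||x - b||^2, so grad f(x) = x - b.
    b = [1.0, -2.0, 3.0]
    grad = lambda x: [x[j] - b[j] for j in range(len(b))]
    x, _ = ef_sgd([grad], [0.0, 0.0, 0.0], step_size=0.1, k=1, num_steps=2000)
    print(x)  # approaches b although only one coordinate is sent per round
```

The design point is that a dropped coordinate is delayed rather than lost: the residual accumulates until that coordinate is large enough to be selected. The step size here was picked by hand for this toy problem and is not the choice the paper derives.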