Lumberjack makes differentially private random forests practical on sensitive data: its pruning strategy significantly narrows the performance gap between private and non-private models.
Lumberjack is a differentially private random forest algorithm that first builds large decision trees and then prunes them in a way that protects sensitive data. By using a novel heavy hitter detection method, it can grow deeper trees than previous approaches while maintaining its privacy guarantees, which yields much better accuracy on real datasets.
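
To make the "build large, then prune" idea concrete, here is a minimal sketch of pruning a tree with noisy counts. It is not Lumberjack's actual heavy hitter detection method, which this summary does not describe. It swaps in a generic Laplace-noise-plus-threshold rule and assumes that the tree structure is chosen independently of the data (for example, random splits), that each sample reaches exactly one node per level, and that the privacy budget is split evenly across levels. The names `node_counts`, `prune_by_noisy_counts`, `epsilon`, and `threshold` are hypothetical, and the code is an illustration rather than a vetted privacy implementation.

```python
import random
from typing import Dict, List, Set, Tuple

# A node is identified by its root-to-node path: 0 = left child, 1 = right child.
Path = Tuple[int, ...]


def laplace(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def node_counts(leaf_of_sample: List[Path], depth: int) -> Dict[Path, int]:
    """Count how many samples pass through each node, given each sample's leaf."""
    counts: Dict[Path, int] = {}
    for leaf in leaf_of_sample:
        for d in range(depth + 1):
            prefix = leaf[:d]
            counts[prefix] = counts.get(prefix, 0) + 1
    return counts


def prune_by_noisy_counts(
    counts: Dict[Path, int],
    depth: int,
    epsilon: float,
    threshold: float,
    rng: random.Random,
) -> Set[Path]:
    """Keep a node only if its noisy count reaches `threshold` (top-down).

    Assumption: one sample changes one count per level by at most 1, so each
    level is given epsilon / depth of the budget (Laplace scale = depth / epsilon).
    """
    scale = depth / epsilon
    kept: Set[Path] = {()}  # the root is always kept
    frontier: List[Path] = [()]
    for _ in range(depth):
        next_frontier: List[Path] = []
        for node in frontier:
            for bit in (0, 1):
                child = node + (bit,)
                noisy = counts.get(child, 0) + laplace(scale, rng)
                if noisy >= threshold:
                    kept.add(child)
                    next_frontier.append(child)
        frontier = next_frontier
    return kept


# Usage: 100 samples fall into one leaf and a single sample into another.
leaves = [(0, 0, 0)] * 100 + [(1, 1, 1)]
counts = node_counts(leaves, depth=3)
kept = prune_by_noisy_counts(counts, depth=3, epsilon=1.0, threshold=10.0,
                             rng=random.Random(0))
```

Because children of a pruned node are never explored, sparsely populated branches are cut early, and only well-populated paths survive to full depth. This is the intuition behind why a pruning step can allow deeper trees, as the summary describes.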