Adaptive binning that adjusts to data skewness can significantly improve decision tree and Random Forest accuracy on skewed real-world data without sacrificing the computational efficiency gains of statistical discretization.
This paper improves how decision trees discretize continuous numerical data. It introduces Adaptive MSD-Splitting (AMSD), which adjusts its binning strategy to the skewness of the data instead of using fixed cutoffs. The method keeps the O(N) running time of statistical discretization while improving accuracy by 2-4%, and it extends to Random Forests for large-scale, real-world datasets.
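
The summary does not spell out the AMSD rule, so the following is only a minimal sketch of the general idea. It assumes that the fixed cutoffs sit at the mean and at one standard deviation on either side of it, and that the adaptive step rescales those offsets when the sample skewness exceeds a threshold. The function names, the 0.5 threshold, and the 0.5/1.5 multipliers are illustrative assumptions, not values taken from the paper.

```python
import math


def adaptive_cut_points(values, skew_threshold=0.5):
    """Hypothetical sketch: mean/std cut points rescaled by sample skewness."""
    n = len(values)
    mean = sum(values) / n
    # Second and third central moments, one O(N) pass each.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    std = math.sqrt(m2)
    if std == 0.0:
        return []  # constant feature: nothing to split on
    skew = m3 / std ** 3
    # Assumed rule (not from the paper): use a wider offset on the side of
    # the long tail and a narrower one on the side where the data cluster.
    if skew > skew_threshold:        # long right tail
        left, right = 0.5, 1.5
    elif skew < -skew_threshold:     # long left tail
        left, right = 1.5, 0.5
    else:                            # roughly symmetric: fixed cutoffs
        left, right = 1.0, 1.0
    return [mean - left * std, mean, mean + right * std]


def assign_bins(values, cuts):
    """Map each value to the index of the bin it falls in (0..len(cuts))."""
    return [sum(v > c for c in cuts) for v in values]


if __name__ == "__main__":
    skewed = [1, 1, 1, 1, 10]
    cuts = adaptive_cut_points(skewed)
    print(cuts)
    print(assign_bins([0, 2, 5, 10], cuts))
```

Each step is a fixed number of passes over the data with a constant number of cut points, so the cost per feature stays linear in N, which is the property the summary attributes to the method.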