You can estimate a wide MLP's expected output more efficiently than by sampling: directly propagate activation distributions layer by layer with analytic tools, which is particularly useful for detecting tail risks.
This paper presents a mathematical method for estimating what a randomly initialized neural network outputs on average, without running any data through it. Instead of Monte Carlo sampling (the standard approach), the authors use statistical tools such as cumulants and Hermite expansions to track how the activation distribution evolves at each layer.
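To make the layer-by-layer idea concrete, here is a minimal sketch of the simplest version of this style of analysis: the Gaussian variance recursion for a wide MLP at initialization, where each layer's preactivation variance is updated analytically via a one-dimensional Gaussian expectation (computed here with Gauss-Hermite quadrature). The activation choice (`tanh`), weight/bias variances, and function names are illustrative assumptions; the paper's actual method goes further, tracking higher cumulants and Hermite-expansion coefficients rather than just a variance.

```python
import numpy as np

def gauss_hermite_expectation(f, var, n=64):
    """E[f(z)] for z ~ N(0, var), via Gauss-Hermite quadrature.

    The substitution z = sqrt(2*var)*x absorbs the Gaussian density
    into the quadrature weight exp(-x^2)."""
    x, w = np.polynomial.hermite.hermgauss(n)
    return np.sum(w * f(np.sqrt(2.0 * var) * x)) / np.sqrt(np.pi)

def propagate_variance(q0, depth, sigma_w2=1.5, sigma_b2=0.1, phi=np.tanh):
    """Analytic layer-by-layer recursion for the preactivation variance
    of a wide MLP at initialization (illustrative parameter values):
        q_{l+1} = sigma_w2 * E[phi(z)^2] + sigma_b2,  z ~ N(0, q_l).
    No data is passed through the network; only the distribution is tracked."""
    q = q0
    for _ in range(depth):
        q = sigma_w2 * gauss_hermite_expectation(lambda z: phi(z) ** 2, q) + sigma_b2
    return q
```

In this simplified setting, a handful of quadrature nodes per layer replaces thousands of Monte Carlo samples, and the same deterministic machinery extends to the tail statistics that sampling estimates poorly.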