When a diffusion model guides another task, you can cut compute cost by drawing several cheap noise samples per expensive upstream computation, rather than running one expensive computation per noise sample.
This paper introduces CARV, a framework for reducing the variance of gradient estimates when pretrained diffusion models serve as teachers in downstream tasks such as text-to-3D generation. By reusing expensive computations (such as 3D rendering) across multiple noise samples and applying importance sampling, the method achieves 2-3x speedups without changing the underlying objective.
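
The reuse idea can be illustrated with a toy sketch. This is not the paper's implementation: `render`, `teacher_residual`, `guidance_estimate`, and `empirical_variance` are hypothetical names, the "teacher" is a plain function standing in for a pretrained diffusion model, and the importance sampling component is omitted. The sketch only shows that averaging over several noise draws per render lowers the variance of the estimate while the number of expensive render calls stays fixed.

```python
import numpy as np


def render(theta, counter):
    # Hypothetical stand-in for an expensive upstream step such as 3D rendering.
    # The counter records how many times the expensive step runs.
    counter["renders"] += 1
    return np.tanh(theta)


def teacher_residual(image, noise, sigma):
    # Toy stand-in for a diffusion teacher's guidance signal (an assumption,
    # not a real model): the mean is `image - target`, and each noise draw
    # adds a zero-mean fluctuation scaled by sigma.
    target = 0.5
    return (image - target) + sigma * noise


def guidance_estimate(theta, rng, num_noise, sigma=1.0, counter=None):
    # One expensive render, reused across `num_noise` cheap noise samples.
    if counter is None:
        counter = {"renders": 0}
    image = render(theta, counter)
    noise = rng.standard_normal((num_noise,) + image.shape)
    residuals = teacher_residual(image, noise, sigma)  # broadcasts over samples
    return residuals.mean(axis=0)


def empirical_variance(num_noise, trials=2000, seed=0):
    # Repeat the estimate many times and measure its spread and render count.
    rng = np.random.default_rng(seed)
    theta = np.zeros(4)
    counter = {"renders": 0}
    estimates = np.stack(
        [guidance_estimate(theta, rng, num_noise, counter=counter) for _ in range(trials)]
    )
    return float(estimates.var(axis=0).mean()), counter["renders"]


if __name__ == "__main__":
    for k in (1, 8):
        variance, renders = empirical_variance(num_noise=k)
        print(f"noise samples per render: {k}, renders: {renders}, variance: {variance:.4f}")
```

In this toy setting the noise draws are independent, so the variance shrinks roughly in proportion to the number of samples per render. With a real teacher, each extra noise sample also costs a teacher forward pass, so the benefit depends on how that cost compares with the upstream computation.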