A generative model that creates videos by iteratively refining noise into coherent frames, similar to image diffusion but applied to sequences.