By training with a lightweight generator and switching to a high-capacity generator at inference, you can build reasoning-driven video models that produce cinema-quality results without prohibitive training costs.
Lumos-Nexus is a video generation system that combines reasoning capabilities with high visual quality. It uses a lightweight generator during training and progressively hands off to a high-capacity generator at inference time. This two-stage approach lets the model understand user intent and generate coherent videos without the computational cost of training against a high-capacity generator.
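
To make the handoff idea concrete, here is a minimal toy sketch in plain Python. It is not the Lumos-Nexus implementation: the text above does not say how the handoff is scheduled, so this assumes a linear ramp over the sampling steps. The names `handoff_weight`, `generate`, `light_step`, and `heavy_step` are hypothetical. The "generators" are stand-in functions on lists of floats, and the `plan` stands in for whatever conditioning the reasoning component produces.

```python
from typing import Callable, List

Latent = List[float]
# A step function takes (latent, plan) and returns an updated latent.
StepFn = Callable[[Latent, Latent], Latent]


def handoff_weight(step: int, total_steps: int) -> float:
    """Assumed schedule: linear ramp from 0.0 (lightweight only)
    to 1.0 (high-capacity only) across the sampling steps."""
    if total_steps <= 1:
        return 1.0
    return step / (total_steps - 1)


def generate(plan: Latent, light_step: StepFn, heavy_step: StepFn,
             total_steps: int = 5) -> Latent:
    """Blend the two generators' updates, shifting weight toward
    the high-capacity generator as sampling proceeds."""
    latent = [0.0] * len(plan)
    for step in range(total_steps):
        w = handoff_weight(step, total_steps)
        light_out = light_step(latent, plan)
        heavy_out = heavy_step(latent, plan)
        latent = [(1 - w) * a + w * b for a, b in zip(light_out, heavy_out)]
    return latent


# Toy stand-ins (hypothetical): the lightweight generator moves halfway
# toward the plan, the high-capacity generator reaches it in one step.
def light_step(latent: Latent, plan: Latent) -> Latent:
    return [x + 0.5 * (p - x) for x, p in zip(latent, plan)]


def heavy_step(latent: Latent, plan: Latent) -> Latent:
    return list(plan)


result = generate([1.0, 2.0], light_step, heavy_step)
```

In this sketch both generators run at every step, which only makes sense at inference time. During training, under the approach described above, only the lightweight generator would be in the loop. A hard switch at a fixed step is an equally plausible reading of "handoff" and would replace the blend with a simple conditional.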