Nemotron Cascade 2 30B A3B is a hybrid dense-MoE model: of its 30 billion total parameters, only about 3 billion are active per forward pass, which keeps inference cost far below that of a dense model of the same total size. It also handles extremely long contexts (up to 262K tokens), which suits tasks involving large documents or extended conversations. The trade-off is that with only a fraction of the network applied to each token, per-token quality can be less consistent than that of a fully dense model of comparable total size.
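The "30B total, 3B active" arithmetic comes from top-k expert routing: a router scores all experts per token, but only the k highest-scoring ones actually run. The sketch below is a generic, minimal illustration of that mechanism, not this model's actual architecture; the layer sizes, expert count, and top-k value are made up for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes, chosen for illustration only (not the model's real config).
d_model, n_experts, top_k = 8, 16, 2

# Each "expert" is a small feed-forward weight matrix; the router maps
# a token vector to one logit per expert.
experts = rng.standard_normal((n_experts, d_model, d_model)) * 0.1
router_w = rng.standard_normal((d_model, n_experts)) * 0.1

def moe_forward(x):
    """Route each token to its top-k experts and mix their outputs.

    Only top_k of n_experts run per token, so active expert parameters
    are roughly top_k / n_experts of the expert total -- the same idea
    that lets a 30B-total model activate ~3B parameters per pass.
    """
    logits = x @ router_w                             # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]     # chosen expert indices
    # Softmax over the selected logits only, to get mixing weights.
    sel = np.take_along_axis(logits, top, axis=-1)
    w = np.exp(sel - sel.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):                       # per-token dispatch (clear, not fast)
        for slot in range(top_k):
            e = top[t, slot]
            out[t] += w[t, slot] * (x[t] @ experts[e])
    return out, top

tokens = rng.standard_normal((4, d_model))
y, chosen = moe_forward(tokens)
print(y.shape, chosen.shape)  # (4, 8) (4, 2)
```

Because each token touches only 2 of the 16 expert matrices here, the per-token compute is 1/8 of what running every expert would cost, even though all 16 sets of weights exist in memory; that is the dense-storage, sparse-compute trade the paragraph above describes.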