A compact but capable text model that punches above its weight class for a 4B parameter model. Nemotron 3 Nano 4B stands out with an unusually large 262K token context window, letting it handle long documents that would overwhelm similarly-sized peers. Its trade-off is the expected one for small models: complex reasoning and nuanced tasks may hit a ceiling compared to larger siblings.