The number of tokens (small subword units of text, typically a few characters each) a model generates in its response; higher token counts mean longer outputs, higher latency, and greater computational cost.
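As a rough illustration, token counts are often estimated with a characters-per-token heuristic (about 4 characters per token for English text). This is an assumption for sketch purposes only; real tokenizers such as BPE split text differently:

```python
def approx_token_count(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # Real tokenizers (e.g. byte-pair encoding) vary; this is only an estimate.
    return max(1, round(len(text) / 4))

print(approx_token_count("Hello, world!"))  # 13 characters -> roughly 3 tokens
```

For exact counts, use the tokenizer that ships with the specific model, since the same text can map to different token counts under different vocabularies.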