The process of selectively activating only certain parts of a model for each individual token processed, rather than using the entire network every time.