Steering (modifying activations at inference time) is a fundamentally different adaptation approach from weight updates or prompting: it is reversible and local, and it requires no retraining, which makes it a practical alternative for customizing model behavior.
This paper argues that steering, the modification of a model's internal activations at inference time, should be understood as a distinct form of model adaptation, comparable to fine-tuning and prompting. The authors develop criteria for comparing steering with classical adaptation methods and propose a unified taxonomy that shows how steering enables local, reversible behavior changes without updating weights.
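To make "local, reversible, no weight updates" concrete, here is a minimal sketch in plain Python. It assumes a toy two-layer network and an additive steering vector applied to the hidden activation. Additive steering is only one common form and is not necessarily the formulation the paper uses. The names `forward`, `steer`, and `alpha` are hypothetical and chosen for illustration.

```python
from typing import List, Optional

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def relu(v: Vector) -> Vector:
    """Elementwise ReLU."""
    return [max(0.0, x) for x in v]


def forward(w1: Matrix, w2: Matrix, x: Vector,
            steer: Optional[Vector] = None, alpha: float = 1.0) -> Vector:
    """Toy two-layer network (an assumption for illustration).

    If `steer` is given, alpha * steer is added to the hidden activation
    for this call only. The weights w1 and w2 are never modified.
    """
    hidden = relu(matvec(w1, x))
    if steer is not None:
        # Local intervention: only this layer's activation is changed.
        hidden = [h + alpha * s for h, s in zip(hidden, steer)]
    return matvec(w2, hidden)


# Hypothetical weights, input, and steering direction.
W1 = [[1.0, 0.0], [0.0, 1.0]]
W2 = [[1.0, -1.0]]
x = [2.0, 1.0]
steer = [0.0, 1.0]

baseline = forward(W1, W2, x)                        # no steering
steered = forward(W1, W2, x, steer=steer, alpha=0.5) # steering applied
restored = forward(W1, W2, x)                        # steering dropped again

print(baseline, steered, restored)
```

Because the intervention is an argument to the forward pass rather than a change to `W1` or `W2`, omitting it restores the original behavior exactly. This is the sense in which steering is reversible, in contrast to fine-tuning.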