Contextual Linear Activation Steering of Language Models

Brandon Hsu, Daniel Beaglehole, Adityanarayanan Radhakrishnan, Mikhail Belkin | April 27, 2026 | arXiv

Key Takeaway

Adapting the steering strength to each context gives significantly better control over LLM behavior than fixed-strength steering, matching more complex methods such as LoRA while remaining simpler and more interpretable.

Summary

This paper improves linear activation steering, a technique for controlling LLM behavior by adding a direction vector to the model's internal activations. Instead of applying one fixed strength to all tokens, the proposed method, CLAS, adapts the steering strength to each input context. It outperforms existing approaches across multiple benchmarks and models, offering a practical way to customize LLMs with limited training data.
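To make the contrast between fixed and context-dependent strength concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the summary does not say how CLAS computes its strength, so the linear strength function, the names `steer_fixed`, `steer_contextual`, and `make_linear_strength`, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import Callable, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def steer_fixed(hidden: Vector, direction: Vector, alpha: float) -> Vector:
    """Fixed steering: add the same multiple of `direction` to every hidden state."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


def steer_contextual(
    hidden: Vector, direction: Vector, strength_fn: Callable[[Vector], float]
) -> Vector:
    """Contextual steering: the strength is computed from the hidden state itself."""
    alpha = strength_fn(hidden)
    return [h + alpha * d for h, d in zip(hidden, direction)]


def make_linear_strength(weights: Vector, bias: float) -> Callable[[Vector], float]:
    """Hypothetical strength function alpha(h) = w . h + b (an assumption, not from the paper)."""
    def strength(hidden: Vector) -> float:
        return dot(weights, hidden) + bias
    return strength


# Toy example: a 3-dimensional hidden state for each of two tokens.
direction = [1.0, 0.0, 0.0]  # steering direction (illustrative values)
strength_fn = make_linear_strength(weights=[0.0, 1.0, 0.0], bias=0.5)
tokens = [[0.0, 2.0, 1.0], [0.0, -0.5, 1.0]]

fixed = [steer_fixed(h, direction, alpha=1.0) for h in tokens]
contextual = [steer_contextual(h, direction, strength_fn) for h in tokens]

print(fixed)       # every token is shifted by the same amount
print(contextual)  # the shift depends on each token's hidden state
```

In this toy setup the fixed variant shifts both tokens by the same amount along `direction`, while the contextual variant shifts the first token strongly and leaves the second unchanged, because the hypothetical strength function evaluates to zero for it.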

alignment efficiency training

Key Terms

activation-steering linear-activation-steering lora contextual-adaptation