You can add adaptive learning to an existing LLM at inference time by updating only its MLP projection layers with a next-token-prediction objective, improving the model on the fly without retraining.
This paper introduces In-Place Test-Time Training, a method that lets language models update their weights during inference to adapt to new information. Instead of retraining the entire model, it efficiently updates just the final projection layers of the MLP blocks using a next-token-prediction objective, enabling models to handle longer contexts and improve performance without expensive retraining.
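The core idea can be illustrated with a toy sketch: freeze everything except the MLP down-projection and take a few gradient steps on next-token prediction over the incoming context. This is a minimal NumPy illustration under assumptions, not the paper's implementation; the one-block "model", tied embedding head, dimensions, and step count are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical; real models are far larger).
V, D, H = 16, 8, 32   # vocab size, model dim, MLP hidden dim

# Parameters of a toy one-block "model".
E      = rng.normal(0, 0.1, (V, D))   # token embeddings (frozen, tied output head)
W_up   = rng.normal(0, 0.1, (D, H))   # MLP up-projection (frozen)
W_down = rng.normal(0, 0.1, (H, D))   # MLP down-projection (the ONLY weights adapted)

def forward(tokens, W_down):
    x = E[tokens]                      # (T, D) embed the context
    h = np.maximum(x @ W_up, 0.0)      # (T, H) ReLU MLP hidden
    y = x + h @ W_down                 # residual stream + down-projection
    return x, h, y @ E.T               # logits via tied output head

def nll_and_grad(tokens, W_down):
    """Next-token NLL over the context and its gradient w.r.t. W_down only."""
    x, h, logits = forward(tokens, W_down)
    inp, tgt = logits[:-1], np.array(tokens[1:])
    z = inp - inp.max(axis=1, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    nll = -np.log(p[np.arange(len(tgt)), tgt]).mean()
    dlogits = p.copy()
    dlogits[np.arange(len(tgt)), tgt] -= 1.0
    dlogits /= len(tgt)
    dy = dlogits @ E                   # backprop through the tied head
    return nll, h[:-1].T @ dy          # gradient w.r.t. W_down only

tokens = rng.integers(0, V, size=64).tolist()   # stand-in for a long context
losses = []
for _ in range(20):                    # a few inference-time gradient steps
    nll, g = nll_and_grad(tokens, W_down)
    losses.append(nll)
    W_down -= 0.5 * g                  # plain SGD on the down-projection
print(losses[-1] < losses[0])          # adaptation reduced the context NLL
```

Only `W_down` changes during the update loop, which is the point: the adaptation cost is a small fraction of a full fine-tune, and the updated weights immediately serve subsequent predictions on the same context.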