You can make test-time adaptation of LLMs practical by combining geometric selection (finding diverse, relevant examples) with gradient reuse on repeated data, which yields better quality with less total runtime.
HullFT speeds up test-time finetuning of language models by using geometry to find relevant training examples and by reusing gradients. Instead of running an expensive selection procedure for every query, it represents each query as a sparse combination of training sequences, then finetunes on the selected examples, reusing gradients on repeated data to save computation.
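
To make the two ideas concrete, here is a minimal sketch in Python with NumPy. It is not HullFT's actual implementation. It assumes that sequences are represented by fixed embedding vectors, that the "sparse combination" can be approximated by a simplex-constrained least-squares fit solved with Frank-Wolfe, and that "gradient reuse" means computing one gradient per unique selected example and scaling it by its repetition count. The function names, the `budget` parameter, and the random stand-in gradients are all hypothetical.

```python
import numpy as np

def sparse_simplex_weights(train_emb, query_emb, n_steps=50):
    """Frank-Wolfe sketch (an assumption, not the paper's solver) for
    min_w ||train_emb.T @ w - query_emb||^2 with w on the probability simplex.
    Each step adds at most one new nonzero weight, so w stays sparse."""
    w = np.zeros(train_emb.shape[0])
    # Start from the single training point nearest to the query.
    w[np.argmin(np.linalg.norm(train_emb - query_emb, axis=1))] = 1.0
    for _ in range(n_steps):
        residual = train_emb.T @ w - query_emb   # reconstruction error, shape (d,)
        grad = 2.0 * train_emb @ residual        # gradient w.r.t. w, shape (n,)
        j = int(np.argmin(grad))                 # simplex vertex to move toward
        direction = -w.copy()
        direction[j] += 1.0                      # e_j - w
        move = train_emb.T @ direction
        denom = move @ move
        if denom < 1e-12:
            break
        # Exact line search for the quadratic objective, clipped to [0, 1].
        step = float(np.clip(-(residual @ move) / denom, 0.0, 1.0))
        if step == 0.0:
            break
        w = w + step * direction
    return w

def repetition_counts(weights, budget):
    """Turn weights into integer repeat counts (hypothetical rounding rule)."""
    counts = np.round(weights * budget).astype(int)
    selected = np.flatnonzero(counts)
    return selected, counts[selected]

def reused_gradient(per_example_grads, counts):
    """Sum of gradients over a batch with repeats, using one gradient per
    unique example scaled by how often that example is repeated."""
    return counts @ per_example_grads

# Toy data: the query lies inside the convex hull of two training embeddings.
rng = np.random.default_rng(0)
train_emb = rng.normal(size=(20, 8))
query_emb = 0.7 * train_emb[3] + 0.3 * train_emb[11]

weights = sparse_simplex_weights(train_emb, query_emb)
selected, counts = repetition_counts(weights, budget=16)

# Stand-in per-example gradients (random; a real model would supply these).
unique_grads = rng.normal(size=(len(selected), 4))
naive = np.repeat(unique_grads, counts, axis=0).sum(axis=0)
reused = reused_gradient(unique_grads, counts)
```

In this toy setup, `naive` materializes every repeated example before summing, while `reused` computes the same quantity from one gradient per unique example. The equivalence only holds when all repeats are evaluated at the same model parameters, which is an assumption of this sketch.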