You can boost reasoning model performance on new domains without labeled data by synthesizing diverse question variants at test time and using hybrid exploration to balance accuracy with consistency across variants.
TTVS lets reasoning models improve themselves at test time by generating diverse variants of unlabeled questions and learning from them. Instead of relying on expensive labeled data, the system trains on these synthetic variants and uses exploration strategies that push the model to learn the underlying problem logic rather than memorize surface patterns.
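
To make the idea of balancing accuracy with consistency across variants concrete, here is a minimal toy sketch. It is not the TTVS algorithm itself: the function names (`majority_answer`, `hybrid_rewards`), the `weight` parameter, and the use of majority voting as a stand-in for missing labels are all assumptions made for illustration. Each sampled answer is scored partly by whether it agrees with the majority answer for its own variant (a label-free proxy for accuracy) and partly by whether it agrees with the majority answer pooled over all variants (consistency).

```python
from collections import Counter
from typing import Dict, List


def majority_answer(answers: List[str]) -> str:
    """Return the most common answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


def hybrid_rewards(
    samples: Dict[str, List[str]], weight: float = 0.5
) -> Dict[str, List[float]]:
    """Score sampled answers without ground-truth labels (hypothetical scheme).

    `samples` maps each question variant (including the original question)
    to the answers a model sampled for it. `weight` is an assumed mixing
    coefficient between the per-variant and cross-variant agreement terms.
    """
    # Pseudo-label shared by every variant: majority vote over all samples.
    pooled = [a for answers in samples.values() for a in answers]
    global_label = majority_answer(pooled)

    rewards: Dict[str, List[float]] = {}
    for variant, answers in samples.items():
        # Pseudo-label for this variant alone.
        local_label = majority_answer(answers)
        rewards[variant] = [
            weight * float(a == local_label)
            + (1.0 - weight) * float(a == global_label)
            for a in answers
        ]
    return rewards


# Made-up answers for one question and two synthetic variants of it.
samples = {
    "original": ["12", "12", "15"],
    "variant_1": ["12", "15", "15"],
    "variant_2": ["12", "12", "12"],
}
rewards = hybrid_rewards(samples, weight=0.5)
```

In this example `variant_1` disagrees locally with the pooled majority, so none of its answers earns full reward. That is the kind of signal that discourages a model from latching onto the surface form of a single phrasing.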