Models fail at evidence-grounded reasoning not because they lack capacity, but because training data doesn't explicitly teach them that evidence should causally determine their decisions. Constructing supervision that includes controlled negative examples fixes this.
This paper tackles a persistent problem in AI: models often make predictions without actually relying on the evidence they are given. The authors introduce a framework that teaches models to genuinely depend on evidence by constructing training examples that show both when evidence supports a claim and when it does not, including hard negatives where the evidence appears relevant but actually contradicts the claim.
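To make the construction concrete, here is a minimal sketch of what such a supervision pipeline might look like. All names here (`Example`, `build_contrastive_examples`, `perturb_to_contradict`) are hypothetical illustrations rather than the paper's actual code, and the string-replacement perturbation is a deliberately crude stand-in for whatever controlled contradiction procedure the authors use.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Example:
    claim: str
    evidence: str
    label: str  # "supports", "irrelevant", or "contradicts"

def perturb_to_contradict(evidence: str) -> str:
    """Hypothetical stand-in for a real perturbation step: flip one factual
    detail so the passage stays on-topic but no longer supports the claim.
    A real pipeline might negate a fact, swap an entity, or alter a number."""
    return evidence.replace(" is ", " is not ", 1)

def build_contrastive_examples(
    pairs: List[Tuple[str, str]],  # (claim, supporting_evidence) pairs
) -> List[Example]:
    """For each positive pair, also emit two controlled negatives:
    an easy one (evidence borrowed from an unrelated claim) and a hard one
    (on-topic evidence perturbed so that it contradicts the claim)."""
    examples = []
    for i, (claim, evidence) in enumerate(pairs):
        # Positive: the evidence genuinely supports the claim.
        examples.append(Example(claim, evidence, "supports"))

        # Easy negative: evidence from a different claim, so a model
        # that ignores evidence cannot tell it apart from the positive.
        _, other_evidence = pairs[(i + 1) % len(pairs)]
        examples.append(Example(claim, other_evidence, "irrelevant"))

        # Hard negative: evidence that looks relevant but contradicts the
        # claim, forcing the model to read it rather than pattern-match.
        examples.append(Example(claim, perturb_to_contradict(evidence),
                                "contradicts"))
    return examples

if __name__ == "__main__":
    dataset = build_contrastive_examples([
        ("The Eiffel Tower is in Paris.",
         "The Eiffel Tower is a landmark in Paris, France."),
        ("The Amazon is the largest rainforest.",
         "The Amazon is the world's largest tropical rainforest."),
    ])
    for ex in dataset:
        print(f"{ex.label:>11} | {ex.claim} <- {ex.evidence}")
```

In this sketch, every claim appears with all three evidence types, so the claim text alone carries no label signal; the only way to separate the labels is to actually condition on the evidence, which is the causal dependence the training objective is meant to instill.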