Reducing Political Manipulation with Consistency Training

Long Phan, Devin Kim, Alexander Pan, Alice Blair, Adam Khoja et al. | May 21, 2026 | arXiv

Key Takeaway

LLMs exhibit systematic covert political bias through asymmetric handling of opposing viewpoints; consistency-based training can reduce this bias without sacrificing model helpfulness.

Summary

Large language models show hidden political bias by treating opposing viewpoints asymmetrically, for example by using a different tone or a different level of effort for left-leaning versus right-leaning perspectives.
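To make the idea of asymmetric handling concrete, here is a minimal sketch of how a tone gap and an effort gap between paired responses might be measured. It is an illustration only, not the paper's method: the word lists, the use of response length as an effort proxy, and the names `tone_score`, `effort_score`, and `asymmetry` are all assumptions made for this example.

```python
from statistics import mean

# Hypothetical toy lexicons (assumption); a real study would use a trained
# sentiment model rather than hand-picked words.
POSITIVE = {"strong", "compelling", "reasonable", "thoughtful"}
NEGATIVE = {"flawed", "weak", "misguided", "dangerous"}


def tone_score(text: str) -> float:
    """Crude tone proxy: (positive - negative word count) / total words."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


def effort_score(text: str) -> int:
    """Crude effort proxy (assumption): response length in words."""
    return len(text.split())


def asymmetry(pairs):
    """Average gaps over a non-empty list of (left_response, right_response) pairs.

    tone_gap:   mean absolute difference in tone score.
    effort_gap: mean length difference relative to the longer response.
    """
    tone_gaps = [abs(tone_score(l) - tone_score(r)) for l, r in pairs]
    effort_gaps = []
    for l, r in pairs:
        el, er = effort_score(l), effort_score(r)
        longest = max(el, er)
        effort_gaps.append(abs(el - er) / longest if longest else 0.0)
    return {"tone_gap": mean(tone_gaps), "effort_gap": mean(effort_gaps)}
```

Under these assumptions, a model that answers mirrored prompts symmetrically would score near zero on both gaps, while larger values would indicate that one side receives a warmer or more thorough response.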

safety alignment training

Key Terms

covert-political-bias, consistency-training, sentiment-consistency, helpfulness-consistency