Statistically significant findings from keyword-based text analysis can be entirely artifacts of the measurement method rather than real phenomena. Always validate keyword-based results with semantic approaches before drawing conclusions about speaker psychology or discourse patterns.
This paper reveals how keyword-based measurement tools can produce false findings in computational social science. By comparing keyword counting with LLM-based semantic analysis of interviews, the authors show that a strong statistical correlation between negative affect and certainty disappears, and even reverses, when the two constructs are measured more accurately.
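The comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the word lists, the four example sentences, and the per-text semantic scores are all invented for the demonstration, and the hand-assigned scores merely stand in for ratings that an LLM-based annotator would produce. The function names (`keyword_rate`, `pearson`, `compare_measurements`) are hypothetical.

```python
import math
import re

# Hypothetical mini-lexicons for illustration only; not the paper's word lists.
NEGATIVE_WORDS = {"terrible", "worst", "bad", "awful"}
CERTAINTY_WORDS = {"definitely", "absolutely", "certainly", "always"}


def keyword_rate(text, lexicon):
    """Return the fraction of tokens in `text` that appear in `lexicon`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return sum(token in lexicon for token in tokens) / len(tokens)


def pearson(xs, ys):
    """Return the Pearson correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def compare_measurements(texts, semantic_neg, semantic_cert):
    """Correlate negative affect with certainty under both measurement methods."""
    kw_neg = [keyword_rate(t, NEGATIVE_WORDS) for t in texts]
    kw_cert = [keyword_rate(t, CERTAINTY_WORDS) for t in texts]
    return {
        "keyword_r": pearson(kw_neg, kw_cert),
        "semantic_r": pearson(semantic_neg, semantic_cert),
    }


# Toy texts, invented for this sketch. Negation and lexicon gaps make the
# keyword counts diverge from what the sentences actually express.
TEXTS = [
    "it was definitely not terrible",
    "i guess things got worse somehow, hard to say",
    "absolutely not the worst, certainly not bad",
    "maybe i felt kind of sad, i don't know",
]

# Hand-assigned scores in [0, 1], standing in for LLM-based semantic ratings.
SEMANTIC_NEG = [0.1, 0.8, 0.1, 0.7]
SEMANTIC_CERT = [0.9, 0.2, 0.9, 0.2]

result = compare_measurements(TEXTS, SEMANTIC_NEG, SEMANTIC_CERT)
print(result)
```

On this toy data the keyword-based correlation is strongly positive while the correlation between the stand-in semantic scores is negative, mirroring the kind of reversal the paper reports. The example shows only the shape of the validation step; a real check would use the actual lexicons, a real semantic annotator, and enough texts for a meaningful significance test.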