LLMs encode rhetorical questions through multiple independent linear directions rather than a single shared representation, meaning the same rhetorical concept can be detected reliably but is encoded along different directions depending on context.
This study uses linear probes to examine how large language models internally represent rhetorical questions, that is, questions asked to persuade rather than to seek information. The researchers found that rhetorical signals emerge in early layers, are best captured by final-token representations, and transfer across datasets with 70-80% detection accuracy.
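To make the probing setup concrete, the sketch below trains a logistic-regression probe on final-token hidden states from a single layer of a small causal LM. The model name ("gpt2"), the toy sentences, the layer index, and the cross-set evaluation are illustrative assumptions, not the study's actual models, datasets, or protocol.

```python
# Minimal sketch of a linear probe on final-token hidden states.
# Assumptions: "gpt2" as a stand-in model, toy labeled questions, layer 6;
# none of these come from the study itself.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # hypothetical stand-in, not the paper's model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)
model.eval()

# Toy "dataset A" (train) and "dataset B" (cross-dataset eval).
# Label 1 = rhetorical question, 0 = information-seeking question.
train_texts = [
    "Who do you think you are?",                # rhetorical
    "Do you really expect me to believe that?", # rhetorical
    "What time does the meeting start?",        # information-seeking
    "Where is the nearest train station?",      # information-seeking
]
train_labels = [1, 1, 0, 0]
eval_texts = [
    "How many times do I have to tell you?",    # rhetorical
    "Which platform does the train leave from?",# information-seeking
]
eval_labels = [1, 0]

def final_token_reps(texts, layer):
    """Return the hidden state of the last token at the given layer for each text."""
    reps = []
    with torch.no_grad():
        for text in texts:
            inputs = tokenizer(text, return_tensors="pt")
            hidden = model(**inputs).hidden_states[layer]  # shape (1, seq_len, d_model)
            reps.append(hidden[0, -1].numpy())             # final-token representation
    return reps

layer = 6  # an early-to-middle layer, motivated by the "early layers" finding
X_train = final_token_reps(train_texts, layer)
X_eval = final_token_reps(eval_texts, layer)

# Linear probe: logistic regression trained on one set, scored on the other,
# mirroring (in miniature) the cross-dataset detection evaluation.
probe = LogisticRegression(max_iter=1000).fit(X_train, train_labels)
print("cross-set probe accuracy:", probe.score(X_eval, eval_labels))
```

With a handful of toy sentences the reported accuracy is not meaningful; the point is only the shape of the method: extract the final-token activation at a chosen layer, fit a linear classifier, and test it on held-out or out-of-distribution examples.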