For complex reasoning tasks like humor, supervising the intermediate thinking process with structured traces outperforms scaling alone: models need to learn *why* something is funny, not just predict captions.
This paper teaches AI models to understand humor like professional cartoonists by breaking down the reasoning process into three steps: spotting visual mismatches, reinterpreting them creatively, and judging which interpretations are funniest.
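
To make the three steps concrete, here is a minimal sketch of what one structured reasoning trace might look like as a data structure. This is not the paper's actual format: the class names `HumorTrace` and `Interpretation`, the field names, the numeric `funniness` score, and the example cartoon are all hypothetical, chosen only to illustrate the mismatch, reinterpretation, and judgment stages.

```python
from dataclasses import dataclass, field


@dataclass
class Interpretation:
    """One creative reinterpretation of a visual mismatch (hypothetical schema)."""
    reading: str      # how the mismatch is reinterpreted
    funniness: float  # judged score; the 0-1 scale is an assumption


@dataclass
class HumorTrace:
    """Illustrative three-step trace; all field names are assumptions."""
    mismatches: list[str] = field(default_factory=list)                  # step 1: spot
    interpretations: list[Interpretation] = field(default_factory=list)  # step 2: reinterpret

    def funniest(self) -> Interpretation:
        """Step 3: return the interpretation judged funniest."""
        if not self.interpretations:
            raise ValueError("no interpretations to rank")
        return max(self.interpretations, key=lambda i: i.funniness)


# Made-up example data, for illustration only.
trace = HumorTrace(
    mismatches=["a fish sitting at an office desk"],
    interpretations=[
        Interpretation("the fish is out of its depth at work", 0.8),
        Interpretation("the fish is just a wet employee", 0.3),
    ],
)
best = trace.funniest()
print(best.reading)
```

Keeping the steps as separate fields means each intermediate stage can be inspected or supervised on its own, rather than only the final caption.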