Evaluating harmful-content detection takes more than surface-level classification: models need to explain their decisions and recognize implicit harms, not just flag obvious ones.
HarmVideoBench is a benchmark for evaluating how well AI models understand harmful content in videos. Unlike existing tests that only ask yes/no questions, it pairs 1,379 videos with 4,137 multiple-choice questions across three difficulty levels, ranging from spotting obvious harmful elements to reasoning about context beyond what is shown.
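
The counts above are consistent with three questions per video (1,379 × 3 = 4,137), which suggests, though the text does not state it, one question per difficulty level for each video. Under that assumption, a per-level accuracy breakdown could be computed roughly as in the sketch below. The `Question` record, the level numbering 1 to 3, the option letters, and the toy data are all hypothetical and are not taken from the benchmark's actual data format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    """Hypothetical record layout; the real benchmark schema may differ."""
    video_id: str
    level: int   # assumed numbering: 1 = easiest, 3 = hardest
    answer: str  # gold option letter, e.g. "B"


def accuracy_by_level(questions, predictions):
    """Return {level: accuracy}.

    `predictions` maps (video_id, level) -> predicted option letter.
    A missing prediction is counted as incorrect.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for q in questions:
        total[q.level] += 1
        if predictions.get((q.video_id, q.level)) == q.answer:
            correct[q.level] += 1
    return {level: correct[level] / total[level] for level in sorted(total)}


# Toy data invented for illustration only (two videos, three levels each).
questions = [
    Question("v1", 1, "A"), Question("v1", 2, "C"), Question("v1", 3, "B"),
    Question("v2", 1, "D"), Question("v2", 2, "A"), Question("v2", 3, "C"),
]
predictions = {
    ("v1", 1): "A", ("v1", 2): "C", ("v1", 3): "D",
    ("v2", 1): "D", ("v2", 2): "B", ("v2", 3): "A",
}

scores = accuracy_by_level(questions, predictions)
print(scores)
```

Reporting accuracy per level rather than as a single number keeps the gap between spotting obvious harms and reasoning about unseen context visible, which a pooled score would hide.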