The Character Error Vector (CEV) lets you diagnose whether OCR errors stem from layout parsing or from character recognition itself, so teams can focus improvements where they will most affect document extraction quality.
This paper introduces the Character Error Vector (CEV), a new metric for evaluating OCR quality that decomposes errors into parsing, OCR, and interaction components. Unlike the traditional Character Error Rate (CER), CEV remains applicable even when text layout parsing fails, making it practical for real-world document images with complex layouts.
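
For context, the traditional Character Error Rate that CEV is contrasted with is conventionally the character-level edit distance between the reference and the OCR output, divided by the reference length. The sketch below implements only that baseline; it is not the paper's CEV, whose component formulas are not given in this summary. The function names `levenshtein` and `character_error_rate` are illustrative, not taken from the paper.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions, and substitutions
    needed to turn `hyp` into `ref` (standard dynamic-programming form)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Baseline CER: edit distance normalised by reference length."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return levenshtein(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One misrecognised character ("o" read as "0") in a 7-character word.
    print(character_error_rate("invoice", "inv0ice"))

    # Every character is recognised correctly, but the two lines come out
    # in the wrong reading order, as can happen when layout parsing fails.
    reference = "Name: Ada\nTotal: 42"
    misordered = "Total: 42\nName: Ada"
    print(character_error_rate(reference, misordered))
```

The second call shows why a single scalar is hard to interpret: the score is non-zero even though no character was misread, so CER alone cannot say whether the fault lies in parsing or in recognition.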