This paper introduces LogicEval, a framework for evaluating how well automated repair tools, including LLM-based ones, fix logical vulnerabilities in real software. To support the evaluation, the authors built LogicDS, a dataset of 86 real-world security bugs with assigned CVE identifiers. Their results show that LLM-based repair outperforms traditional approaches on logical vulnerabilities, but it still fails frequently, chiefly because of sensitivity to how the repair prompt is phrased and difficulty capturing the full code context surrounding the bug.