Knowledge distillation from reasoning-optimized models, combined with response-stabilization techniques, can make compact open-source models practical for cross-language code clone detection, improving both reliability and inference speed.
This paper shows how to make smaller, open-source AI models better at detecting when two pieces of code written in different programming languages do the same thing. The researchers use knowledge distillation, in which a smaller model learns from a larger reasoning-focused model, combined with techniques that make the smaller model's outputs more reliable and consistent.
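To make these two ideas concrete, here is a minimal sketch of how they are commonly implemented. The paper does not specify its exact formulation, so this is an illustration under standard assumptions: the distillation loss is the classic temperature-scaled soft-target cross-entropy between teacher and student output distributions, and "stabilization" is sketched as simple majority voting over multiple sampled responses. All function names (`softmax`, `distill_loss`, `majority_vote`) are hypothetical, not from the paper.

```python
import math
from collections import Counter

def softmax(logits, temperature=1.0):
    """Convert raw logits to a probability distribution at a given temperature.

    Higher temperature flattens the distribution, exposing the teacher's
    'dark knowledge' about near-miss classes.
    """
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target distillation loss: cross-entropy between the teacher's
    softened distribution and the student's, scaled by T^2 so gradients
    keep a consistent magnitude as temperature changes (a standard choice).
    """
    p_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    return -temperature ** 2 * sum(
        p * math.log(q) for p, q in zip(p_teacher, q_student)
    )

def majority_vote(responses):
    """One simple response-stabilization scheme: sample the model several
    times and return the most frequent answer, damping run-to-run variance.
    """
    return Counter(responses).most_common(1)[0][0]

# When student and teacher agree, the loss is at its minimum (the teacher's
# entropy); a mismatched student incurs a strictly higher loss.
teacher = [4.0, 1.0, 0.5]   # e.g. logits over {clone, not-clone, unsure}
matched = distill_loss([4.0, 1.0, 0.5], teacher)
mismatched = distill_loss([0.5, 4.0, 1.0], teacher)
print(matched < mismatched)

# Voting over repeated (possibly inconsistent) model outputs:
print(majority_vote(["clone", "not-clone", "clone", "clone"]))
```

In practice the distillation term is usually mixed with a standard hard-label loss, and reasoning-model teachers often supply rationales as well as labels; both are variations on the same soft-target idea sketched here.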