Knowledge distillation from reasoning-optimized models, combined with response-stabilization techniques, can make compact open-source models practical for cross-language code clone detection, improving both reliability and inference speed.
This paper shows how to make smaller, open-source AI models better at detecting when two pieces of code written in different programming languages do the same thing. The researchers use knowledge distillation, in which a smaller model learns from a larger reasoning-focused model, combined with techniques that make the smaller model's outputs more reliable and consistent.
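To make these two ideas concrete, here is a minimal sketch of how they are commonly implemented. The paper does not specify its exact formulation, so this is an illustration under standard assumptions: the distillation loss is the classic temperature-scaled soft-target cross-entropy between teacher and student output distributions, and "stabilization" is sketched as simple majority voting over multiple sampled responses. All function names (`softmax`, `distill_loss`, `majority_vote`) are hypothetical, not from the paper.

```python
import math
from collections import Counter

def softmax(logits, temperature=1.0):
    """Convert raw logits to a probability distribution at a given temperature.

    Higher temperature flattens the distribution, exposing the teacher's
    'dark knowledge' about near-miss classes.
    """
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target distillation loss: cross-entropy between the teacher's
    softened distribution and the student's, scaled by T^2 so gradients
    keep a consistent magnitude as temperature changes (a standard choice).
    """
    p_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    return -temperature ** 2 * sum(
        p * math.log(q) for p, q in zip(p_teacher, q_student)
    )

def majority_vote(responses):
    """One simple response-stabilization scheme: sample the model several
    times and return the most frequent answer, damping run-to-run variance.
    """
    return Counter(responses).most_common(1)[0][0]

# When student and teacher agree, the loss is at its minimum (the teacher's
# entropy); a mismatched student incurs a strictly higher loss.
teacher = [4.0, 1.0, 0.5]   # e.g. logits over {clone, not-clone, unsure}
matched = distill_loss([4.0, 1.0, 0.5], teacher)
mismatched = distill_loss([0.5, 4.0, 1.0], teacher)
print(matched < mismatched)

# Voting over repeated (possibly inconsistent) model outputs:
print(majority_vote(["clone", "not-clone", "clone", "clone"]))
```

In practice the distillation term is usually mixed with a standard hard-label loss, and reasoning-model teachers often supply rationales as well as labels; both are variations on the same soft-target idea sketched here.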