Uncertainty-aware, multi-stage knowledge distillation with spatial attention guidance lets you compress state-of-the-art vision transformers into small, CPU-deployable models for real-time fire detection without sacrificing accuracy.
HumP-KD is a knowledge distillation framework that compresses larger transformer teacher models (Swin-Tiny, ViT-Base) into a lightweight MobileViT-S student for real-time fire detection.
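
To make the idea of uncertainty-aware distillation from two teachers concrete, here is a minimal sketch in plain Python. It is not the HumP-KD implementation: the entropy-based confidence weighting, the function name `uncertainty_weighted_kd_loss`, the temperature value, and the two-class (fire / no fire) logits are all assumptions chosen for illustration. Each teacher's softened prediction is weighted by how confident it is (one minus its normalized entropy), and the student is penalized by the weighted KL divergence to those predictions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def uncertainty_weighted_kd_loss(student_logits, teacher_logits_list, temperature=2.0):
    """Hypothetical uncertainty-aware KD loss (an assumption, not the paper's loss).

    Returns (loss, weights), where weights are the per-teacher confidence
    weights and loss is the weighted sum of KL(teacher || student) on
    temperature-softened distributions, scaled by temperature squared.
    """
    student_probs = softmax(student_logits, temperature)
    teacher_probs = [softmax(t, temperature) for t in teacher_logits_list]

    # Confidence = 1 - normalized entropy, clamped at 0 against rounding error.
    max_entropy = math.log(len(student_logits))
    confidences = [max(0.0, 1.0 - entropy(p) / max_entropy) for p in teacher_probs]

    total = sum(confidences)
    if total > 0.0:
        weights = [c / total for c in confidences]
    else:
        # Every teacher is maximally uncertain: fall back to equal weights.
        weights = [1.0 / len(teacher_probs)] * len(teacher_probs)

    loss = sum(
        w * kl_divergence(p, student_probs)
        for w, p in zip(weights, teacher_probs)
    )
    return (temperature ** 2) * loss, weights


if __name__ == "__main__":
    # Illustrative logits only: [fire, no_fire] for a student and two teachers.
    loss, weights = uncertainty_weighted_kd_loss(
        student_logits=[0.5, 0.2],
        teacher_logits_list=[[4.0, -4.0], [0.1, 0.0]],
    )
    print(f"loss={loss:.4f}, weights={[round(w, 3) for w in weights]}")
```

In this sketch a teacher that is nearly undecided on an image contributes little to the student's training signal, while a confident teacher dominates. A real pipeline would typically combine such a term with a supervised loss on ground-truth labels and, for spatial attention guidance, an additional term that aligns student and teacher attention maps; neither is shown here.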