Do Vision-Language Models Truly Perform Vision Reasoning? A Rigorous Study of the Modality Gap — ThinkLLM