Vision-Language Encoder — Glossary — ThinkLLM