The core neural network component that processes and understands images before passing information to the rest of the model.
Quality of vision, audio, and image understanding (distinct from modality support)