Image Tokenization

architecture

The process of converting an image into a sequence of discrete tokens (small units, typically each standing for one region of the image) that a language model can process in much the same way it processes text tokens.
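
As an illustration only, the sketch below shows one way such a conversion can work: the image is cut into fixed-size patches and each patch is replaced by the index of its nearest entry in a codebook (vector quantization). The definition above does not name a method, so this choice is an assumption. The names split_into_patches, nearest_code, and tokenize_image, and the two-entry codebook, are hypothetical and hand-written for the example. Real systems generally learn the codebook and pass patches through a learned encoder before the lookup.

```python
from typing import List, Sequence

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def split_into_patches(image: Image, patch_size: int) -> List[List[float]]:
    """Cut the image into non-overlapping square patches (row-major order)
    and flatten each patch into a vector of pixel values."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[r][c]
                for r in range(top, top + patch_size)
                for c in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


def nearest_code(vector: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codebook entry closest to the vector
    (squared Euclidean distance). That index is the discrete token."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def tokenize_image(image: Image, codebook: Sequence[Sequence[float]],
                   patch_size: int) -> List[int]:
    """Map an image to a sequence of token ids, one per patch."""
    return [nearest_code(p, codebook) for p in split_into_patches(image, patch_size)]


# Toy 4x4 image made of four 2x2 blocks: dark, bright, bright, dark.
image = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]

# Hypothetical hand-written codebook; in practice this would be learned.
codebook = [
    [0.0, 0.0, 0.0, 0.0],  # token 0: dark patch
    [1.0, 1.0, 1.0, 1.0],  # token 1: bright patch
]

tokens = tokenize_image(image, codebook, patch_size=2)
print(tokens)  # [0, 1, 1, 0]
```

The resulting list of integers plays the same role as a list of text token ids: it is a sequence drawn from a fixed vocabulary (here, the codebook indices) that a model can consume position by position.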

Related Capabilities

Quality of vision, audio, and image understanding (distinct from modality support)