MinerU2.5 Pro 2604 1.2B

Open Weight: model weights are publicly available and can be downloaded and self-hosted.

Released April 2026 | 33K context (≈ 24,576 words) | 1.2B params

A compact 1.2B-parameter model from opendatalab that accepts both text and image inputs, which suggests multimodal document understanding. Its name hints at document parsing or data extraction, though capability details beyond multimodal input support are limited. It has a 32K-token context window and is openly available under Apache 2.0.
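The "33K context" and "≈ 24,576 words" figures in the header can be reconciled with the 32K-token window by a little arithmetic. The sketch below assumes that "32K" means 32,768 tokens and that the word estimate uses about 0.75 words per token, a common rule of thumb for English text. Neither assumption is stated on this page, and the function name is purely illustrative.

```python
# Assumption: roughly 0.75 English words per token (a rule of thumb,
# not a documented property of this model or its tokenizer).
WORDS_PER_TOKEN = 0.75


def estimate_words(context_tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Approximate how many English words fit in a context window."""
    return round(context_tokens * words_per_token)


# Assumption: "32K" is read as 32 * 1024 = 32,768 tokens.
context_tokens = 32 * 1024

print(round(context_tokens / 1000))    # 33, matching the "33K context" badge
print(estimate_words(context_tokens))  # 24576, matching "≈ 24,576 words"
```

The real ratio depends on the tokenizer and on the language and layout of the input, so the word count is best treated as a rough guide rather than a limit.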

Glossary

Context Window: The maximum number of tokens a model can process in a single conversation or prompt.

Document Parsing: The process of automatically reading and extracting structured information like text, tables, and layout from documents.

Document Understanding: The ability to read and extract meaningful information from structured documents like receipts, invoices, and forms by recognizing both text and layout.

Multimodal: A model that can process and understand multiple types of input, such as both text and images.

Multimodal Input: The ability to accept and process multiple types of input data simultaneously, such as both images and text in the same request.

Parameter Model: A neural network described by the number of learnable weights it contains; more parameters generally mean greater capacity to learn complex patterns, but also require more computational resources.

Token: A small unit of text (a word, subword, or punctuation mark) that a language model breaks input into for processing.
