Zhipu AI drops 0.9B GLM-OCR for complex document parsing
Zhipu AI has released GLM-OCR, a lightweight 0.9B-parameter multimodal model purpose-built for high-efficiency document parsing. Built on the GLM-V/4V framework, it pairs a CogViT visual encoder with a GLM decoder and uses multi-token prediction to achieve roughly 50% higher throughput than standard autoregressive decoding. The model excels at extracting structured Markdown, JSON, and LaTeX from complex tables, mathematical formulas, and handwriting, even in messy real-world scans with stamps or poor lighting.
Zhipu AI is proving that you don't need 70B parameters to solve complex document understanding, delivering a specialized 0.9B model that punches way above its weight class.
- Multi-token prediction (MTP) boosts decoding throughput by ~50% over standard autoregressive models
- Native support for LaTeX and structured JSON makes it a drop-in replacement for expensive proprietary parsing APIs
- Small enough to run on consumer hardware (0.9B parameters) while maintaining SOTA performance on OmniDocBench
- Specialized robustness for "real-world" messy scans, including stamps, seals, and rotated text
- Seamless integration with Ollama and vLLM ensures immediate developer accessibility for local edge deployment
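Because vLLM exposes an OpenAI-compatible chat endpoint, querying a locally served GLM-OCR comes down to posting a standard vision-chat payload. A minimal sketch of building such a request in Python; the model id (`glm-ocr`), prompt wording, and temperature choice are illustrative assumptions, not confirmed defaults:

```python
import base64
import json

def build_ocr_request(image_bytes: bytes, model: str = "glm-ocr") -> dict:
    """Construct an OpenAI-style chat payload asking the model to emit
    structured Markdown with LaTeX formulas.

    The model id and prompt text are assumptions; vLLM serves whatever
    id the model was launched under.
    """
    # Images travel as base64 data URIs inside an image_url content part.
    data_uri = "data:image/png;base64," + base64.b64encode(image_bytes).decode()
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": data_uri}},
                    {
                        "type": "text",
                        "text": (
                            "Extract this document as structured Markdown; "
                            "render formulas as LaTeX and tables as Markdown tables."
                        ),
                    },
                ],
            }
        ],
        "temperature": 0.0,  # deterministic decoding suits parsing tasks
    }

if __name__ == "__main__":
    # In practice this payload would be POSTed to the vLLM server,
    # e.g. http://localhost:8000/v1/chat/completions.
    payload = build_ocr_request(b"\x89PNG-placeholder-bytes")
    print(json.dumps(payload)[:80])
```

The same payload shape works against any OpenAI-compatible endpoint, so switching between a local vLLM instance and a hosted API is a matter of changing the base URL.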
DISCOVERED
2026-03-17
PUBLISHED
2026-03-17
AUTHOR
AI Revolution