OPEN_SOURCE
REDDIT · 7h ago · RESEARCH PAPER
DharmaOCR tops OCR bench, cuts cost
DharmaOCR Full and Lite are 7B and 3B structured-OCR models from Dharma-AI built with SFT plus DPO. The paper says they beat commercial OCR systems and open-source baselines on a new benchmark while reducing degeneration and per-page inference cost.
// ANALYSIS
This is a strong reminder that specialization can beat bigger general-purpose models when the task has a rigid output format and a measurable failure mode.
- DPO here is not just alignment theater; using degenerate generations as rejected samples directly targets the looping and runaway outputs that hurt OCR pipelines.
- The reported scores, 0.925 for the 7B model and 0.911 for the 3B model, are impressive, but they are still benchmark-specific, so the claim is strongest for structured OCR rather than broad document understanding.
- AWQ cutting per-page cost by about 22% with negligible quality loss is the part that matters operationally, because OCR workloads are usually judged on throughput and unit economics as much as accuracy.
- The comparison set is broad, spanning commercial APIs and open-source OCR stacks, which makes the result more interesting than a narrow internal eval.
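The DPO angle above can be sketched concretely. A minimal, hypothetical pipeline for building preference pairs would flag looping outputs with a simple repeated-n-gram heuristic and pair each clean generation (chosen) with a degenerate one (rejected); the function names and thresholds here are illustrative, not from the paper.

```python
def is_degenerate(text: str, ngram: int = 4, max_repeats: int = 3) -> bool:
    """Flag runaway outputs: any n-gram repeated more than max_repeats times.
    A crude stand-in for whatever degeneration detector the authors used."""
    tokens = text.split()
    counts = {}
    for i in range(len(tokens) - ngram + 1):
        key = tuple(tokens[i:i + ngram])
        counts[key] = counts.get(key, 0) + 1
        if counts[key] > max_repeats:
            return True
    return False

def build_dpo_pairs(prompts, generations):
    """For each prompt, pair one clean generation (chosen) with one
    degenerate generation (rejected); skip prompts lacking either kind."""
    pairs = []
    for prompt, gens in zip(prompts, generations):
        clean = [g for g in gens if not is_degenerate(g)]
        degen = [g for g in gens if is_degenerate(g)]
        if clean and degen:
            pairs.append({"prompt": prompt,
                          "chosen": clean[0],
                          "rejected": degen[0]})
    return pairs
```

Pairs in this shape feed directly into standard DPO trainers, which is what makes the "degenerate outputs as rejected samples" trick cheap to implement.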
// TAGS
dharmaocr · llm · fine-tuning · open-source · benchmark · inference
DISCOVERED
7h ago
2026-04-17
PUBLISHED
8h ago
2026-04-17
RELEVANCE
9/10
AUTHOR
Flat_Divide9839