ColQwen3.5-v2 tops ViDoRe with leaner training
ColQwen3.5-v2 is a 4.5B-parameter visual document retrieval model built on Qwen3.5-4B and released under Apache 2.0 on Hugging Face. It reports a state-of-the-art ViDoRe V3 nDCG@10 of 0.6177, with strong gains from a simplified two-phase training pipeline plus model souping with v1.
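For readers unfamiliar with the headline metric, nDCG@10 can be sketched as below. This is a minimal illustration, not the ViDoRe evaluation code: the function names are hypothetical, it assumes the linear-gain form of DCG, and it normalizes against only the relevance labels passed in, whereas a full evaluation normalizes against all judged documents for the query.

```python
import math

def dcg_at_k(relevances, k=10):
    # Discounted cumulative gain: each hit is discounted by log2 of its 1-based rank + 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    # Normalize by the DCG of the best possible ordering of the same labels (assumption:
    # the list passed in contains every judged document for the query).
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Relevance labels of the retrieved pages, in ranked order (illustrative values only).
perfect = ndcg_at_k([1, 0, 0])
second_place = ndcg_at_k([0, 1])
```

A benchmark score such as 0.6177 would be the mean of this per-query value across all queries.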
This is a meaningful retrieval-model iteration: less training complexity, slightly better benchmark outcomes, and a clearer recipe others can reproduce.
- The v2 recipe cuts training phases from 4 to 2 while still improving top-line metrics.
- Including domain-heavy data (finance and tables) from the start appears to improve real-world document coverage.
- The 55/45 soup with v1 suggests practical gains can come from checkpoint engineering, not just bigger base models.
- Apache 2.0 licensing and published weights make it immediately usable in open retrieval stacks.
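The 55/45 soup mentioned above is, in its simplest form, a weighted average of two checkpoints' parameters. The sketch below shows that idea with plain Python lists standing in for tensors. It is an assumption-laden illustration, not the author's merging script: the `soup` helper and the parameter names are hypothetical, and the source does not say which checkpoint receives the 0.55 weight, so that assignment is a guess.

```python
def soup(state_a, state_b, weight_a=0.55):
    """Linearly interpolate two checkpoints that share parameter names and shapes."""
    if state_a.keys() != state_b.keys():
        raise ValueError("checkpoints must share the same parameter names")
    weight_b = 1.0 - weight_a
    merged = {}
    for name, values_a in state_a.items():
        values_b = state_b[name]
        if len(values_a) != len(values_b):
            raise ValueError(f"shape mismatch for {name}")
        # Element-wise weighted average of the two parameter vectors.
        merged[name] = [weight_a * a + weight_b * b for a, b in zip(values_a, values_b)]
    return merged

# Toy stand-ins for real state dicts (hypothetical names and values).
v2_checkpoint = {"proj.weight": [1.0, 2.0], "proj.bias": [0.0]}
v1_checkpoint = {"proj.weight": [3.0, 4.0], "proj.bias": [1.0]}

# Assumption: v2 gets 0.55 and v1 gets 0.45; the source only states the 55/45 ratio.
souped = soup(v2_checkpoint, v1_checkpoint, weight_a=0.55)
```

With real models the same loop would run over framework tensors, and it only makes sense when both checkpoints descend from the same base architecture, as v1 and v2 are described as doing here.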
DISCOVERED
2026-03-14
PUBLISHED
2026-03-13
AUTHOR
madkimchi