OPEN_SOURCE
REDDIT // 10d ago · MODEL RELEASE
acervo-extractor-qwen3.5-9b lands Q4_K_M GGUF build
This release packages a fine-tuned Qwen3.5 9B extraction model into GGUF, with Q4_K_M quantization aimed at making structured document extraction practical on local hardware. It also includes a Q8_0 option for users who want a bit more fidelity, alongside benchmark data showing lower footprint with modest quality loss.
// ANALYSIS
Hot take: this is a pragmatic infra-first model release, not a flashy benchmark win, and that’s exactly the point.
- Q4_K_M cuts the model down to a size that is actually deployable on constrained machines without giving up the structured-extraction specialization.
- The reported throughput and latency gains are small but real, which matters more than raw perplexity for document pipelines.
- The tradeoff is sensible: a slight perplexity hit for much lower memory and storage use.
- The air-gapped use case is the strongest part of the pitch, especially for financial and legal workflows where data locality matters.
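For a rough sense of the footprint tradeoff described above, here is a minimal back-of-the-envelope sketch. It assumes a nominal 9 billion parameters and approximate bits-per-weight figures: Q8_0 stores 32 int8 weights plus one fp16 scale per block (8.5 bits per weight), while the Q4_K_M figure is an assumed average that varies with the tensor mix. None of these numbers come from the release's own benchmark data, and `estimate_size_gb` is a hypothetical helper, not part of any tool.

```python
# Rough size estimate for quantized weights: parameters x bits-per-weight / 8.
# The bits-per-weight values are approximations, not figures from the release.

APPROX_BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,     # 32 int8 weights plus one fp16 scale per block
    "Q4_K_M": 4.85,  # assumed average; the real value depends on the tensor mix
}


def estimate_size_gb(n_params: float, quant: str) -> float:
    """Return the approximate weight size in GB (10^9 bytes)."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    return n_params * bits / 8 / 1e9


if __name__ == "__main__":
    n_params = 9e9  # nominal "9B"; the exact parameter count is an assumption
    for quant in APPROX_BITS_PER_WEIGHT:
        print(f"{quant:>7}: ~{estimate_size_gb(n_params, quant):.1f} GB")
```

This covers weights only. Actual memory use at inference time is higher once the KV cache and runtime buffers are included, so treat the output as a lower bound.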
// TAGS
qwen3.5 · gguf · quantization · llama-cpp · structured-extraction · invoices · contracts · financial-reports · local-inference · privacy
DISCOVERED
10d ago
2026-04-01
PUBLISHED
10d ago
2026-04-01
RELEVANCE
8/10
AUTHOR
gvij