acervo-extractor-qwen3.5-9b lands Q4_K_M GGUF build
REDDIT · 10d ago · MODEL RELEASE

This release packages a fine-tuned Qwen3.5 9B extraction model into GGUF, with Q4_K_M quantization aimed at making structured document extraction practical on local hardware. It also includes a Q8_0 option for users who want higher fidelity, alongside benchmark data showing a lower footprint with modest quality loss.

// ANALYSIS

Hot take: this is a pragmatic infra-first model release, not a flashy benchmark win, and that’s exactly the point.

  • Q4_K_M cuts the model down to a size that is actually deployable on constrained machines without giving up the structured-extraction specialization.
  • The reported throughput and latency gains are small but real, which matters more than raw perplexity for document pipelines.
  • The tradeoff is sensible: a slight perplexity hit for much lower memory and storage use.
  • The air-gapped use case is the strongest part of the pitch, especially for financial and legal workflows where data locality matters.
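
To make the memory tradeoff in the bullets above concrete, here is a rough back-of-the-envelope sketch. It is not taken from the release: the bits-per-weight figures are commonly quoted approximations for llama.cpp quantization types (Q4_K_M in particular varies by architecture), the 9e9 parameter count is an assumed nominal value for a "9B" model, and `estimate_gib` is a hypothetical helper. It covers weights only, not the KV cache or runtime overhead.

```python
# Rough GGUF weight-size estimate: parameters * bits-per-weight / 8.
# All bits-per-weight values are approximations, NOT figures from the release.
APPROX_BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,     # 32 int8 weights plus one fp16 scale per block
    "Q4_K_M": 4.85,  # approximate average; the real value varies by model
}


def estimate_gib(n_params: float, quant: str) -> float:
    """Return the approximate weight size in GiB for a quantization type."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    return n_params * bits / 8 / 2**30


if __name__ == "__main__":
    n_params = 9e9  # assumed nominal parameter count for a "9B" model
    for quant in APPROX_BITS_PER_WEIGHT:
        print(f"{quant:>7}: ~{estimate_gib(n_params, quant):.1f} GiB")
```

Under these assumptions the Q4_K_M weights come out at a bit more than half the size of Q8_0 and under a third of F16, which is the kind of gap that decides whether a model fits on a constrained machine. The actual file sizes published with the release should take precedence over this estimate.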
// TAGS
qwen3.5 · gguf · quantization · llama-cpp · structured-extraction · invoices · contracts · financial-reports · local-inference · privacy

DISCOVERED: 2026-04-01 (10d ago)
PUBLISHED: 2026-04-01 (10d ago)
RELEVANCE: 8/10
AUTHOR: gvij