OPEN_SOURCE ↗
REDDIT · 4h ago · NEWS
Production RAG Hits Three Failure Modes
A Reddit post from a legal-domain RAG operator says the system works for most queries but fails predictably on scattered multi-document questions, clean abstention, and time-sensitive comparisons. The ask is for production tactics that improve retrieval without rebuilding the whole stack.
// ANALYSIS
This reads like the real gap between demo-grade RAG and production QA: the hard problems are not answer generation but retrieval control, abstention, and temporal scoping.
- Scatter usually needs structure, not just larger `k`: hybrid retrieval, query routing, and graph or facet-based expansion tend to outperform raw vector top-k on broad comparison tasks
- Negative knowledge needs an explicit reject path: if the retriever can’t support answerability, the system should abstain before the LLM ever gets a chance to improvise
- Temporal questions are often two retrieval problems, not one: split pre/post retrieval, then force the generator to compare evidence across the boundary
- GraphRAG-style indexing may help on the scatter case, but it is not a blanket fix for uncertainty or chronology
- The production pattern that seems most robust is an answerability gate plus multiple targeted retrieval passes, not a single prompt with better instructions
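The gate-plus-targeted-passes pattern can be sketched in a few lines of Python. This is a minimal illustration, not the Reddit poster's system: `Passage`, `retrieve`, `supported`, and `gather_evidence` are hypothetical names, the word-overlap score stands in for a real hybrid or vector retriever, the `effective` date field is assumed metadata, and the `min_score` threshold of 0.5 is arbitrary and would need tuning on real data.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Passage:
    text: str
    effective: date  # assumed metadata: when the passage took effect


def retrieve(corpus: List[Passage], query: str,
             start: Optional[date] = None, end: Optional[date] = None,
             k: int = 3) -> List[Tuple[float, Passage]]:
    """Toy retriever: score = fraction of query words found in the passage.

    Only passages with start <= effective < end are considered.
    """
    q = set(query.lower().split())
    if not q:
        return []
    scored = []
    for p in corpus:
        if start is not None and p.effective < start:
            continue
        if end is not None and p.effective >= end:
            continue
        overlap = len(q & set(p.text.lower().split())) / len(q)
        scored.append((overlap, p))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def supported(hits: List[Tuple[float, Passage]], min_score: float) -> List[Passage]:
    """Keep only hits strong enough to count as evidence."""
    return [p for score, p in hits if score >= min_score]


def gather_evidence(corpus: List[Passage], query: str,
                    boundary: Optional[date] = None,
                    min_score: float = 0.5) -> Optional[Dict[str, List[Passage]]]:
    """Return evidence per time window, or None to abstain before any LLM call."""
    if boundary is None:
        windows = {"all": (None, None)}
    else:
        # Temporal question: one retrieval pass per side of the boundary.
        windows = {"before": (None, boundary), "after": (boundary, None)}
    evidence: Dict[str, List[Passage]] = {}
    for name, (start, end) in windows.items():
        kept = supported(retrieve(corpus, query, start, end), min_score)
        if not kept:
            return None  # answerability gate: a window has no support
        evidence[name] = kept
    return evidence


corpus = [
    Passage("notice period for termination is 30 days", date(2020, 1, 1)),
    Passage("notice period for termination is 60 days", date(2023, 1, 1)),
]
result = gather_evidence(corpus, "notice period termination", boundary=date(2022, 1, 1))
no_support = gather_evidence(corpus, "patent royalty rates")
```

Here `result` holds one passage under `"before"` and one under `"after"`, which a generator prompt could then be required to compare, while `no_support` is `None` and the caller would return an abstention without invoking the LLM.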
// TAGS
rag · llm · search · prompt-engineering
DISCOVERED
4h ago
2026-04-27
PUBLISHED
7h ago
2026-04-27
RELEVANCE
9/10
AUTHOR
Fabulous-Pea-5366