
RecMem cuts agent memory costs by 87%
RecMem is a three-tier memory management framework for LLM agents that optimizes long-term memory construction through a subconscious buffer and recurrence detection. By deferring expensive LLM-based consolidation until significant semantic patterns emerge, it reduces token costs by 87% while maintaining strong results on the LoCoMo and LongMemEval-S benchmarks.
RecMem tackles the "eager consolidation" bottleneck, in which every new observation triggers an LLM call, by modeling agent memory on the multi-store structure of human memory.
- Reduces memory construction token costs by 8.7x compared to current state-of-the-art systems.
- Employs a three-tier architecture: Subconscious (lightweight embeddings), Episodic (narrative summaries), and Semantic (fact recovery).
- Sustained recurrence detection ensures only meaningful information triggers expensive LLM summarization.
- Outperforms existing systems on LoCoMo and LongMemEval-S benchmarks, proving that less frequent consolidation can be more effective.
- The open-source implementation provides a modular framework for developers to swap embedding and LLM backends.
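The deferred-consolidation idea can be illustrated with a minimal Python sketch. This is not RecMem's actual API: the class name `DeferredMemory`, the `sim_threshold` and `min_recurrences` values, the running-centroid clustering, and the toy backends are all assumptions made for illustration, and "recurrence" is reduced here to a simple count of similar observations. The semantic tier is left out for brevity. The embedding and summarization functions are passed in as plain callables to mirror the swappable-backend design.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Cluster:
    centroid: List[float]
    texts: List[str] = field(default_factory=list)


@dataclass
class DeferredMemory:
    """Hypothetical two-tier sketch: subconscious buffer plus episodic store."""
    embed: Callable[[str], List[float]]      # cheap embedding backend (swappable)
    summarize: Callable[[List[str]], str]    # expensive LLM backend (swappable)
    sim_threshold: float = 0.8               # assumed value, not from the source
    min_recurrences: int = 3                 # assumed value, not from the source
    buffer: List[Cluster] = field(default_factory=list)   # subconscious tier
    episodic: List[str] = field(default_factory=list)     # episodic tier
    llm_calls: int = 0

    def observe(self, text: str) -> None:
        vec = self.embed(text)
        # Find the buffered cluster most similar to the new observation.
        best = max(self.buffer, key=lambda c: cosine(c.centroid, vec), default=None)
        if best is None or cosine(best.centroid, vec) < self.sim_threshold:
            # Nothing similar yet: store the embedding only, no LLM call.
            self.buffer.append(Cluster(centroid=list(vec), texts=[text]))
            return
        best.texts.append(text)
        n = len(best.texts)
        # Update the running mean of the cluster's embeddings.
        best.centroid = [(c * (n - 1) + v) / n for c, v in zip(best.centroid, vec)]
        if n >= self.min_recurrences:
            # The pattern has recurred often enough: consolidate with one LLM call.
            self.llm_calls += 1
            self.episodic.append(self.summarize(best.texts))
            self.buffer = [c for c in self.buffer if c is not best]


# Toy stand-ins for real embedding and LLM backends (illustration only).
VOCAB = ["coffee", "meeting", "deadline", "weather"]


def toy_embed(text: str) -> List[float]:
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]


def toy_summarize(texts: List[str]) -> str:
    return f"summary of {len(texts)} related observations"


memory = DeferredMemory(embed=toy_embed, summarize=toy_summarize)
for msg in ["coffee please", "weather is nice", "more coffee",
            "coffee again", "meeting at noon"]:
    memory.observe(msg)

print(memory.llm_calls)    # 1
print(memory.episodic)     # ['summary of 3 related observations']
print(len(memory.buffer))  # 2
```

In this toy run, five observations produce a single summarization call, whereas an eager scheme would make one call per observation. The one-off "weather" and "meeting" messages stay in the buffer as embeddings only.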
DISCOVERED
2026-05-20
PUBLISHED
2026-05-20
AUTHOR
Discover AI