Researchers introduce HORMA, a Hierarchical Organize-and-Retrieve Memory Agent that organizes working memory into a file-system-like hierarchy to reduce token usage and latency in long-horizon tasks.
HORMA addresses challenges that LLM agents face in long-horizon tasks, such as context overload and loss of temporal structure. It structures the agent's working memory as a file-system-like workspace, using file-system operations to organize raw interaction trajectories into semantically structured, linked notes. A lightweight retrieval policy, trained with reinforcement learning, then navigates this hierarchy to extract the minimal sufficient context for the current task. On benchmarks including ALFWorld, LoCoMo, and LongMemEval, HORMA reports better efficiency-performance trade-offs, with token consumption in long conversations falling to as low as 22% of baseline usage.
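To make the file-system-like workspace more concrete, here is a minimal sketch of what such a memory store could look like. It is not HORMA's implementation: the `MemoryWorkspace` and `Note` classes, the method names, and the example paths are all hypothetical, chosen only to illustrate notes stored under directory-like paths with links between them.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    """A single memory note: text plus links to related note paths."""
    text: str
    links: set = field(default_factory=set)


class MemoryWorkspace:
    """Hypothetical in-memory workspace keyed by slash-separated paths."""

    def __init__(self) -> None:
        self._notes: dict = {}

    def write(self, path: str, text: str) -> None:
        # Create a note, or overwrite its text while keeping existing links.
        existing = self._notes.get(path)
        links = existing.links if existing else set()
        self._notes[path] = Note(text=text, links=links)

    def read(self, path: str) -> str:
        return self._notes[path].text

    def delete(self, path: str) -> None:
        # Remove the note and any links that point at it.
        self._notes.pop(path, None)
        for note in self._notes.values():
            note.links.discard(path)

    def link(self, src: str, dst: str) -> None:
        # Record a directed link between two existing notes.
        if src not in self._notes or dst not in self._notes:
            raise KeyError("both notes must exist before linking")
        self._notes[src].links.add(dst)

    def links_of(self, path: str) -> list:
        return sorted(self._notes[path].links)

    def list_dir(self, prefix: str) -> list:
        # Return the immediate children of a directory-like prefix.
        prefix = prefix.rstrip("/") + "/"
        children = set()
        for path in self._notes:
            if path.startswith(prefix):
                children.add(prefix + path[len(prefix):].split("/")[0])
        return sorted(children)


# Illustrative usage with made-up trajectory notes.
ws = MemoryWorkspace()
ws.write("/tasks/kitchen/find_mug", "Mug was on the countertop.")
ws.write("/tasks/kitchen/heat_mug", "Microwave heats the mug.")
ws.write("/dialogue/session1/summary", "User prefers tea.")
ws.link("/tasks/kitchen/heat_mug", "/tasks/kitchen/find_mug")
top_level = ws.list_dir("/")
```

Keeping links separate from the path hierarchy lets a note live in one directory while still pointing at related notes elsewhere, which is one plausible reading of "semantically structured, linked notes".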
Hierarchical workspaces point to a shift in complex agentic reasoning, from simple vector similarity search toward structured, stateful memory management.
* The file-system-like abstraction mirrors how humans organize files and projects, letting agents use standard CRUD-like memory operations.
* Using RL to train a navigation policy helps avoid retrieving massive chunks of unnecessary context, directly targeting the latency and cost bottlenecks of long context windows.
* The system constructs memories dynamically using a skill acquisition process, indicating a push towards self-improving agents.
* The reduction of up to 78% in token usage on long conversations (down to as low as 22% of baseline) makes complex agent deployment significantly more economically viable.
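The navigation idea in the points above can be sketched as a walk down the hierarchy that collects only the notes that fit a token budget. This is an illustration under stated assumptions, not the paper's method: keyword overlap stands in for the RL-trained policy, whitespace splitting stands in for a real tokenizer, and the `retrieve`, `score`, and `tokens` helpers and the example tree are hypothetical.

```python
def tokens(text: str) -> int:
    # Crude whitespace token count; a real system would use the model tokenizer.
    return len(text.split())


def score(query: str, text: str) -> int:
    # Keyword overlap is a stand-in for the learned policy's preference.
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve(tree: dict, query: str, budget: int) -> list:
    """Descend one directory per level, then pick matching notes within budget."""
    node = tree
    # Directories are dicts and notes are strings. Only directory names are
    # scored during descent, so unrelated branches are never read.
    while any(isinstance(child, dict) for child in node.values()):
        dirs = {name: sub for name, sub in node.items() if isinstance(sub, dict)}
        best = max(dirs, key=lambda name: score(query, name))
        node = dirs[best]
    notes = sorted(node.values(), key=lambda t: score(query, t), reverse=True)
    picked, used = [], 0
    for text in notes:
        cost = tokens(text)
        # Skip notes that do not match the query or would exceed the budget.
        if score(query, text) == 0 or used + cost > budget:
            continue
        picked.append(text)
        used += cost
    return picked


# Made-up hierarchy for illustration.
tree = {
    "kitchen": {
        "objects": {
            "mug": "the mug is on the countertop",
            "knife": "the knife is in the drawer",
        },
        "appliances": {"microwave": "the microwave heats items"},
    },
    "dialogue": {"session1": {"prefs": "user prefers tea over coffee"}},
}
context = retrieve(tree, "mug location kitchen objects", budget=10)
```

In this toy run only one short note reaches the prompt while the other branches are never opened, which is the mechanism behind the token savings the summary describes.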
DISCOVERED
2026-06-12
PUBLISHED
2026-06-12
AUTHOR
Discover AI