RUMAD cuts debate token costs with RL

// 129d ago · RESEARCH PAPER

RUMAD is a multi-agent debate method, introduced in a research paper, that replaces fixed or fully connected agent communication with an RL controller that dynamically rewires the debate graph. The result is a much cheaper reasoning setup: over 80% lower token cost than fully connected baselines on MMLU and GSM8K while maintaining or improving accuracy, plus zero-shot transfer from MMLU training to GPQA and GSM8K.
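To make the idea of a rewired debate graph concrete, here is a minimal sketch of how per-round edge weights could gate which replies each agent reads. It is an illustration under stated assumptions, not the paper's implementation: the function names, the 0.5 threshold, the weight values, and the flat 200-token reply size are all hypothetical.

```python
from typing import List


def messages_for_round(weights: List[List[float]], threshold: float = 0.5) -> List[List[int]]:
    """For each receiving agent i, list the senders j whose edge weight
    weights[i][j] meets the threshold. Self-edges are skipped."""
    n = len(weights)
    return [
        [j for j in range(n) if j != i and weights[i][j] >= threshold]
        for i in range(n)
    ]


def context_tokens(inbox: List[List[int]], reply_tokens: List[int]) -> int:
    """Total tokens read in one round: each receiver pays for every
    sender's reply that is routed to it."""
    return sum(reply_tokens[j] for senders in inbox for j in senders)


# Three hypothetical agents, each producing a 200-token reply (assumed size).
reply_tokens = [200, 200, 200]

# Fully connected debate: every agent reads every other agent.
full = [[1.0] * 3 for _ in range(3)]

# A sparser graph of the kind a controller might emit for one round.
# These weights are made up for illustration.
sparse = [
    [0.0, 0.9, 0.1],
    [0.2, 0.0, 0.1],
    [0.7, 0.3, 0.0],
]

full_cost = context_tokens(messages_for_round(full), reply_tokens)
sparse_cost = context_tokens(messages_for_round(sparse), reply_tokens)
print(full_cost, sparse_cost)  # 1200 400
```

In this toy setup, pruning four of the six edges cuts the tokens read per round from 1200 to 400. How the weights are chosen is the part an RL-trained controller would supply; the sketch only shows where the savings come from.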

// ANALYSIS

This is a strong paper because it attacks the real bottleneck in multi-agent systems: most debate frameworks waste tokens by treating every agent connection as equally valuable.

–RUMAD uses a PPO-trained controller to adjust edge weights round by round, so agents only exchange information when it is actually useful
–The content-agnostic controller is a smart design choice because it avoids injecting a privileged judge model and keeps coordination separate from reasoning
–The biggest practical win is cost efficiency: 68% accuracy on MMLU at 11.4k tokens versus 49% for full (fully connected) MAD at 62.6k tokens, with similarly large savings on GSM8K
–Zero-shot transfer from MMLU to GPQA and GSM8K suggests the learned communication policy is more general than a benchmark-specific prompt hack
–For agent builders, the paper makes a persuasive case that topology control and agent activation matter as much as model choice when scaling multi-agent reasoning
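The MMLU figures quoted in the list above can be checked against the "over 80%" claim with a few lines of arithmetic. This is a rough sketch: the helper names are hypothetical, and accuracy per thousand tokens is an illustrative ratio chosen here, not a metric taken from the paper.

```python
def token_reduction(baseline_k: float, new_k: float) -> float:
    """Fractional token savings relative to the baseline."""
    return 1.0 - new_k / baseline_k


def accuracy_per_kilotoken(accuracy_pct: float, tokens_k: float) -> float:
    """Accuracy points obtained per thousand tokens spent (illustrative ratio)."""
    return accuracy_pct / tokens_k


# Figures as quoted in this summary (MMLU).
rumad_acc, rumad_tokens_k = 68.0, 11.4
full_acc, full_tokens_k = 49.0, 62.6

reduction = token_reduction(full_tokens_k, rumad_tokens_k)
rumad_eff = accuracy_per_kilotoken(rumad_acc, rumad_tokens_k)
full_eff = accuracy_per_kilotoken(full_acc, full_tokens_k)

print(f"token reduction: {reduction:.1%}")  # about 81.8%
print(f"accuracy per 1k tokens: {rumad_eff:.2f} vs {full_eff:.2f}")
```

The reduction comes out at roughly 81.8%, consistent with the headline claim of over 80% lower token cost on MMLU.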

// TAGS

rumad · agent · reasoning · llm · research

DISCOVERED

129d ago

2026-03-06

PUBLISHED

129d ago

2026-03-06

RELEVANCE

8/10

AUTHOR

Discover AI
