Droidrun tops mobile agent benchmark
REDDIT // 16d ago // BENCHMARK RESULT

Droidrun led a 65-task AndroidWorld benchmark at 43% success, ahead of Mobile-Agent (29%), AutoDroid (14%), and AppAgent (7%). The win came with the highest token burn among the stronger agents, underscoring how expensive reliable mobile automation still is.
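
As a rough illustration of what these percentages mean in practice, the sketch below converts each reported success rate into an approximate count of solved tasks out of 65 and the matching failure rate. The helper names (`approx_tasks_solved`, `failure_rate_pct`) are hypothetical. The post reports only rounded percentages, so the counts are estimates, not published figures.

```python
# Success rates exactly as reported in the summary; 65 is the stated task count.
TOTAL_TASKS = 65
SUCCESS_RATE_PCT = {
    "Droidrun": 43,
    "Mobile-Agent": 29,
    "AutoDroid": 14,
    "AppAgent": 7,
}


def approx_tasks_solved(rate_pct: int, total: int = TOTAL_TASKS) -> int:
    """Estimate solved tasks from a rounded percentage (hypothetical helper)."""
    return round(rate_pct * total / 100)


def failure_rate_pct(rate_pct: int) -> int:
    """Share of tasks not completed, in percent (hypothetical helper)."""
    return 100 - rate_pct


if __name__ == "__main__":
    for agent, rate in SUCCESS_RATE_PCT.items():
        solved = approx_tasks_solved(rate)
        print(
            f"{agent}: ~{solved} of {TOTAL_TASKS} tasks solved, "
            f"{failure_rate_pct(rate)}% failed"
        )
```

Because the inputs are rounded, an estimated count may be off by one task for any given agent.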

// ANALYSIS

The headline win matters, but the bigger story is that the best mobile agent still fails on 57% of tasks. This is less a category victory lap than a reminder that mobile automation remains brittle, and that state tracking, recovery, and grounding are the real moat.

  • Droidrun's explicit planning seems to buy reliability, but at a clear token premium.
  • Mobile-Agent looks like the most balanced option if teams want acceptable performance without the top-end spend.
  • AutoDroid is the budget pick, but 14% success is too low for broad deployment.
  • AppAgent's vision-heavy pipeline appears to spend a lot and still miss too much.
  • For developers, the benchmark says mobile agents are promising for narrow workflows, not yet for fully hands-off autonomy.

// TAGS

droidrun, benchmark, agent, computer-use, automation, research

DISCOVERED

16d ago

2026-03-26

PUBLISHED

17d ago

2026-03-26

RELEVANCE

8/10

AUTHOR

No-Speech12