Droidrun tops mobile agent benchmark
Droidrun led a 65-task AndroidWorld benchmark at 43% success, ahead of Mobile-Agent (29%), AutoDroid (14%), and AppAgent (7%). The win came with the highest token burn among the stronger agents, underscoring how expensive reliable mobile automation still is.
The headline win matters, but the bigger story is that the best mobile agent still fails most of the time. This is less a category victory lap than a reminder that mobile automation remains brittle, and that state tracking, recovery, and grounding are the real moat.
- Droidrun's explicit planning seems to buy reliability, but at a clear token premium.
- Mobile-Agent looks like the most balanced option if teams want acceptable performance without the top-end spend.
- AutoDroid is the budget pick, but 14% success is too low for broad deployment.
- AppAgent's vision-heavy pipeline appears to spend a lot and still miss too much.
- For developers, the benchmark says mobile agents are promising for narrow workflows, not yet for fully hands-off autonomy.
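To make "fails most of the time" concrete, the reported rates can be turned back into rough task counts. This is a minimal sketch using only the four percentages and the 65-task total quoted above. The counts are estimates, not reported figures: the published rates are rounded and may be averaged over several runs, so they need not correspond to whole numbers of tasks. The names `success_rate`, `approx_tasks_solved`, and `summary` are illustrative.

```python
# Success rates as quoted in the summary (fractions of the 65-task benchmark).
TOTAL_TASKS = 65
success_rate = {
    "Droidrun": 0.43,
    "Mobile-Agent": 0.29,
    "AutoDroid": 0.14,
    "AppAgent": 0.07,
}


def approx_tasks_solved(rate, total=TOTAL_TASKS):
    """Estimate a whole-task count from a rounded success rate."""
    return round(rate * total)


# Approximate solved/failed counts per agent (estimates only).
summary = {
    name: {
        "solved": approx_tasks_solved(rate),
        "failed": TOTAL_TASKS - approx_tasks_solved(rate),
    }
    for name, rate in success_rate.items()
}

for name, row in summary.items():
    print(f"{name}: ~{row['solved']} solved, ~{row['failed']} failed of {TOTAL_TASKS}")
```

Even for the leading agent, the estimated failed count exceeds the solved count, which is the point the takeaways rest on.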
Discovered: 2026-03-26
Published: 2026-03-26
Author: No-Speech12