Droidrun tops mobile agent benchmark

// 107d ago · BENCHMARK RESULT

Droidrun led a 65-task AndroidWorld benchmark at 43% success, ahead of Mobile-Agent (29%), AutoDroid (14%), and AppAgent (7%). The win came with the highest token burn among the stronger agents, underscoring how expensive reliable mobile automation still is.

// ANALYSIS

The headline win matters, but the bigger story is that the best mobile agent still fails most of the time: at 43% success, Droidrun misses 57% of tasks. This is less a category victory lap than a reminder that mobile automation remains brittle, and that state tracking, recovery, and grounding are the real moat.

– Droidrun's explicit planning seems to buy reliability, but at a clear token premium.
– Mobile-Agent looks like the most balanced option if teams want acceptable performance without the top-end spend.
– AutoDroid is the budget pick, but 14% success is too low for broad deployment.
– AppAgent's vision-heavy pipeline appears to spend a lot and still miss too much.
– For developers, the benchmark says mobile agents are promising for narrow workflows, not yet for fully hands-off autonomy.
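
For readers who want to check the arithmetic behind these takeaways, here is a minimal Python sketch that turns the reported success rates into failure rates and rough solved-task counts. The function names and the rounding are illustrative assumptions. The benchmark post gives only percentages, so the task counts are estimates rather than reported figures, and token costs are left out because no numbers were given.

```python
# Success rates as reported in the summary above (percent of 65 AndroidWorld tasks).
REPORTED_SUCCESS_PCT = {
    "Droidrun": 43,
    "Mobile-Agent": 29,
    "AutoDroid": 14,
    "AppAgent": 7,
}
TOTAL_TASKS = 65


def summarize(rates, total_tasks):
    """Return one row per agent with its failure rate and an estimated solved count."""
    rows = []
    for agent, success in rates.items():
        rows.append({
            "agent": agent,
            "success_pct": success,
            "failure_pct": 100 - success,
            # Estimate only: the reported percentages are rounded, so this is
            # not a count taken from the benchmark itself.
            "approx_solved": round(total_tasks * success / 100),
        })
    return rows


def lead_over_runner_up(rates):
    """Gap in percentage points between the best and second-best success rate."""
    ordered = sorted(rates.values(), reverse=True)
    return ordered[0] - ordered[1]


if __name__ == "__main__":
    for row in summarize(REPORTED_SUCCESS_PCT, TOTAL_TASKS):
        print(f"{row['agent']:<13} success {row['success_pct']:>2}%  "
              f"failure {row['failure_pct']:>2}%  ~{row['approx_solved']} tasks solved")
    print("Lead over runner-up:", lead_over_runner_up(REPORTED_SUCCESS_PCT), "points")
```

Read this way, the 14-point lead over Mobile-Agent is real, but every agent in the table still fails more tasks than it completes.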

// TAGS

droidrun · benchmark · agent · computer-use · automation · research

DISCOVERED

107d ago

2026-03-26

PUBLISHED

108d ago

2026-03-26

RELEVANCE

8/10

AUTHOR

No-Speech12
