AndroidWorld: An Open World for Autonomous Agents

Jonathan Waltz
Marybeth Fair
Daniel Toyama
Will Bishop
Sarah Clinckemaillie
Timothy Lillicrap
Chris Rawles
Robert Berry
Gabrielle Lau
Divya Tyam
Yifan Chang
Alice Li
Folawiyo Campbell-Ajala
Wei Li
ICLR 2025 (2025)

Abstract

Autonomous computer control agents that execute human tasks by controlling user interfaces (UIs) are emerging. Such agents would be valuable for humans, and progress in the field will be driven by realistic and reproducible benchmarks. We present AndroidWorld, a fully functioning Android environment that provides reward signals for 114 programmatic tasks across 20 apps. Instead of a static test set, the tasks in AndroidWorld are parameterized, allowing for unlimited variation in language and task parameters. Reward signals are derived from Android system state, making them highly durable and extensible across different applications. To demonstrate AndroidWorld's extensibility, we integrate the popular MiniWoB++ into it. To evaluate AndroidWorld, we introduce a new multimodal autonomous agent for Android, M3A. Our agent achieves a 27% success rate, leaving ample room for future work. Furthermore, we adapt a popular desktop web agent for Android, which we find to be less effective on mobile, suggesting future research is needed to build universal, cross-domain agents. Finally, we conduct robustness testing by evaluating M3A against a suite of real-world variations on a representative subset of tasks. AndroidWorld and the experiments in this paper are available at https://github.com/google-research/android-world.
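
To make the idea of parameterized tasks with state-derived rewards concrete, the following is a minimal illustrative sketch, not the AndroidWorld implementation. All names here (`ParameterizedTask`, `sample_task`, `reward`, and the dictionary standing in for device state) are hypothetical and chosen only for illustration. The sketch assumes a task is a goal template filled with randomly drawn parameters, and that success is checked against the resulting system state rather than against the agent's action trace.

```python
import random
from dataclasses import dataclass


@dataclass
class ParameterizedTask:
    """Hypothetical task: a natural-language template plus sampled parameters."""
    template: str
    params: dict

    @property
    def goal(self) -> str:
        # Fill the template with this instance's parameters.
        return self.template.format(**self.params)


def sample_task(seed: int) -> ParameterizedTask:
    """Draw task parameters from a seeded RNG so each instance is reproducible."""
    rng = random.Random(seed)
    params = {
        "name": rng.choice(["Alice", "Bob", "Chen"]),
        "number": "555-" + "".join(str(rng.randint(0, 9)) for _ in range(4)),
    }
    return ParameterizedTask(
        "Add a contact named {name} with number {number}.", params
    )


def reward(task: ParameterizedTask, device_state: dict) -> float:
    """Return 1.0 if the (assumed) device state reflects the task goal, else 0.0.

    `device_state` is a plain dict standing in for real system state.
    """
    contacts = device_state.get("contacts", {})
    matches = contacts.get(task.params["name"]) == task.params["number"]
    return 1.0 if matches else 0.0
```

In this sketch, changing the seed yields a new task instance with a different goal string, while the reward check stays the same because it inspects only the final state.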