Awesome GUI Compute Agents

A curated list of GUI (Graphical User Interface) compute agents - AI systems that can see, understand, and interact with graphical interfaces like humans do.

This project is maintained by Parni and Ian. Follow Supernal Intelligence for more updates.

Website: supernalintelligence.com
Join our Discord: Supernal Intelligence Discord

For more complete and up-to-date data, visit supernalintelligence.com.

What are GUI Compute Agents?

GUI compute agents are AI systems that operate graphical user interfaces the way a person does: by reading the screen and issuing mouse and keyboard input. They can:

See and understand screen elements
Click buttons, type text, and drag elements
Navigate through applications and websites
Complete complex visual workflows
Automate GUI-based tasks through natural language instructions
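The capabilities above amount to an observe, decide, act loop. The sketch below is a minimal illustration of that loop, not the implementation of any agent listed here. `MockScreen`, `Element`, `choose_action`, and `run_agent` are hypothetical names introduced for this example; the screen is an in-memory stand-in for a real display, and a small rule-based function stands in for the vision-language model a real agent would call.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    kind: str   # "textbox" or "button"
    label: str
    x: int
    y: int
    text: str = ""


@dataclass
class MockScreen:
    """Hypothetical stand-in for a real display plus mouse and keyboard."""
    elements: list
    focused: Optional[Element] = None
    submitted: Optional[dict] = None

    def observe(self):
        # A real agent would receive pixels or an accessibility tree here.
        return self.elements, self.focused, self.submitted

    def click(self, x, y):
        target = next((e for e in self.elements if (e.x, e.y) == (x, y)), None)
        if target is None:
            return
        if target.kind == "textbox":
            self.focused = target
        elif target.kind == "button" and target.label == "Submit":
            self.submitted = {e.label: e.text
                              for e in self.elements if e.kind == "textbox"}

    def type_text(self, text):
        if self.focused is not None:
            self.focused.text += text


def choose_action(goal, observation):
    """Rule-based placeholder for the model that maps (goal, screen) to an action."""
    elements, focused, submitted = observation
    if submitted is not None:
        return ("done",)
    for e in elements:
        if e.kind == "textbox" and e.label in goal and e.text != goal[e.label]:
            if focused is not e:
                return ("click", e.x, e.y)   # focus the field first
            return ("type", goal[e.label])
    submit = next(e for e in elements if e.kind == "button")
    return ("click", submit.x, submit.y)


def run_agent(goal, screen, max_steps=20):
    """Observe, decide, act until the policy reports it is done."""
    trace = []
    for _ in range(max_steps):
        action = choose_action(goal, screen.observe())
        trace.append(action)
        if action[0] == "done":
            break
        if action[0] == "click":
            screen.click(action[1], action[2])
        elif action[0] == "type":
            screen.type_text(action[1])
    return trace


screen = MockScreen([
    Element("textbox", "Name", 10, 10),
    Element("textbox", "Email", 10, 40),
    Element("button", "Submit", 10, 70),
])
trace = run_agent({"Name": "Ada", "Email": "ada@example.com"}, screen)
print(trace)
print(screen.submitted)
```

In a real agent, `observe` would return a screenshot or accessibility tree and `choose_action` would be a model call; the `max_steps` cap is a simple guard against a policy that never reaches its goal.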

Commercial Agents

Name	Developer	Status	Key Features	Environment
Ace	General Agents	Upcoming (2025)	Achieved 20× human speed on UI tasks; controls full computer via screen pixels	Desktop, Browser
ACT-1	Adept AI	Released (2022)	Pioneer in digital actions; self-correcting behavior	Desktop, Browser
CloudCruise	CloudCruise	Released	Cloud-based GUI automation; enterprise-grade	Cloud, Browser
Felluo AI	Felluo	Released	Vision-based GUI automation; supports both browser and desktop interactions	Desktop, Browser
Adaptive.AI	Adaptive AI Inc	Released	AI risk management framework; technology strategy consulting	Browser
AgentGPT	Reworkd	Released (2023)	User-friendly interface for creating goal-oriented agents	Browser
AI Agent Studio	Automation Anywhere	Released (2025)	Handles structured and unstructured data; creates AI agents for enterprise automation	Browser
Apple Intelligence Agents	Apple	Upcoming (2025)	Deep OS integration; privacy-focused	Phone, Desktop
AskUI Vision Agent	AskUI	Released	Cross-platform functionality without virtual machines	Desktop, Browser, Phone
Beam AI	Beam	Released	Agentic Process Automation platform for customer support, onboarding, sales proposal generation	Browser
Claude Agent Kit	Anthropic	Upcoming (2024)	Official toolkit for building Claude-powered agents	Browser
Claude Computer Use	Anthropic	Released (2024)	Works on desktop apps and browsers; AI model-based approach	Browser, Desktop, Multi-device
Devin	Cognition Labs	Released (2024)	Full-stack programming capabilities with browser access	Desktop, Browser
Fuyu-Heavy	Adept AI	Released (2024)	Ranked 3rd best vision-action model behind GPT-4V and Gemini Ultra	Desktop, Browser
Gemini 1.5 Pro (Tool Use)	Google	Released	Long context, tool orchestration in Workspace	Browser
Google Mariner	Google DeepMind	Unreleased	High WebVoyager benchmark performance	Browser
Gumloop	Gumloop	Released (2023)	Visual workflow canvas; 90+ pre-built templates; Chrome extension for web automation	Browser
Highlight AI	Embedded Intelligence	Released (2024)	Instant Q&A and automation on desktop; strong privacy focus	Desktop, Browser
Hyperbrowser	Hyperbrowser.ai (YC Backed)	Released (2024)	Sub-second browser launch, 10,000+ concurrent browsers, CAPTCHA solving	Browser
Lindy	Lindy.ai	Released	Virtual AI assistant for daily business tasks	Browser
Manus	Monica AI (China)	Released (2025)	Marketed as the first general AI agent; reported SOTA on GAIA benchmark	Desktop, Browser, Phone
MultiOn (now Please AI)	Please AI	Released (2023)	Multi-step web tasks end-to-end; preference learning	Browser
OpenAI CUA (Operator)	OpenAI	Released (2025)	High benchmark performance; uses reasoning models tech	Browser, Desktop
Perplexity Comet	Perplexity AI	Upcoming (2025)	Autonomous multi-step search with citations	Browser
Project Jarvis	Google	Rumored	Computer-using agent system; few details available	Desktop, Browser
Proxy	Convergence AI	Released (2025)	Handles concurrent sub-tasks; cheaper alternative to Operator	Browser
Relay	Relay.app	Released (2021)	Clean, simple interface; extensive app integrations	Browser
Relevance AI	Relevance AI	Released	Drag-and-drop skill building, templates, integrations	Browser
ServiceNow AI Agents	ServiceNow	Released	Built-in governance, analytics, text-to-action capabilities	Browser
Vy	Vercept	Released (2025)	Advanced human-computer interaction; works with existing applications	Desktop

Open Source Agents

Name	Developer	License	Key Features	Environment
Agent S	Simular AI	Research License	Web research, content summarization, data extraction	Browser, Desktop
Agent S2	Simular AI	Research License	OSWorld: 34.5%; AndroidWorld: 50%; outperforms OpenAI CUA/Operator	Browser, Desktop, Phone
AutoGen	Microsoft	MIT	Agents can converse with each other to solve tasks	Browser
AutoGPT	Significant Gravitas	MIT	Pioneer in autonomous GPT agents; self-prompting with memory	Browser
BabyAGI	Yohei Nakajima	MIT	Autonomous task creation and prioritization	Browser
Browser Use	Y Combinator/ETH Zurich	MIT	Makes websites more digestible for AI agents	Browser
c/ua (Computer-Use Agent)	TryCua	Open Source	High-performance virtualization; fully isolated virtual environments	Desktop, Virtual Machine
CogAgent	Tsinghua Univ. & Zhipu	Research License (CC BY-NC)	High-performance open model rivaling closed models	Desktop, Browser
CrewAI	CrewAI	MIT	Enables orchestration of specialized agents in teams	Browser
HyperAgent	FSoft-AI4Code	Apache 2.0	Handles GitHub issue resolution, repository-level code generation	Browser, Desktop
LangGraph	LangChain	MIT	Framework for building stateful, multi-agent systems	Browser
LLM Agents	NVIDIA/Meta	Research License	Standardized evaluation for LLM agents	Browser
Octo	Google DeepMind	Apache 2.0	Zero-shot generalization to new objects and tasks	Physical World
OpenInterpreter	Open Interpreter	AGPL-3.0	Code interpreter for local execution	Desktop, Browser
OWL	Camel-AI	Apache 2.0	Distributed task automation	Browser
RooCode	Roo Code	Apache 2.0	Autonomous coding in VS Code	Browser, Desktop
Simular AI	Simular	Research License	SOTA on OSWorld and AndroidWorld benchmarks	Desktop, Browser, Phone
Suna	Kortix	Apache 2.0	Highly versatile generalist agent; handles complex tasks	Browser
UI-TARS	ByteDance/TikTok	Research License	Autonomous GUI execution on PC/Mac/Android	Browser, Desktop, Phone
Vercel AI SDK Computer Use	Vercel	Open Source	Standardized API for different AI models; streaming capabilities	Browser, Web
WebVoyager	Hongliang He et al.	Research License	59.1% success on 15-website benchmark	Browser
Felluo AI	Felluo	Proprietary	Vision-based GUI automation	Browser, Desktop

Research Projects

Name	Institution	Focus Area	Release Date
Deep Research Agent	OpenAI	Web browsing, research	2025
Gato	Google DeepMind	Multi-modal, multi-task, multi-embodiment	2022
HuggingGPT (Jarvis)	Microsoft	Orchestrates specialists for multi-modal tasks	2023
I-AFM	Microsoft Research	Multi-modal, multi-task system	2024
Magma	Microsoft Research	Vision-language-action model	2025
mlejva's Computer Agent	Vasek Mlejnsky	GUI interaction	2024
PaLM-E	Google DeepMind & Robotics at Google	Embodied multimodal language model	2023
RT-2	Google DeepMind	Vision-language-action model	2023
SayCan	Google	Grounded language model for robotics	2022
SIMA	Google DeepMind	3D virtual environments	2024
WebAgent	Google DeepMind	Autonomous web browsing and form-filling	2024

By Environment

Browser-Based Agents

Browser-based agents specialize in navigating and interacting with web interfaces:

Name	Developer	Status	Development Type
Hyperbrowser	Hyperbrowser.ai (YC Backed)	Released	Commercial
Perplexity Comet	Perplexity AI	Upcoming	Commercial
Browser Use	Y Combinator/ETH Zurich	Released	Commercial, Open-source
CloudCruise	CloudCruise	Released	Commercial
Deep Research Agent	OpenAI	Released	Commercial, Research
Felluo AI	Felluo	Released	Commercial
Google Mariner	Google DeepMind	Unreleased	Commercial, Research
Gumloop	Gumloop	Released	Commercial
MultiOn (now Please AI)	Please AI	Released	Commercial
Proxy	Convergence AI	Released	Commercial
Suna	Kortix	Released	Open-source
WebVoyager	Hongliang He et al.	Released	Research, Open-source

Desktop Agents

Desktop agents interact with operating system GUIs and desktop applications:

Name	Developer	Status	Development Type
Ace	General Agents	Upcoming	Commercial, Research
Claude Computer Use	Anthropic	Released	Commercial
Felluo AI	Felluo	Released	Commercial
Fuyu-Heavy	Adept AI	Released	Commercial, Research
Highlight AI	Embedded Intelligence	Released	Commercial
OpenAI CUA (Operator)	OpenAI	Released	Commercial
Project Jarvis	Google	Rumored	Commercial, Research
CogAgent	Tsinghua Univ. & Zhipu	Released	Research, Open-source
Vy	Vercept	Released	Commercial
c/ua (Computer-Use Agent)	TryCua	Released	Open-source

Physical World Agents

These agents operate in 3D environments, games, and physical systems:

Name	Developer	Status	Development Type
Gato	Google DeepMind	Released	Research
I-AFM	Microsoft Research	Released	Research
Magma	Microsoft Research	Released	Research, Open-source
Octo	Google DeepMind	Released	Open-source, Research
PaLM-E	Google DeepMind & Robotics at Google	Released	Research
RT-2	Google DeepMind	Released	Research
SayCan	Google	Released	Research
SIMA	Google DeepMind	Released	Research

Cloud Agents

Cloud-based agents running in remote environments:

Name	Developer	Status	Development Type
CloudCruise	CloudCruise	Released	Commercial

Multi-Device Agents

Agents that can operate across multiple device types:

Name	Developer	Status	Supported Devices
Agent S2	Simular AI	Released	Windows, macOS, Linux, Android, iOS
AskUI Vision Agent	AskUI	Released	Windows, macOS, Linux, Android, iOS
Claude Computer Use	Anthropic	Released	Windows, macOS, Linux, Multi-device
Manus	Monica AI (China)	Released	Windows, macOS, Linux, Android, iOS
Simular AI	Simular	Released	Windows, macOS, Linux, Android, iOS
UI-TARS	ByteDance/TikTok	Released	Windows, macOS, Linux, Android, iOS

By Task Complexity

For a full breakdown of agents by task complexity, including Single Workflow, Multiple Workflow, and Complex Workflow Agents, please visit our website: supernalintelligence.com

Resources

Communities

Supernal Intelligence Discord - Join our community to discuss GUI agents, share resources, and connect with others
X/Twitter: @supernalasi - Follow for updates and news about GUI agents and AI advancements
Website: supernalintelligence.com - Official website with more resources and information

Related Awesome Lists

Awesome AI Agent Leaderboards - Comprehensive list of leaderboards for AI agents
Awesome AI Agent Benchmarks - Comprehensive list of benchmarks for AI agents

Contribution

Contributions are welcome. Please read the contribution guidelines first, or email i@supernal.ai to report an error or propose an addition.

License

This awesome list is maintained by Parni and Ian and is released under the MIT License.
