For quick navigation, you can visit AI Agent Marketplace and AI Agent Search Engine to find and list your AI Agent.
Type | Year | Agent Name | Paper | Paper URL | GitHub | Website | Demo URL |
---|---|---|---|---|---|---|---|
GUI Agent | 2024 | OS-ATLAS | OS-ATLAS: A Foundation Action Model For Generalist GUI Agents | https://arxiv.org/pdf/2410.23218 | https://github.com/OS-Copilot/OS-Atlas?tab=readme-ov-file | https://osatlas.github.io/ | - |
GUI Agent | 2024 | AutoGLM | AutoGLM: Autonomous Foundation Agents for GUIs | https://arxiv.org/abs/2411.00820 | https://xiao9905.github.io/AutoGLM/ | https://xiao9905.github.io/AutoGLM/ | - |
GUI Agent | 2024 | EDGE | EDGE: Enhanced Grounded GUI Understanding with Enriched Multi-Granularity Synthetic Data | https://arxiv.org/pdf/2410.19461 | - | - | - |
GUI Agent | 2024 | FERRET-UI-2 | Mastering Universal User Interface Understanding Across Platforms | https://arxiv.org/pdf/2410.18967 | - | - | - |
GUI Agent | 2024 | ShowUI | One vision-language-action model for generalist GUI agent | https://arxiv.org/pdf/2411.17465v1 | https://github.com/showlab/ShowUI | - | - |
GUI Agent | 2024 | Tinyclick | Tinyclick: Single-turn agent for empowering GUI automation | https://arxiv.org/pdf/2410.11871 | https://github.com/SamsungLabs/TinyClick | - | - |
GUI Agent | 2024 | Openwebvoyager | Building multimodal web agents via iterative real-world exploration, feedback and optimization | https://arxiv.org/abs/2410.19609 | - | - | - |
GUI Agent | 2024 | OSCAR | Operating System Control via State-Aware Reasoning and Re-Planning | https://arxiv.org/pdf/2410.18963 | - | - | - |
GUI Agent | 2024 | PUMA | Large language models empowered personalized web agents | https://arxiv.org/abs/2410.17236 | - | - | - |
GUI Agent | 2024 | Agentoccam | A simple yet strong baseline for LLM-based web agents | https://arxiv.org/abs/2410.13825 | - | - | - |
GUI Agent | 2024 | Agent S | Agent S: An open agentic framework that uses computers like a human | https://arxiv.org/abs/2410.08164 | - | - | - |
GUI Agent | 2024 | ClickAgent | Enhancing UI location capabilities of autonomous agents | https://arxiv.org/abs/2410.11872 | - | - | - |
GUI Agent | 2024 | LSFS | From Commands to Prompts: LLM-based Semantic File System for AIOS | - | https://github.com/agiresearch/AIOS-LSFS | - | - |
GUI Agent | 2024 | Naviqate | Functionality-guided web application navigation | https://arxiv.org/abs/2409.10741 | - | - | - |
GUI Agent | 2024 | Periguru | Periguru: A peripheral robotic mobile app operation assistant based on GUI image understanding and prompting with LLM | https://www.arxiv.org/abs/2409.09354 | - | - | - |
GUI Agent | 2024 | Openwebagent | Openwebagent: An open toolkit to enable web agents on large language models | https://aclanthology.org/2024.acl-demos.8/ | - | - | - |
GUI Agent | 2024 | LLMCI | Towards LLMCI-multimodal AI for LLM-vision UI operation | https://assets-eu.researchsquare.com/files/rs-4653823/v1_covered_70334cd2-05bb-4c26-b0d4-73f9aec48ebc.pdf | - | - | - |
GUI Agent | 2024 | Agent-E | From autonomous web navigation to foundational design principles in agentic systems | https://arxiv.org/abs/2407.13032 | - | - | - |
GUI Agent | 2024 | Cradle | Empowering foundation agents towards general computer control | https://arxiv.org/abs/2403.03186 | - | - | - |
GUI Agent | 2024 | CoAT | Android in the zoo: Chain-of-action-thought for GUI agents | https://arxiv.org/abs/2403.02713 | - | - | - |
GUI Agent | 2024 | Self-MAP | On the multi-turn instruction following for conversational web agents | https://arxiv.org/abs/2402.15057 | - | - | - |
GUI Agent | 2024 | OS-Copilot | OS-Copilot: Towards generalist computer agents with self-improvement | https://arxiv.org/abs/2402.07456 | - | - | - |
GUI Agent | 2024 | Mobile-Agent | Mobile-Agent: Autonomous multi-modal mobile device agent with visual perception | https://arxiv.org/abs/2401.16158 | - | - | - |
GUI Agent | 2024 | WebVoyager | WebVoyager: Building an end-to-end web agent with large multimodal models | https://arxiv.org/abs/2401.13919 | - | - | - |
GUI Agent | 2024 | Mobileagent AIA | MobileAgent: Enhancing Mobile Control via Human-Machine Interaction and SOP Integration | https://arxiv.org/pdf/2401.04124 | - | - | - |
GUI Agent | 2024 | SeeAct | GPT-4V(ision) is a generalist web agent, if grounded | https://arxiv.org/abs/2401.01614 | https://github.com/OSU-NLP-Group/SeeAct | - | - |
GUI Agent | 2023 | AppAgent | AppAgent: Multimodal agents as smartphone users | https://arxiv.org/abs/2312.13771 | - | - | - |
GUI Agent | 2023 | ACE | AssistGUI: Task-oriented desktop graphical user interface automation | https://arxiv.org/abs/2312.13108 | - | - | - |
GUI Agent | 2023 | MobileGPT | Explore, select, derive, and recall: Augmenting LLM with human-like memory for mobile task automation | https://arxiv.org/abs/2312.03003 | https://mobile-gpt.github.io/ | - | - |
GUI Agent | 2023 | MM-Navigator | GPT-4V in wonderland: Large multimodal models for zero-shot smartphone GUI navigation | https://arxiv.org/pdf/2311.07562 | - | - | - |
GUI Agent | 2023 | Webwise | Webwise: Web interface control and sequential exploration with large language models | https://arxiv.org/abs/2310.16042 | - | - | - |
GUI Agent | 2023 | Laser | Laser: LLM agent with state-space exploration for web navigation | https://arxiv.org/abs/2309.08172 | - | - | - |
GUI Agent | 2023 | Synapse | Trajectory-as-exemplar prompting with memory for computer control | https://arxiv.org/pdf/2306.07863 | - | - | - |
GUI Agent | 2023 | SheetCopilot | SheetCopilot: Bringing software productivity to the next level through large language models | https://arxiv.org/abs/2305.19308 | - | - | - |
GUI Agent | 2023 | RCI | Language models can solve computer tasks | https://arxiv.org/abs/2303.17491 | - | - | - |
GUI Agent | 2023 | mobile UIs | Enabling conversational interaction with mobile UI using large language models | https://arxiv.org/abs/2209.08655 | - | - | - |

Application | AI Agent | Description |
---|---|---|
Clinical Decision Support | IBM Watson Health | Helps analyze medical literature and patient data to provide evidence-based treatment recommendations. |
Clinical Decision Support | Infermedica | Provides diagnostic support and triage for healthcare professionals. |
Clinical Decision Support | Buoy Health | A chatbot for symptom checking and guiding users to appropriate care. |
Virtual Health Assistants | Ada Health | Symptom checker and health assessment app with AI-driven insights. |
Virtual Health Assistants | Babylon Health | Provides symptom checks, virtual consultations, and health monitoring tools. |
Virtual Health Assistants | Sensely | A virtual assistant for managing chronic conditions and improving patient engagement. |
Patient Management | HealthTap | Offers virtual doctor consultations and AI-based symptom checking. |
Patient Management | Ginger | Mental health AI for providing chat-based therapy and support. |
Medical Imaging and Diagnostics | Aidoc | AI-powered imaging analysis for detecting anomalies in radiology scans. |
Medical Imaging and Diagnostics | Zebra Medical Vision | Automates the analysis of medical imaging data for early detection of diseases. |
Medical Imaging and Diagnostics | Arterys | Uses AI for advanced imaging in cardiology, oncology, and other specialties. |
Drug Discovery and Development | Atomwise | AI-driven platform for discovering potential drug compounds. |
Drug Discovery and Development | BenevolentAI | Combines AI and biomedical data to accelerate drug discovery. |
Drug Discovery and Development | DeepMind (AlphaFold) | Predicts 3D protein structures from amino acid sequences to support pharmaceutical research. |
Mental Health and Therapy | Woebot | A chatbot offering cognitive behavioral therapy (CBT) for anxiety and depression. |
Mental Health and Therapy | Replika | Acts as a conversational agent for emotional support and companionship. |
Mental Health and Therapy | Wysa | AI for mental health assistance, focusing on stress and emotional well-being. |
Personalized Medicine | 23andMe AI | Uses genetic data to provide personalized health insights and risk assessments. |
Personalized Medicine | Tempus | Employs AI to personalize cancer treatments based on genetic and molecular data. |
Chronic Disease Management | Livongo | Combines AI with connected devices for managing diabetes, hypertension, and weight. |
Chronic Disease Management | Omada Health | AI-based coaching for chronic disease prevention and management. |
Elderly Care and Accessibility | ElliQ by Intuition Robotics | AI companion for reducing loneliness and enhancing quality of life in seniors. |
Elderly Care and Accessibility | Paro Robot | Therapeutic AI robot designed for dementia and Alzheimer’s patients. |
Workflow Optimization | DeepScribe | AI-powered medical scribe to automate clinical documentation. |
Workflow Optimization | Nuance Dragon Medical | Speech recognition AI for documenting patient encounters. |

OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use