
Commit

Update update_paper_list.md
boyugou authored Jan 4, 2025
1 parent 87d18ea commit eaa7abc
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion update_template_or_data/update_paper_list.md
@@ -445,7 +445,7 @@
   - 📅 Date: October 7, 2024
   - 📑 Publisher: arXiv
   - 💻 Env: [GUI]
-  - 🔑 Key: [framework], [visual grounding], [GUI agents], [cross-platform generalization], [UGround], [SeeAct-V], [synthetic data]
+  - 🔑 Key: [framework], [model], [dataset], [visual grounding], [GUI agents], [cross-platform generalization], [UGround], [SeeAct-V], [synthetic data]
   - 📖 TLDR: This paper introduces UGround, a universal visual grounding model for GUI agents that enables human-like navigation of digital interfaces. The authors advocate for GUI agents with human-like embodiment that perceive the environment entirely visually and take pixel-level actions. UGround is trained on a large-scale synthetic dataset of 10M GUI elements across 1.3M screenshots. Evaluated on six benchmarks spanning grounding, offline, and online agent tasks, UGround significantly outperforms existing visual grounding models by up to 20% absolute. Agents using UGround achieve comparable or better performance than state-of-the-art agents that rely on additional textual input, demonstrating the feasibility of vision-only GUI agents.

 - [ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning](https://agent-e3.github.io/ExACT/)
