From 7edb7d4ac7d69b21b5d1538202bb761c48a1f7e1 Mon Sep 17 00:00:00 2001
From: Boyu Gou <103808989+boyugou@users.noreply.github.com>
Date: Fri, 20 Dec 2024 08:19:29 -0500
Subject: [PATCH] Update update_paper_list.md

---
 update_template_or_data/update_paper_list.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/update_template_or_data/update_paper_list.md b/update_template_or_data/update_paper_list.md
index fc5a9e6..4f69394 100644
--- a/update_template_or_data/update_paper_list.md
+++ b/update_template_or_data/update_paper_list.md
@@ -7,6 +7,18 @@
     - 🔑 Key: [survey]
     - 📖 TLDR: This survey provides a comprehensive overview of GUI agents powered by Large Foundation Models, detailing their benchmarks, evaluation metrics, architectures, and training methods. It introduces a unified framework outlining their perception, reasoning, planning, and acting capabilities, identifies open challenges, and discusses future research directions, serving as a resource for both practitioners and researchers in the field.
 
+
+- [OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use](https://github.com/OS-Agent-Survey/OS-Agent-Survey/blob/main/paper.pdf)
+    - Xueyu Hu, Tao Xiong, Biao Yi, Zishu Wei, Ruixuan Xiao, Yurun Chen, Jiasheng Ye, Meiling Tao, Xiangxin Zhou, Ziyu Zhao, Yuhuai Li, Shengze Xu, Shawn Wang, Xinchen Xu, Shuofei Qiao, Kun Kuang, Tieyong Zeng, Liang Wang, Jiwei Li, Yuchen Eleanor Jiang, Wangchunshu Zhou, Guoyin Wang, Keting Yin, Zhou Zhao, Hongxia Yang, Fan Wu, Shengyu Zhang, Fei Wu
+    - 🏛️ Institutions: Zhejiang University, Fudan University, OPPO AI Center, University of Chinese Academy of Sciences, Institute of Automation, Chinese Academy of Sciences, The Chinese University of Hong Kong, Tsinghua University, 01.AI, The Hong Kong Polytechnic University, Shanghai Jiao Tong University
+    - 📅 Date: December 20, 2024
+    - 📑 Publisher: https://os-agent-survey.github.io/
+    - 💻 Env: [GUI]
+    - 🔑 Key: [survey]
+    - 📖 TLDR: This survey aims to advance the research and development of OS Agents by providing a detailed exploration of their fundamental capabilities, methodologies for building them using (M)LLMs, and emerging trends in the field. While OS Agents are still in the early stages of growth, the rapid evolution of technology continues to introduce innovative approaches and applications. This work seeks to highlight ongoing challenges, future opportunities, and the latest developments, encouraging further research and industrial adoption. Ultimately, the authors hope this study will serve as a catalyst for innovation, driving meaningful progress in both academia and industry.
+
+
+
 - [Falcon-UI: Understanding GUI Before Following User Instructions](https://arxiv.org/abs/2412.09362)
     - Huawen Shen, Chang Liu, Gengluo Li, Xinlong Wang, Yu Zhou, Can Ma, Xiangyang Ji
     - 🏛️ Institutions: Chinese Academy of Sciences, Tsinghua University, Nankai University, BAAI