Code to evaluate WebArena #13
Comments
Hi,
During inference, we directly adopt the JSON format output or any format requested in the system prompt. The chat format data is used for training only.
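For anyone trying to reproduce this, here is a minimal sketch of how a JSON-format reply could be parsed at inference time. It is not the authors' code: the function name `extract_json_action` and the field names in the example (`action_type`, `element_id`) are hypothetical, and it assumes the model may wrap the JSON object in some extra text.

```python
import json


def extract_json_action(model_output: str) -> dict:
    """Return the first JSON object found in a model's raw text output.

    Hypothetical helper for illustration; the real evaluation code may
    parse the output differently.
    """
    decoder = json.JSONDecoder()
    for start, ch in enumerate(model_output):
        if ch != "{":
            continue
        try:
            # raw_decode parses one JSON value starting at index `start`
            # and ignores any trailing text after it.
            obj, _ = decoder.raw_decode(model_output, start)
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict):
            return obj
    raise ValueError("no JSON object found in model output")


# Example with made-up field names:
reply = 'Here is the action:\n{"action_type": "click", "element_id": 12}'
action = extract_json_action(reply)
```

If the default prompt asks for a different format, the parser would need to match that format instead, and a mismatch here could plausibly explain a very low success rate.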
Thank you very much for the info. We attempted to reproduce the result with the default prompt, but the SR is only 0.61%. Would you mind sharing the recorded trajectories so that we can compare them and see what may be going wrong on our end?
Hello, our project was evaluated in January 2024, so you might need to switch to an earlier official version of WebArena: web-arena-x/webarena@14f91d9. The website Docker images we used were downloaded from the official address https://github.com/web-arena-x/webarena/tree/main/environment_docker#wikipedia-website.
Hi,
Thanks for the great work. I am wondering if you have plans to release the code to run WebArena?