Support qwen2.5-VL in sft.py and solve GRPO deepspeed training issue #110

Open
wants to merge 3 commits into
base: main

Conversation

LiuRicky
Contributor

@LiuRicky LiuRicky commented Feb 18, 2025

#109

This also fixes the problem that occurs when training with DeepSpeed ZeRO-3, as shown in huggingface/transformers@8ee5053.

With these changes applied, you can add "--deepspeed r1-v/local_scripts/zero3.json" to the training script when using DeepSpeed.
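For readers unfamiliar with ZeRO-3 configs, here is a minimal sketch of what such a JSON file might contain. The actual contents of r1-v/local_scripts/zero3.json are not shown in this thread, so every value below is an assumption based on commonly used DeepSpeed keys, and the output file name `zero3.example.json` is hypothetical. The "auto" values rely on the Hugging Face Trainer integration filling them in from the training arguments.

```python
import json
from pathlib import Path


def build_zero3_config() -> dict:
    """Return an illustrative ZeRO stage 3 config (assumed values, not the PR's file)."""
    return {
        # "auto" lets the Hugging Face Trainer fill in values from its own arguments.
        "bf16": {"enabled": "auto"},
        "zero_optimization": {
            "stage": 3,  # partition optimizer state, gradients, and parameters
            "overlap_comm": True,
            "contiguous_gradients": True,
            # gather full 16-bit weights on save so the checkpoint is directly loadable
            "stage3_gather_16bit_weights_on_model_save": True,
        },
        "gradient_accumulation_steps": "auto",
        "gradient_clipping": "auto",
        "train_batch_size": "auto",
        "train_micro_batch_size_per_gpu": "auto",
    }


def write_config(path: str) -> Path:
    """Serialize the illustrative config to `path` and return the path."""
    target = Path(path)
    target.write_text(json.dumps(build_zero3_config(), indent=2))
    return target


if __name__ == "__main__":
    # Hypothetical file name; pass it to the training script via --deepspeed.
    print(write_config("zero3.example.json"))
```

The file written this way would be passed the same way as above, with the path after --deepspeed swapped for the generated one.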

@LiuRicky LiuRicky changed the title Support qwen2.5-VL in sft.py Support qwen2.5-VL in sft.py and solve GRPO deepspeed training issue Feb 20, 2025
@tzjtatata

Hi, thank you for debugging. Can you specify which transformers commit you are using? For me, the current main is at 92c5ca9dd70de3ade2af2eb835c96215cc50e815. Is that the same as your version?

@tzjtatata

And I found that the newest version of transformers("92c5ca") has bugs when using Qwen2.5-VL.

@LiuRicky
Contributor Author

> And I found that the newest version of transformers("92c5ca") has bugs when using Qwen2.5-VL.

I guess it is the version from 5 days ago, maybe 8ee50537fe7613b87881cd043a85971c85e99519 or e3d99ec2f58e0e2a4df6b2b41152fdfb3f92a52f.
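To compare versions precisely, it may help to pin transformers to one of these commits and check what is actually installed. The sketch below is only an illustration: the helper names `pip_spec_for_commit` and `installed_commit` are hypothetical, and the commit lookup assumes pip recorded a `direct_url.json` file for the install (which it does for installs from a Git URL, but not for releases from PyPI).

```python
import json
from importlib import metadata
from typing import Optional

TRANSFORMERS_REPO = "https://github.com/huggingface/transformers"


def pip_spec_for_commit(sha: str) -> str:
    """Build a pip requirement that installs transformers at a given commit."""
    return f"git+{TRANSFORMERS_REPO}@{sha}"


def installed_commit(package: str = "transformers") -> Optional[str]:
    """Return the Git commit pip recorded for `package`, or None if unknown."""
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return None  # package is not installed at all
    raw = dist.read_text("direct_url.json")
    if raw is None:
        return None  # installed from an index, so no commit was recorded
    return json.loads(raw).get("vcs_info", {}).get("commit_id")


if __name__ == "__main__":
    sha = "8ee50537fe7613b87881cd043a85971c85e99519"  # one of the candidates above
    print("pip install " + pip_spec_for_commit(sha))
    print("installed commit:", installed_commit())
```

Running the printed pip command in the training environment, then calling `installed_commit()` again, gives a way to confirm both sides are testing the same revision.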

@tzjtatata

tzjtatata commented Feb 23, 2025 via email

2 participants