Hi @Muennighoff
I am wondering if I can use the s1 data to finetune a llama-70B model. Has your team tried a 70B model before?
We haven't tried 70B, but I'm sure it should work! Btw, I recommend using our latest data, 1.1: https://hf.co/datasets/simplescaling/s1K-1.1
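For reference, a minimal sketch of pulling s1K-1.1 and flattening it into training strings. The chat template and the column names (`question`, `deepseek_thinking_trajectory`, `deepseek_attempt`) are assumptions here; check the dataset card and your model's tokenizer for the actual schema and template:

```python
def format_example(question: str, thinking: str, attempt: str) -> str:
    # Concatenate question, reasoning trace, and final answer into one
    # training string. This template is a placeholder, not the exact
    # one used by the s1 authors.
    return f"<|user|>{question}<|assistant|>{thinking}\n{attempt}"

if __name__ == "__main__":
    # Requires network access to the Hugging Face Hub and the
    # `datasets` package.
    from datasets import load_dataset

    ds = load_dataset("simplescaling/s1K-1.1", split="train")
    texts = [
        format_example(
            r["question"],
            r["deepseek_thinking_trajectory"],
            r["deepseek_attempt"],
        )
        for r in ds
    ]
    print(len(texts))
```

From there the list of strings can be tokenized and fed to whatever SFT trainer you use.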
Hi @Muennighoff, if I want to finetune the 70B, are 16 H100s enough if I keep `max_seq_len` at 32768?
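As a rough back-of-envelope check (not a definitive answer): full finetuning with Adam in mixed precision needs roughly 16 bytes per parameter for weights, gradients, and optimizer state, before counting activations. A sketch of the arithmetic, assuming bf16 weights/gradients, fp32 Adam state, and full sharding (FSDP/ZeRO-3) across 16 GPUs:

```python
# Rough memory estimate for full finetuning of a 70B-parameter model.
# Activation memory is ignored, which is a large omission at
# max_seq_len = 32768, so treat this as a lower bound.
params = 70e9
bytes_weights = params * 2          # bf16 parameters
bytes_grads = params * 2            # bf16 gradients
bytes_optim = params * (4 + 4 + 4)  # fp32 master copy + Adam m and v
total_gb = (bytes_weights + bytes_grads + bytes_optim) / 1e9
per_gpu_gb = total_gb / 16          # fully sharded across 16 H100s
print(round(total_gb), round(per_gpu_gb))  # ~1120 GB total, ~70 GB per GPU
```

That is about 70 GB per 80 GB H100 before any activations, so a 32k context would only fit with gradient checkpointing (and possibly sequence parallelism or a shorter packed length); 16 H100s look borderline rather than comfortable.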