The GRPO support in Axolotl is based on trl (afaik), but it would still be great to have a separate example, because Axolotl is a popular training tool that integrates many different techniques (and makes them easy to use).
Axolotl v0.7.0 has been released with support for GRPO. See tweet.
Wing shared this GSM-8K example: https://github.com/axolotl-ai-cloud/axolotl-cookbook/blob/main/grpo/gsm8k_grpo.py