Add example of RL training with Axolotl #162

andreaskoepf · 2025-02-19T05:51:21Z

Axolotl v0.7.0 has been released with support for GRPO. See tweet.

Wing shared this GSM-8K example: https://github.com/axolotl-ai-cloud/axolotl-cookbook/blob/main/grpo/gsm8k_grpo.py

The GRPO support in Axolotl is based on trl (afaik), but it would nevertheless be great to have a separate example, because Axolotl is a popular training tool that integrates many different techniques and makes them easy to use.
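For illustration, here is a rough sketch of the kind of reward function such an example would need. It is not taken from the linked cookbook script. It assumes trl-style reward functions that receive a list of `completions` plus dataset columns as keyword arguments and return one float per completion, that completions are plain strings, and that the model is prompted to finish with the GSM8K `#### <number>` marker. The names `extract_gsm8k_answer` and `correctness_reward` are hypothetical.

```python
import re
from typing import List, Optional


def extract_gsm8k_answer(text: str) -> Optional[str]:
    """Return the number following '####', with thousands separators removed.

    GSM8K reference solutions end with '#### <number>'. Assumption: the model
    is prompted to use the same marker for its final answer.
    """
    match = re.search(r"####\s*(-?[\d,]*\.?\d+)", text)
    if match is None:
        return None
    return match.group(1).replace(",", "")


def correctness_reward(completions: List[str], answer: List[str], **kwargs) -> List[float]:
    """Hypothetical reward: 1.0 if the extracted answer matches the reference, else 0.0.

    Assumption: `answer` is the GSM8K dataset column holding the reference
    solution, passed in alongside the completions as a keyword argument.
    """
    rewards = []
    for completion, reference in zip(completions, answer):
        predicted = extract_gsm8k_answer(completion)
        expected = extract_gsm8k_answer(reference)
        is_correct = predicted is not None and predicted == expected
        rewards.append(1.0 if is_correct else 0.0)
    return rewards
```

A function like this would be referenced from the training config. How exactly Axolotl wires it up should be checked against the cookbook example linked above.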

andreaskoepf closed this as completed Feb 22, 2025

andreaskoepf reopened this Feb 22, 2025
