Add run_orpo.py
#143
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Done similarly to the original implementation, in order to better reproduce their results
The docs no longer live here :(

Yes, I believe that's expected! Anyway, thanks for mentioning it — I'll submit a PR to add the documentation for ORPO, since it's missing now 👍🏻
Description

This PR adds the `run_orpo.py` Python script to fine-tune LLMs with the "to be released" `trl.ORPOTrainer`.

Besides that, some changes have been applied to the dataset formatting, to also support DPO/ORPO datasets formatted as `prompt-chosen-rejected`, and to add `orpo` as a task in `apply_chat_template`.

Additionally, this PR adds prompt filtering based on length, if provided among the `model_args`, similarly to what's done in the official ORPO codebase, for consistency when replicating their experiments.

Experiments
A raw version of the script has been run, but more tests are needed. If there's an interesting use case, I'm happy to collaborate on the release of `run_orpo.py`, as recently done for both Zephyr Gemma #129 and StarChat 2 #135 🤗

Mistral-7B-v0.1 fine-tuned with `argilla/distilabel-capybara-dpo-7k-binarized` as in https://huggingface.co/kaist-ai/mistral-orpo-capybara-7k
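For reference, the `prompt-chosen-rejected` record layout and the length-based prompt filtering described in this PR can be sketched as follows. The field names follow the DPO convention, but the character-based length check and the helper name are illustrative stand-ins — the actual script would measure prompt length in tokens via the tokenizer:

```python
# Sketch of a prompt-chosen-rejected dataset and length-based prompt
# filtering, in the spirit of the official ORPO codebase. NOTE: the
# character-based length proxy is an assumption for illustration; real
# filtering would count tokens after tokenization.

def filter_by_prompt_length(examples, max_prompt_length=None):
    """Drop examples whose prompt exceeds max_prompt_length (no-op if None)."""
    if max_prompt_length is None:
        return list(examples)
    return [ex for ex in examples if len(ex["prompt"]) <= max_prompt_length]

dataset = [
    {"prompt": "Hi", "chosen": "Hello there!", "rejected": "Go away."},
    {"prompt": "Write a very long story about dragons and wizards",
     "chosen": "Once upon a time...", "rejected": "No."},
]

filtered = filter_by_prompt_length(dataset, max_prompt_length=10)
```

With a real `datasets.Dataset`, the same predicate would typically be passed to `Dataset.filter` rather than a list comprehension.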