Direct Preference Optimization partial replication

aaron-sandoval/DPO

Direct Preference Optimization

This is a partial reimplementation of the Direct Preference Optimization (DPO) paper.
