Update RLHF_with__PPO.md
shaheennabi authored Nov 21, 2024
1 parent 368a9af commit f8f2b01
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions docs/RLHF_with__PPO.md
@@ -1,5 +1,8 @@
# Reinforcement Learning from Human Feedback with PPO

![Uploading Screenshot 2024-11-21 083539.png…]()


What is it, and why is it so confusing? Well, in this file, I will take you on a new adventure, and we will learn what **RLHF** with **PPO** actually means.

So, have you heard of the field of Reinforcement Learning? Oh! I guess it was the hot field around 2016, 2017, 2018... when Google DeepMind's AlphaGo made headlines. Wow, it was amazing.