Update RLHF_with__PPO.md

shaheennabi · Nov 21, 2024 · f8f2b01
1 parent 368a9af
commit f8f2b01
Showing 1 changed file with 3 additions and 0 deletions.
diff --git a/docs/RLHF_with__PPO.md b/docs/RLHF_with__PPO.md
@@ -1,5 +1,8 @@
 # Reinforcement Learning from Human Feedback with PPO
 
+![Uploading Screenshot 2024-11-21 083539.png…]()
+
+
 What is it, and why is it so confusing? Well, in this file, I will take you on a new adventure, and we will learn what **RLHF** with **PPO** actually means.
 
 So, have you heard of the field of Reinforcement Learning? Oh! I guess it was the hot field in 2016, 2017, 2018... when Google DeepMind released AlphaGo. Wow, it was amazing.