This repo tracks recent developments and progress in DeepSeek-R1 reasoning.
We strongly encourage researchers who want to promote their work related to DeepSeek-R1 to open a pull request and update their information!
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI Team
Jan 22, 2025
[Paper Link][Github Repo][HuggingFace]
Jan 24, 2025
[Github Repo]
7B Model and 8K Examples: Emerging Reasoning with Reinforcement Learning is Both Effective and Efficient
Weihao Zeng, Yuzhen Huang, Wei Liu, Keqing He, Qian Liu, Zejun Ma, Junxian He
Jan 25, 2025
[Paper Link][Github Repo]
TinyZero
Jiayi Pan
Jan 23, 2025
[Github Repo][veRL]
DeepSeek-V3 Technical Report
DeepSeek-AI Team
Dec 27, 2024
[Paper Link][Github Repo]
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Xiao Bi, Haowei Zhang, Mingchuan Zhang, Y.K. Li, Y. Wu, Daya Guo
Feb 5, 2024
[Paper Link][Github Repo]