modelscope/awesome-deep-reasoning

Awesome-deep-reasoning

A collection of awesome works built around reasoning models such as O1 and R1. The collection is also available at ModelScope-r1-collection | HuggingFace-r1-collection.

News

  • 🔥 [2025.04.23] Added the section "Advanced Reasoning for Agent", including Search-R1, ReSearch, R1-Searcher, ...
  • 🔥 [2025.03.21] Added DAPO - DAPO: An Open-Source LLM Reinforcement Learning System at Scale
  • 🔥 [2025.03.18] Added Skywork-R1V - Pioneering Multimodal Reasoning with CoT
  • 🔥 [2025.03.17] Added START - Self-taught Reasoner with Tools, from the Qwen team
  • 🔥 [2025.03.12] Added multi-modal reasoning datasets: LLaVA-R1-100k and MMMU-Reasoning-R1-Distill-Validation
  • 🔥 [2025.03.04] Added Visual-RFT - Visual Reinforcement Fine-Tuning
  • 🔥 [2025.03.01] DeepSeek has released smallpond - A lightweight data processing framework built on DuckDB and 3FS.
  • 🔥 [2025.02.28] DeepSeek has released 3FS - A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
  • 🔥 [2025.02.27] DeepSeek has released DualPipe - It achieves full overlap of forward and backward computation-communication phases and also reduces pipeline bubbles.
  • 🔥 [2025.02.27] DeepSeek has released ProfileData - Communication-computation overlap profiling strategies and low-level implementation details based on PyTorch.
  • 🔥 [2025.02.26] DeepSeek has released DeepGEMM - Clean and efficient FP8 GEMM kernels with fine-grained scaling.
  • OpenAI has released a deep research capability.
  • OpenAI has launched its latest o3 models, o3-mini and o3-mini-high, which target science, math, and coding. Both models are available in the ChatGPT app, Poe, etc.
  • NVIDIA NIM now supports the DeepSeek-R1 model.
  • Qwen has launched a powerful multi-modal MoE model, Qwen2.5-Max, which is available on the Bailian platform.
  • CodeGPT: VSCode co-pilot now supports R1.

Highlights

DeepSeek repos:

DeepSeek-R1 - DeepSeek-R1 official repository.

Qwen repos:

Qwen-QwQ - Qwen 2.5 official repository, with QwQ.

S1 from Stanford - From Fei-Fei Li's team, a distillation and test-time compute implementation reported to match the performance of O1 and R1.

Papers

2025.04

  • ReSearch - ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning
  • Search-R1 - Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
  • R1-Searcher - R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning

2025.03

2025.02

2025.01

2024

Blogs

Models

DeepSeek series:

| Model ID | ModelScope | Hugging Face |
| --- | --- | --- |
| DeepSeek R1 | Model Link | Model Link |
| DeepSeek V3 | Model Link | Model Link |
| DeepSeek-R1-Distill-Qwen-32B | Model Link | Model Link |
| DeepSeek-R1-Distill-Qwen-14B | Model Link | Model Link |
| DeepSeek-R1-Distill-Llama-8B | Model Link | Model Link |
| DeepSeek-R1-Distill-Qwen-7B | Model Link | Model Link |
| DeepSeek-R1-Distill-Qwen-1.5B | Model Link | Model Link |
| DeepSeek-R1-GGUF | Model Link | Model Link |
| DeepSeek-R1-Distill-Qwen-32B-GGUF | Model Link | Model Link |
| DeepSeek-R1-Distill-Llama-8B-GGUF | Model Link | Model Link |

Qwen series:

| Model ID | ModelScope | Hugging Face |
| --- | --- | --- |
| QwQ-32B-Preview | Model Link | Model Link |
| QVQ-72B-Preview | Model Link | Model Link |
| QwQ-32B-Preview-GGUF | Model Link | Model Link |
| QVQ-72B-Preview-bnb-4bit | Model Link | Model Link |

Others:

| Model ID | ModelScope | Hugging Face |
| --- | --- | --- |
| Qwen2-VL-2B-GRPO-8k | - | Model Link |

Infra

Datasets

Evaluation

Related Repos

Replications of DeepSeek-R1 and DeepSeek-R1-Zero

  1. HuggingFace Open R1
  2. Simple Reinforcement Learning for Reasoning
  3. oatllm
  4. TinyZero
  5. 32B-DeepSeek-R1-Zero
  6. X-R1
  7. Open-Reasoner-Zero
  8. Logic-RL - Reproduces R1-Zero on logic puzzles

Advanced Reasoning for Coding

  1. SWE-RL - Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Advanced Reasoning for Multi-Modal

  1. R1-V - Multi-modal R1
  2. Open-R1-Multimodal - A multimodal reasoning model based on OpenR1
  3. R1-Multimodal-Journey - A journey to replicate multimodal reasoning model based on Open-R1-Multimodal
  4. VLM-R1 | DEMO - A stable and generalizable R1-style Large Vision-Language Model
  5. Video-R1 - Towards Super Reasoning Ability in Video Understanding MLLMs
  6. VL-Thinking - An R1-Derived Visual Instruction Tuning Dataset for Thinkable LVLMs
  7. Open-R1-Multimodal - A fork to add multimodal model training to open-r1
  8. Visual-RFT - Visual Reinforcement Fine-Tuning
  9. Skywork-R1V
  10. R1-Omni - Explainable Omni-Multimodal Emotion Recognition with Reinforcement Learning
  11. R1-OneVision - A visual language model capable of deep CoT reasoning

Advanced Reasoning for Agent

  1. Search-R1 - An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
  2. ReSearch - ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning
  3. R1-Searcher - R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
  4. UI-TARS - Pioneering Automated GUI Interaction with Native Agents
