Distributed RL System for LLM Reasoning
MMaDA - Open-Sourced Multimodal Large Diffusion Language Models
[NeurIPS 2024 Spotlight] Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
Ling is a MoE LLM provided and open-sourced by InclusionAI.
Official code for "Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers"
[ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction
Official code repository for the CVPR 2025 paper "PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation"
[arXiv:2505.02156] Adaptive Thinking via Mode Policy Optimization for Social Language Agents
[AAAI 2025] ORQA is a new QA benchmark designed to assess the reasoning capabilities of LLMs in the specialized technical domain of Operations Research. The benchmark evaluates whether LLMs can emulate the knowledge and reasoning skills of OR experts when presented with complex optimization modeling tasks.
[ACL'2025 Findings] Official repo for "HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation Task"
[EMNLP 2024] A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners
Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs. EMNLP 2024
Official code for ACL'25 Main: "Neural Incompatibility: The Unbridgeable Gap of Cross-Scale Parametric Knowledge Transfer in Large Language Models"
[KDD 2025] Rewarding Graph Reasoning Process makes LLMs more Generalized Reasoners
Self-Improving LLMs Through Iterative Refinement
Reproduction of ICLR 2023 paper "ReAct: Synergizing Reasoning and Acting in Language Models"
This project is designed for Answer-then-Think (AoT) processing with Large Language Models (LLMs). It provides a flexible framework to orchestrate complex reasoning tasks by breaking them down into iterative steps, managing LLM interactions, and dynamically adapting based on problem complexity and resource constraints (see the sketch after this list).
Code for Investigating Trace-based Knowledge Distillation on Question-Answering
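The Answer-then-Think entry above describes orchestrating a reasoning task as iterative steps with a bounded resource budget. The sketch below illustrates that general draft/critique/refine loop in Python; the `call_llm` callable, prompt wording, and `max_rounds` budget are illustrative assumptions, not the repository's actual API.

```python
from typing import Callable

def answer_then_think(
    question: str,
    call_llm: Callable[[str], str],  # hypothetical stub wrapping any chat-completion client
    max_rounds: int = 3,
) -> str:
    """Minimal sketch of an Answer-then-Think loop: draft an answer first,
    then iteratively critique and refine it within a fixed round budget."""
    # Draft an initial answer before any explicit deliberation.
    answer = call_llm(f"Question: {question}\nGive a concise answer.")
    for _ in range(max_rounds):
        # Ask the model to check its own draft and flag problems.
        critique = call_llm(
            f"Question: {question}\nDraft answer: {answer}\n"
            "List any errors or gaps, or reply 'OK' if the answer is sound."
        )
        if critique.strip().upper().startswith("OK"):
            break  # stop early when no issues are reported (resource-aware adaptation)
        # Revise the draft using the critique as feedback.
        answer = call_llm(
            f"Question: {question}\nDraft answer: {answer}\n"
            f"Critique: {critique}\nRewrite the answer fixing these issues."
        )
    return answer
```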