Stars
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
This repository contains all necessary meta information, results and source files to reproduce the results in the publication Eric Müller-Budack, Kader Pustu-Iren, Ralph Ewerth: "Geolocation Estima…
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
Machine Learning and Computer Vision Engineer - Technical Interview Questions
Continuous Thought Machines, because thought takes time and reasoning is a process.
[COLM'25] The official implementation of "LongPerceptualThoughts: Distilling System-2 Reasoning for System-1 Perception"
The simplest, fastest repository for training/finetuning small-sized VLMs.
Witness the aha moment of VLMs for less than $3.
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
verl: Volcano Engine Reinforcement Learning for LLMs
Understanding R1-Zero-Like Training: A Critical Perspective
Embodied Reasoning Question Answer (ERQA) Benchmark
Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2023
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
Eagle: Frontier Vision-Language Models with Data-Centric Strategies
A fork to add multimodal model training to open-r1
A collection of awesome parameter-efficient fine-tuning resources.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
List of papers on Self-Correction of LLMs.
PushWorld: A benchmark for manipulation planning with tools and movable obstacles
A collection of PDDL generators, some of which have been used to generate benchmarks for the International Planning Competition (IPC).
Official release of the benchmark in paper "VSP: Diagnosing the Dual Challenges of Perception and Reasoning in Spatial Planning Tasks for MLLMs"
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A bibliography and survey of the papers surrounding o1
Efficient LLM inference on Slurm clusters using vLLM.
A high-throughput and memory-efficient inference and serving engine for LLMs