
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 1,899 119 Updated Sep 1, 2025

This repository contains all necessary meta information, results and source files to reproduce the results in the publication Eric Müller-Budack, Kader Pustu-Iren, Ralph Ewerth: "Geolocation Estima…

Python 151 37 Updated Mar 23, 2023

Simple RL training for reasoning

Python 3,742 281 Updated Aug 3, 2025

procedural reasoning datasets

Python 1,100 89 Updated Sep 15, 2025

GeoGuessr benchmark for language models

Python 37 2 Updated Aug 6, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 3,587 269 Updated Sep 13, 2025

Machine Learning and Computer Vision Engineer - Technical Interview Questions

4,058 658 Updated May 20, 2025

Continuous Thought Machines, because thought takes time and reasoning is a process.

Python 1,294 190 Updated Jul 14, 2025

[COLM'25] The official implementation of "LongPerceptualThoughts: Distilling System-2 Reasoning for System-1 Perception"

Python 7 2 Updated Aug 4, 2025

The simplest, fastest repository for training/finetuning small-sized VLMs.

Python 4,030 382 Updated Sep 10, 2025

Witness the aha moment of VLM with less than $3.

Python 3,935 292 Updated May 19, 2025

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 2,292 181 Updated Sep 6, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 13,390 2,363 Updated Sep 15, 2025

Understanding R1-Zero-Like Training: A Critical Perspective

Python 1,081 51 Updated Aug 27, 2025

Embodied Reasoning Question Answer (ERQA) Benchmark

Python 212 10 Updated Mar 12, 2025

Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2023

Jupyter Notebook 2,973 644 Updated Mar 16, 2025

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 71,255 10,246 Updated Sep 14, 2025

SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models

Python 132 1 Updated Apr 24, 2025

Eagle: Frontier Vision-Language Models with Data-Centric Strategies

Python 865 48 Updated Aug 8, 2025

A fork to add multimodal model training to open-r1

Python 1,393 69 Updated Feb 8, 2025

Collection of awesome parameter-efficient fine-tuning resources.

570 15 Updated Jul 12, 2025

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,832 376 Updated Sep 6, 2025

List of papers on Self-Correction of LLMs.

74 2 Updated Dec 28, 2024

PushWorld: A benchmark for manipulation planning with tools and movable obstacles

Python 83 15 Updated May 3, 2024

A collection of PDDL generators, some of which have been used to generate benchmarks for the International Planning Competition (IPC).

C 123 22 Updated May 21, 2025

Official release of the benchmark in paper "VSP: Diagnosing the Dual Challenges of Perception and Reasoning in Spatial Planning Tasks for MLLMs"

Python 13 1 Updated Aug 1, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 58,317 7,169 Updated Sep 15, 2025

A bibliography and survey of the papers surrounding o1

TeX 1,206 50 Updated Nov 16, 2024

Efficient LLM inference on Slurm clusters using vLLM.

Python 77 11 Updated Sep 15, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 58,112 10,139 Updated Sep 15, 2025