- Tsinghua University (Graduated)
- California, USA
- https://yushengsu-thu.github.io/
- @thu_yushengsu
Stars
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
yushengsu-thu / verl
Forked from volcengine/verl
verl: Volcano Engine Reinforcement Learning for LLMs
Fully open reproduction of DeepSeek-R1
Sky-T1: Train your own O1 preview model within $450
Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision".
An Open Large Reasoning Model for Real-World Solutions
800,000 step-level correctness labels on LLM solutions to MATH problems
Using Groq or OpenAI or Ollama to create o1-like reasoning chains
Let your Claude think
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using Llama mode…
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"
SGLang is a fast serving framework for large language models and vision language models.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
ROCm / Megatron-LM
Forked from NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
yushengsu-thu / Liger-Kernel
Forked from linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Efficient Triton Kernels for LLM Training
Development repository for the Triton language and compiler
A Data Streaming Library for Efficient Neural Network Training
🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton
ROCm / flash-attention
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
A simple pip-installable Python tool to generate your own HTML citation world map from your Google Scholar ID.
Code for the curation of The Stack v2 and StarCoder2 training data