Highlights

  • Pro

Organizations

@thunlp @ROCm


Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

6,775 203 Updated Mar 4, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 1 Updated Mar 5, 2025

Fully open reproduction of DeepSeek-R1

Python 22,920 2,077 Updated Mar 16, 2025

Sky-T1: Train your own O1 preview model within $450

Python 3,136 315 Updated Mar 12, 2025

Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision".

Python 51 Updated Nov 29, 2024

An Open Large Reasoning Model for Real-World Solutions

Python 1,473 78 Updated Mar 4, 2025

Analyzing and scoring reasoning traces of LLMs

Python 45 Updated Sep 1, 2024

800,000 step-level correctness labels on LLM solutions to MATH problems

Python 1,950 114 Updated Jun 1, 2023

Using Groq or OpenAI or Ollama to create o1-like reasoning chains

Python 298 46 Updated Sep 17, 2024

Let your Claude be able to think

TypeScript 14,735 1,713 Updated Mar 10, 2025

Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using Llama mode…

Jupyter Notebook 16,471 2,380 Updated Mar 17, 2025

g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains

Python 4,195 380 Updated Jan 27, 2025

Large Reasoning Models

Python 799 45 Updated Dec 3, 2024

Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"

Python 163 5 Updated Mar 6, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 12,046 1,275 Updated Mar 18, 2025

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,576 364 Updated Mar 14, 2025

O1 Replication Journey

1,973 65 Updated Jan 14, 2025
Python 1,347 52 Updated Nov 21, 2024

Ongoing research training transformer models at scale

Python 17 17 Updated Mar 17, 2025

Efficient Triton Kernels for LLM Training

Python 1 Updated Sep 2, 2024

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Python 2,301 171 Updated Mar 4, 2025

Efficient Triton Kernels for LLM Training

Python 4,667 282 Updated Mar 17, 2025

Development repository for the Triton language and compiler

MLIR 14,902 1,866 Updated Mar 18, 2025

A Data Streaming Library for Efficient Neural Network Training

Python 1,255 157 Updated Mar 5, 2025

🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton

Python 2,111 133 Updated Mar 17, 2025

Fast and memory-efficient exact attention

Python 161 51 Updated Mar 17, 2025

A simple pip-installable Python tool to generate your own HTML citation world map from your Google Scholar ID.

Python 506 42 Updated Feb 25, 2025

Code for the curation of The Stack v2 and StarCoder2 training data

Jupyter Notebook 96 7 Updated Apr 11, 2024

LLM101n: Let's build a Storyteller

32,725 1,790 Updated Aug 1, 2024