- Tsinghua University (Graduated)
- California, USA
- https://yushengsu-thu.github.io/
- @thu_yushengsu
Stars
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
yushengsu-thu / verl
Forked from volcengine/verl
verl: Volcano Engine Reinforcement Learning for LLMs
Fully open reproduction of DeepSeek-R1
Sky-T1: Train your own O1 preview model within $450
Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision".
An Open Large Reasoning Model for Real-World Solutions
800,000 step-level correctness labels on LLM solutions to MATH problems
Using Groq or OpenAI or Ollama to create o1-like reasoning chains
Let your Claude think
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using Llama mode…
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"
SGLang is a fast serving framework for large language models and vision language models.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
ROCm / Megatron-LM
Forked from NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
yushengsu-thu / Liger-Kernel
Forked from linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Efficient Triton Kernels for LLM Training
Development repository for the Triton language and compiler
A Data Streaming Library for Efficient Neural Network Training
🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton
ROCm / flash-attention
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
A simple pip-installable Python tool to generate your own HTML citation world map from your Google Scholar ID.
Code for the curation of The Stack v2 and StarCoder2 training data