
Democratizing Reinforcement Learning for LLMs

Python 1,374 107 Updated Feb 13, 2025

🐵 Preswald is a full-stack platform for building, deploying, and managing interactive data applications. It brings ingestion, storage, transformation, and visualization into a simple SDK, minimizin…

Python 1,255 47 Updated Feb 15, 2025

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 10,049 1,297 Updated Feb 1, 2025

Ongoing research training transformer models at scale

Python 11,371 2,549 Updated Feb 15, 2025
Python 45 2 Updated Feb 11, 2025

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Python 1,617 127 Updated Jan 17, 2025

Ultra | Ultimate | Unified CCL

C++ 30 2 Updated Feb 14, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 9,626 915 Updated Feb 15, 2025

veRL: Volcano Engine Reinforcement Learning for LLM

Python 3,150 267 Updated Feb 15, 2025

Scalable RL solution for advanced reasoning of language models

Python 1,229 78 Updated Feb 4, 2025

Large Reasoning Models

Python 802 44 Updated Dec 3, 2024

Code for the paper 🌳 Tree Search for Language Model Agents

Python 175 20 Updated Jul 25, 2024

[ICML 2024] Official repository for "Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models"

Python 729 74 Updated Jul 30, 2024

Official repo for paper DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning.

Python 303 23 Updated Nov 26, 2024

Google Research

Jupyter Notebook 34,900 8,012 Updated Feb 11, 2025

LLM verified with Monte Carlo Tree Search

Jupyter Notebook 263 27 Updated Feb 7, 2025

Optimizing inference proxy for LLMs

Python 2,027 158 Updated Feb 14, 2025

Build resilient language agents as graphs.

Python 9,012 1,480 Updated Feb 15, 2025

A throughput-oriented high-performance serving framework for LLMs

Cuda 733 29 Updated Sep 21, 2024

Automating enterprise workflows with multimodal agents

Jupyter Notebook 99 14 Updated Oct 9, 2024

Official Implementation of "Graph of Thoughts: Solving Elaborate Problems with Large Language Models"

Python 2,279 166 Updated Dec 11, 2024

Large Context Attention

Python 681 53 Updated Jan 24, 2025
Python 112 9 Updated Aug 13, 2024

Visual Studio Code

TypeScript 167,350 30,533 Updated Feb 15, 2025

VS Code in the browser

TypeScript 69,795 5,767 Updated Feb 14, 2025

[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable

Python 143 8 Updated Sep 21, 2024

TextGrad: Automatic ''Differentiation'' via Text -- using large language models to backpropagate textual gradients.

Python 2,076 176 Updated Jan 28, 2025
Python 43 5 Updated Jun 27, 2024