Nanyang Technological University, Singapore
Stars
SGLang is a fast serving framework for large language models and vision language models.
Memory-Guided Diffusion for Expressive Talking Video Generation
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
Recipes to train reward model for RLHF.
A series of math-specific large language models of our Qwen2 series.
Python wrapper and simple addons for sioyek PDF viewer
Reasoning in LLMs: Papers and Resources, including Chain-of-Thought, OpenAI o1, and DeepSeek-R1 🍓
800,000 step-level correctness labels on LLM solutions to MATH problems
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models
Really Fast End-to-End Jax RL Implementations
Assetto Corsa OpenAI Gym Environment
Official release for the code used in paper: Learning from Active Human Involvement through Proxy Value Propagation (NeurIPS 2023 Spotlight)
Preference Transformer: Modeling Human Preferences using Transformers for RL (ICLR 2023 Accepted)
Implementation of Robust Imitation Learning against Variations in Environment Dynamics
This list of writing prompts covers a range of topics and tasks, including brainstorming research ideas, improving language and style, conducting literature reviews, and developing research plans.
The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curatio…
Code for Adapting Environment Sudden Changes by Learning Context Sensitive Policy
[ICLR 2025] A trinity of environments, tools, and benchmarks for general virtual agents
[NeurIPS 2023] The official code for paper "State Regularized Policy Optimization on Data with Dynamics Shift"
OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24]
RL starter files in order to immediately train, visualize and evaluate an agent without writing any line of code
Based on the learnware paradigm, the learnware package supports the entire process including the submission, usability testing, organization, identification, deployment, and reuse of learnwares. Si…
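✯ A directly accessible TV/radio logo library and related tools project ✯ 🔕 Permanently free, directly accessible, fully open source, continuously improved station logos, with IPv4/IPv6 dual-stack access 🔕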
[IJCAI'24] An index of algorithms, approaches, and systems on cross-domain policy transfer for embodied agents
A natural language interface for computers