sh-jj

Stars

Large Model

68 repositories

CivRealm is an interactive environment for the open-source strategy game Freeciv-web based on Freeciv, a Civilization-inspired game.

Python 106 8 Updated Sep 11, 2024

Implementation of "Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents"

Python 270 22 Updated Aug 3, 2023

Code for the paper Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance, accepted to CoRL 2023 as an Oral Presentation.

Python 27 2 Updated Aug 13, 2024

[ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use

Python 130 14 Updated Mar 22, 2024
Python 29 1 Updated Jun 19, 2024

Official code for VisProg (CVPR 2023 Best Paper!)

Python 709 66 Updated Aug 26, 2024

Code and Data for Tau-Bench

Python 325 47 Updated Jan 22, 2025

[NeurIPS 2024 D&B Track] GTA: A Benchmark for General Tool Agents

Python 75 6 Updated Feb 13, 2025

[ICML'24 Spotlight] "TravelPlanner: A Benchmark for Real-World Planning with Language Agents"

Python 323 45 Updated Dec 11, 2024

Code and example data for the paper: Rule Based Rewards for Language Model Safety

Jupyter Notebook 180 16 Updated Jul 19, 2024

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬

Jupyter Notebook 9,178 1,350 Updated Feb 7, 2025

Code release for paper "Autonomous Improvement of Instruction Following Skills via Foundation Models" | CoRL 2024

Python 68 5 Updated Jan 9, 2025

Boosting the AI research efficiency

Python 151 22 Updated Sep 24, 2024

Code/data for MARG (multi-agent review generation)

Python 40 4 Updated Nov 14, 2024
Python 194 10 Updated Nov 22, 2024

[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models

Python 5,136 492 Updated Jan 16, 2025

Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

Python 6,289 763 Updated Feb 27, 2025
Python 2,763 310 Updated Mar 6, 2025

Fast & Simple repository for pre-training and fine-tuning T5-style models

Python 997 76 Updated Aug 21, 2024

Fine-tune a T5 transformer model using PyTorch & Transformers 🤗

Jupyter Notebook 209 34 Updated Feb 10, 2021

Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models

Python 2,986 599 Updated Jul 19, 2024

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Python 4,595 480 Updated Jan 8, 2024

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 43,639 5,339 Updated Mar 10, 2025

A curated list of reinforcement learning with human feedback resources (continually updated)

3,788 231 Updated Feb 19, 2025

Train transformer language models with reinforcement learning.

Python 12,381 1,669 Updated Mar 7, 2025

[NeurIPS'24] Grammar-Aligned Decoding: An algorithm to constrain LLMs' outputs without distorting their original distribution

Python 14 4 Updated Feb 10, 2025

Secrets of RLHF in Large Language Models Part I: PPO

Python 1,327 97 Updated Mar 3, 2024

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 5,470 538 Updated Mar 10, 2025

ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)

Python 587 44 Updated Jan 20, 2025
Python 905 105 Updated Jan 23, 2025