Large Model
CivRealm is an interactive environment for the open-source strategy game Freeciv-web, which is based on Freeciv, a Civilization-inspired game.
Implementation of "Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents"
Code for the paper Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance, accepted to CoRL 2023 as an Oral Presentation.
[ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use
Official code for VisProg (CVPR 2023 Best Paper!)
[NeurIPS 2024 D&B Track] GTA: A Benchmark for General Tool Agents
[ICML'24 Spotlight] "TravelPlanner: A Benchmark for Real-World Planning with Language Agents"
Code and example data for the paper: Rule Based Rewards for Language Model Safety
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
Code release for paper "Autonomous Improvement of Instruction Following Skills via Foundation Models" | CoRL 2024
Code/data for MARG (multi-agent review generation)
[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
Fast & Simple repository for pre-training and fine-tuning T5-style models
Fine-tune a T5 transformer model using PyTorch & 🤗 Transformers
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
A repo for distributed training of language models with Reinforcement Learning from Human Feedback (RLHF)
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A curated list of reinforcement learning with human feedback resources (continually updated)
Train transformer language models with reinforcement learning.
[NeurIPS'24] Grammar-Aligned Decoding: An algorithm to constrain LLMs' outputs without distorting their original distribution
Secrets of RLHF in Large Language Models Part I: PPO
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)