-
oat Public
Forked from sail-sg/oat
🌾 OAT: Online AlignmenT for LLMs
Python Apache License 2.0 Updated Jan 29, 2025
-
IQ-Learn Public
Forked from Div99/IQ-Learn
(NeurIPS '21 Spotlight) IQ-Learn: Inverse Q-Learning for Imitation
-
alpaca_eval Public
Forked from tatsu-lab/alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Jupyter Notebook Apache License 2.0 Updated Oct 13, 2024
-
SimPO Public
Forked from princeton-nlp/SimPO
SimPO: Simple Preference Optimization with a Reference-Free Reward
Python Updated May 29, 2024
-
flow-iar Public
A PyTorch implementation of the flow policy with invalid action rejection for large discrete (categorical) action spaces with constraints.
-
mgm Public
A PyTorch implementation of Multiscale Generative Models.
-
imitation Public
Forked from HumanCompatibleAI/imitation
Clean PyTorch implementations of imitation and reward learning algorithms
Python MIT License Updated Oct 26, 2021