Update README.md
GaryYufei authored Aug 4, 2023
1 parent e99ce2f commit e5d5b31
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions README.md
@@ -157,6 +157,9 @@ We hope this repository can help researchers and practitioners to get a better u
### Online Human Alignment
- Training language models to follow instructions with human feedback [[Paper]](https://openreview.net/forum?id=TG8KACxEON)
- RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment [[Paper]](https://arxiv.org/abs/2304.06767)
- Constitutional AI: Harmlessness from AI Feedback [[Paper]](https://arxiv.org/abs/2212.08073)
- RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment [[Paper]](https://arxiv.org/abs/2307.12950)

### Offline Human Alignment
#### Rank-based Training
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model [[Paper]](https://arxiv.org/abs/2305.18290)
