RL(HF) and LLM Reasoning Summary Notes

Welcome to my collection of study notes on Reinforcement Learning from Human Feedback (RLHF) and Large Language Model reasoning. This repository is an evolving resource containing literature references and personal summaries of the topics I'm currently exploring.

My notes draw on public information from arXiv, and all opinions are my own.

Overview

Purpose: This repository shares my ongoing learning process and provides an indexed collection of research articles and conceptual overviews on RL(HF) and LLM reasoning.
Scope: Notes include brief summaries, personal annotations, and relevant citations.

Repository Structure

.
├── Notes
│   ├── Intro_RLHF.md
│   ├── Reward_Hacking.md
│   ├── R1_reasoning.md
│   └── ...
└── README.md

Index of Notes

Below is an index of the Markdown files currently available in the Notes folder, along with brief descriptions.

Intro_RLHF.md
A brief, partial summary of RLHF algorithms as of the end of 2024. It lists the papers and useful blog posts covered in my reading group presentation on RLHF algorithms. Please find the slides here.
Reward_Hacking.md
R1_reasoning.md

Getting Involved

Suggestions & Feedback: If you have ideas on improving the notes, please open an issue or share a pull request.

All kinds of contributions, whether small fixes or new references, are most welcome :) Thanks!

yihedeng9/rlhf-summary-notes
