Implements pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF) to train and align the LLaMA2 model to follow human instructions, similar to InstructGPT or ChatGPT, but on a much smaller scale.
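The RLHF stage in InstructGPT-style pipelines typically starts by training a reward model on pairs of responses ranked by humans, using a pairwise (Bradley-Terry-style) loss. A minimal pure-Python sketch of that loss follows; the function name and numbers are illustrative, not taken from this repository's code:

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise reward-model loss: -log(sigmoid(r_chosen - r_rejected)).

    The loss is minimized when the reward model assigns a higher
    score to the human-preferred response than to the rejected one.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(margin)) rewritten stably as log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))

# The loss shrinks as the chosen response's reward pulls ahead:
# preference_loss(2.0, 0.0) is smaller than preference_loss(0.5, 0.0)
```

Once the reward model is trained, a policy-optimization step (e.g. PPO) fine-tunes the language model to maximize this learned reward.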
This repository is a comprehensive, educational project dedicated to building a Large Language Model (LLM) from the ground up. It serves as the official code repository for the book Build a Large Language Model (From Scratch), guiding developers step by step through developing, pretraining, fine-tuning, and aligning a GPT-like LLM using PyTorch.
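Pretraining a GPT-like model reduces to next-token prediction: at each position the model emits logits over the vocabulary, and training minimizes the cross-entropy against the token that actually comes next. A minimal pure-Python sketch of that objective (the names and toy numbers here are illustrative, not drawn from the book's code):

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_loss(logits_per_position, target_ids):
    """Average cross-entropy between the predicted distribution at
    each position and the actual next token -- the standard
    language-model pretraining objective."""
    losses = []
    for logits, target in zip(logits_per_position, target_ids):
        probs = softmax(logits)
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)

# Toy example: vocabulary of 3 tokens, sequence of two positions.
logits = [[2.0, 0.5, 0.1],   # model favors token 0 here
          [0.2, 0.2, 3.0]]   # model favors token 2 here
targets = [0, 2]             # the actual next tokens
loss = next_token_loss(logits, targets)
```

In practice this is computed in batches on tensors (e.g. with PyTorch's built-in cross-entropy loss), but the quantity being minimized is the same.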