🚀 An open-source, hands-on curriculum bridging the gap from basic RL concepts to LLM alignment, RLVR, and advanced Agentic systems.
Updated May 11, 2026 · Python
Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples
Learning When to Answer: Behavior-Oriented Reinforcement Learning for Hallucination Mitigation
CS336 Assignment 5: LLM alignment and reasoning RL with the Qwen2.5 model. Full implementations of supervised fine-tuning (SFT) and Group Relative Policy Optimization (GRPO), with zero-shot, on-policy, and off-policy training and evaluation comparisons on the GSM8K dataset.
Official implementation of "DZ-TiDPO: Non-Destructive Temporal Alignment for Mutable State Tracking". SOTA on Multi-Session Chat with negligible alignment tax.
C3AI: Crafting and Evaluating Constitutions for CAI
A Kullback–Leibler divergence optimizer based on the NeurIPS 2025 paper "LLM Safety Alignment is Divergence Estimation in Disguise".
Teacher-guided prompt-shape discovery for auditable moral attention in frozen weak classifiers.
A training-time alignment framework that integrates safety constraints directly into the RLHF loop, achieving full safety convergence in 7 epochs.
Pipeline to investigate structured reasoning and instruction adherence in Vision-Language Models
This project implements a minimal Reinforcement Learning from Human Feedback (RLHF) pipeline using PyTorch.
🧠 Minimal, hackable Group Relative Policy Optimization (GRPO) for LLM alignment — the algorithm behind DeepSeek-R1. Train reasoning models on a single GPU.
Archive of hands-on exercises for building and evaluating LLM post-training (SFT, RLVR, RLHF) pipelines.
Training dialectical reasoning in complex socio-political contexts.
SIGIR 2025 "Mitigating Source Bias with LLM Alignment"
Research on pragmatic alignment of LLMs and the LANKAMAR agent framework. DOI: 10.5281/zenodo.18904437
🏟️ Modern RL algorithms from scratch — from Q-Learning to GRPO — with clean PyTorch code and interactive notebooks. Compare PPO vs DPO vs GRPO for LLM alignment.
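Several entries above center on GRPO. As a rough orientation for readers browsing this topic, here is a minimal sketch of the group-relative advantage idea at GRPO's core (an illustrative assumption about the common formulation, not any listed repo's exact code): sample several completions per prompt, then normalize each completion's reward against its group's mean and standard deviation.

```python
# Sketch of GRPO-style group-relative advantages (illustrative only).
import statistics

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize one prompt's per-completion rewards to zero mean, unit std.

    Each completion's advantage is how far its reward sits above or
    below the group average, in units of the group's spread.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Four completions for one prompt, scored by a verifier (1 = correct):
advs = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
# Correct completions get positive advantages, incorrect ones negative.
```

Because the baseline is the group mean rather than a learned value function, no critic network is needed, which is why single-GPU GRPO trainers like those above are feasible.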
A lightweight framework for training and benchmarking custom optimizers in the RLHF pipeline (RM/PPO) using GPT-2 Small and LoRA on consumer GPUs.
Fall 2025 LINGUIS R1B research essay and NLP Python scripts by Shiyi (Yvette) Chen, UC Berkeley.