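A full-stack reference guide to large language models that uses minimal code to walk through every detail end to end, from training a model from scratch to production deployment.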
Updated Nov 9, 2025
[ICLR'25] MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions