PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.

Python 1,171 190 Updated Feb 9, 2021

zhoubolei / introRL

Intro to Reinforcement Learning (强化学习纲要)

3,323 495 Updated Jul 25, 2020

mihirp1998 / VADER

Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various r…

Python 240 15 Updated Aug 19, 2024

vwxyzjn / cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Python 6,493 715 Updated Mar 4, 2025

Wenqiang Sun wenqsun

🤔Reinforcement learning

katerakelly / oyster

Trinkle23897 / tianshou

ray-project / ray

lucidrains / PaLM-rlhf-pytorch

zhangchuheng123 / Reinforcement-Implementation

toshikwa / sac-discrete.pytorch

NeuronDance / DeepRL

opendilab / awesome-diffusion-model-in-rl

jannerm / diffuser

Khrylx / PyTorch-RL

zhoubolei / introRL

mihirp1998 / VADER

vwxyzjn / cleanrl