Code accompanying the paper "Score Regularized Policy Optimization through Diffusion Behavior" (ICLR 2024).
reinforcement-learning offline rl generative diffusion score-based-models d4rl srpo behavior-regularization
Updated Feb 10, 2024 - Python