
offline-MBPO

This repository contains a version of the model-based RL algorithm MBPO, modified to run in offline RL settings.
Paper: When to Trust Your Model: Model-Based Policy Optimization
With many thanks, this code is based on Xingyu-Lin's easy-to-read PyTorch implementation of MBPO
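
MBPO's core mechanism is to branch short model rollouts from real states; in this offline variant the start states presumably come from the D4RL dataset rather than from environment interaction. Below is a minimal sketch of that rollout step, not this repository's actual API: branched_rollouts, model, and policy are illustrative names, and the toy model and policy exist only so the example runs.

    # Minimal sketch of MBPO-style branched model rollouts in the offline
    # setting. Assumes model(state, action) -> (next_state, reward) is a
    # learned dynamics model and policy(state) -> action is the SAC actor.
    import numpy as np

    def branched_rollouts(model, policy, dataset_states, rollout_length=5):
        """Run short imagined rollouts starting from offline-dataset states."""
        synthetic = []
        states = dataset_states  # (batch, state_dim), sampled from D4RL data
        for _ in range(rollout_length):
            actions = policy(states)
            next_states, rewards = model(states, actions)
            synthetic.extend(zip(states, actions, rewards, next_states))
            states = next_states
        return synthetic  # imagined transitions to mix into the policy buffer

    # Toy stand-ins so the sketch runs end to end (illustrative only).
    rng = np.random.default_rng(0)
    toy_model = lambda s, a: (s + 0.1 * a, -np.linalg.norm(a, axis=-1))
    toy_policy = lambda s: rng.uniform(-1.0, 1.0, size=s.shape)
    starts = rng.normal(size=(32, 11))  # hopper observations are 11-dim
    print(len(branched_rollouts(toy_model, toy_policy, starts)))  # 32 * 5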

Requirements

See requirements.txt
The code depends on D4RL's environments and datasets
Only the hopper, walker, halfcheetah, and ant environments are supported right now (to evaluate in other environments, modify the termination function in predict_env.py; see the sketch below)
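
For orientation, here is a hedged sketch of both pieces: loading a D4RL dataset with d4rl.qlearning_dataset, and a hopper-style termination function of the kind predict_env.py needs for each environment. The function below mirrors the hopper condition commonly used in MBPO implementations; the exact signature and thresholds in this repository may differ.

    # Sketch only: the repository's predict_env.py may organize this differently.
    import gym
    import d4rl  # importing d4rl registers the offline environments
    import numpy as np

    env = gym.make('hopper-medium-v0')
    dataset = d4rl.qlearning_dataset(env)  # observations, actions, rewards, ...

    def hopper_termination_fn(obs, act, next_obs):
        """Batched done flags: hopper terminates once it falls or tilts too far."""
        height = next_obs[:, 0]
        angle = next_obs[:, 1]
        not_done = (np.isfinite(next_obs).all(axis=-1)
                    & (np.abs(next_obs[:, 1:]) < 100).all(axis=-1)
                    & (height > 0.7)
                    & (np.abs(angle) < 0.2))
        return ~not_done  # shape (batch,), True where the episode ends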

Usage

Simply run

    python main_mbpo.py --env_name=halfcheetah-medium-v0 --seed=1234

Or modify the script runalgo.sh, then

    bash runalgo.sh
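
A typical sweep script loops over datasets and seeds. The following is a hypothetical example of what runalgo.sh might contain, not its actual contents:

    #!/bin/bash
    # Hypothetical runalgo.sh: sweep a few D4RL datasets and seeds.
    for env in halfcheetah-medium-v0 hopper-medium-v0 walker2d-medium-v0; do
        for seed in 1234 2345 3456; do
            python main_mbpo.py --env_name=$env --seed=$seed
        done
    done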

Results
