This is a PyTorch implementation of our paper "Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning" (ICLR 2022).
Install the rlkit package with:

```
cd d4rl
pip install -e .
```
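To sanity-check the installation, you can load one of the offline datasets through the standard D4RL interface (a minimal sketch; the environment id here is just an example):

```python
import gym
import d4rl  # registers the D4RL environments with gym

# Load an offline dataset to verify the install (example environment id).
env = gym.make('walker2d-medium-v2')
dataset = env.get_dataset()
print(dataset['observations'].shape, dataset['actions'].shape)
```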
To run PBRL on the MuJoCo environments:

```
python examples/pevi_mujoco.py --env walker2d-medium-v2 --gpu 0
```

To run PBRL-Prior on the MuJoCo environments:

```
python examples/pevi_mujoco.py --env walker2d-medium-v2 --prior --gpu 0
```

To run PBRL on the Adroit environments:

```
python examples/pevi_adroit.py --env pen-cloned-v0 --gpu 0
```

To run PBRL-Prior on the Adroit environments:

```
python examples/pevi_adroit.py --env pen-cloned-v0 --prior --gpu 0
```
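Results are typically averaged over several seeds; if the example scripts expose a seed argument (an assumption, please check the argparse setup in examples/pevi_mujoco.py), a sweep could look like:

```python
import subprocess

# Hypothetical seed sweep; assumes examples/pevi_mujoco.py accepts a --seed flag.
for seed in [0, 1, 2]:
    subprocess.run([
        'python', 'examples/pevi_mujoco.py',
        '--env', 'walker2d-medium-v2',
        '--gpu', '0',
        '--seed', str(seed),  # assumed flag; adjust to the script's actual interface
    ], check=True)
```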
The core implementation is given in `d4rl/rlkit/torch/sac/pevi.py`.
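At a high level, PBRL trains an ensemble of bootstrapped Q-functions and penalizes the Bellman target by the ensemble's disagreement, so that poorly covered state-action pairs receive pessimistic value estimates. The sketch below illustrates only the uncertainty-penalized target, not the full logic in pevi.py; the tensor shapes, function name, and penalty weight `beta` are assumptions for illustration:

```python
import torch

def pessimistic_target(target_qs, reward, not_done, gamma=0.99, beta=1.0):
    """Illustrative uncertainty-penalized Bellman target.

    target_qs: [ensemble_size, batch_size, 1] Q-estimates of the ensemble
    at the next state-action pair.
    """
    # Ensemble disagreement as an uncertainty proxy for the next state-action.
    uncertainty = target_qs.std(dim=0)   # [batch_size, 1]
    q_next = target_qs.mean(dim=0)       # [batch_size, 1]
    # Subtract the uncertainty penalty before bootstrapping (pessimism).
    return reward + gamma * not_done * (q_next - beta * uncertainty)
```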
The data for separate runs is stored on disk under the result directory with filename `<env-id>-<timestamp>/<seed>/`. Each run directory contains:

- `debug.log`: records the epoch, Q-value, uncertainty value, and scores.
- `progress.csv`: the same data as `debug.log`, but in CSV format.
- `variant.json`: the hyper-parameters used in training.
- `models`: the final actor-critic network.

The `evaluation/d4rl score` entry in `debug.log` or `progress.csv` records the normalized score reported in our paper.
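To inspect the learning curve programmatically, the CSV can be read directly (a minimal sketch; the run path is a placeholder, and the column name follows the `evaluation/d4rl score` entry mentioned above):

```python
import pandas as pd

# Placeholder path; substitute your own <env-id>-<timestamp>/<seed>/ directory.
df = pd.read_csv('result/walker2d-medium-v2-<timestamp>/0/progress.csv')
# Normalized D4RL score over training epochs.
print(df['evaluation/d4rl score'].tail())
```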
In case of any questions, bugs, suggestions, or improvements, please feel free to open an issue.