Deep RL Projects

Implementations of deep reinforcement learning algorithms

Models

Soft Actor Critic
DARC
GAIL (In Progress)

SAC

Soft actor-critic (SAC) is an off-policy algorithm that maximizes the expected return together with the entropy of its policy. Its objective is
$J(\theta) = E\left[\sum_t r(s_t, a_t) - \alpha \log \pi(a_t|s_t)\right]$
where the temperature $\alpha$ weights the entropy term. This pushes the policy to balance exploration and exploitation of its environment while keeping the number of hyperparameters to tune small.
The policy uses a Gaussian distribution for continuous action prediction, and the value network uses twin Q-networks to limit overestimation of the Q-values.
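
As an illustration of the two ideas above, here is a minimal sketch in plain Python of a scalar Gaussian log-probability and the entropy-regularized Bellman target built from two critics. It is not the code in this repository: the function names, the default `alpha` and `gamma` values, and the scalar (one-dimensional) action are assumptions made for the example, and the tanh squashing that SAC implementations commonly apply to actions is left out.

```python
import math
import random


def gaussian_log_prob(action, mean, std):
    """Log-density of a scalar action under N(mean, std^2)."""
    var = std * std
    return -0.5 * ((action - mean) ** 2 / var + math.log(2.0 * math.pi * var))


def sample_action(mean, std, rng=random):
    """Draw a scalar action and return it with its log-probability."""
    action = rng.gauss(mean, std)
    return action, gaussian_log_prob(action, mean, std)


def soft_q_target(reward, done, q1_next, q2_next, log_prob_next,
                  alpha=0.2, gamma=0.99):
    """Entropy-regularized Bellman target.

    q1_next and q2_next are the two critics' estimates for the next
    state-action pair; taking their minimum counters overestimation.
    alpha and gamma are illustrative defaults, not the repository's values.
    """
    min_q = min(q1_next, q2_next)
    # Subtracting alpha * log pi adds an entropy bonus to the value.
    soft_value = min_q - alpha * log_prob_next
    # done is 1.0 for terminal transitions, which removes the bootstrap term.
    return reward + gamma * (1.0 - done) * soft_value
```

Both critics would then be regressed toward this shared target, while the policy is updated to increase `min_q - alpha * log_prob` for its own sampled actions.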

DARC

DARC builds on SAC to transfer a policy from a source domain to a target domain by compensating for the difference in their transition probabilities. This is done with additional classifiers that distinguish source transitions from target transitions, and by adding a reward correction based on dynamics adaptation:
$\Delta r(s_t, a_t, s') = \log p(target | s_t, a_t, s') - \log p(target | s_t, a_t) - \log p(source | s_t, a_t, s') + \log p(source | s_t, a_t)$
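
The reward correction can be sketched from the outputs of two binary domain classifiers. This is an illustrative example rather than the code in `train_darc.py`: the function name, the argument names, and the `eps` stabilizer are assumptions, and each classifier is assumed to output the probability of the target domain, so the source probability is its complement.

```python
import math


def darc_reward_correction(p_target_sas, p_target_sa, eps=1e-8):
    """Reward correction from two domain-classifier probabilities.

    p_target_sas: probability that (s_t, a_t, s') came from the target domain
    p_target_sa:  probability that (s_t, a_t) came from the target domain
    eps: small constant (an assumption) that keeps log() away from zero
    """
    p_source_sas = 1.0 - p_target_sas
    p_source_sa = 1.0 - p_target_sa
    return (math.log(p_target_sas + eps)
            - math.log(p_target_sa + eps)
            - math.log(p_source_sas + eps)
            + math.log(p_source_sa + eps))


# A transition the classifiers cannot attribute to either domain gets no
# correction; one that looks target-like once s' is observed gets a bonus.
neutral = darc_reward_correction(0.5, 0.5)
bonus = darc_reward_correction(0.8, 0.5)
```

The correction is added to the environment reward of transitions collected in the source domain before they are used for the SAC update.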
