Replace CC tanh activation with swish #901

Merged
merged 2 commits into from
Jun 22, 2018

Conversation

awjuliani
Contributor

@awjuliani awjuliani commented Jun 21, 2018

Improves performance by ~25% on Walker and Crawler. Slight adjustments to 3DBall to ensure no performance loss.

The Crawler and Walker models on master were actually trained using swish, so I'm making this PR against the hotfix branch for consistency.
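
For reference, a minimal sketch of the two activations being swapped. This is plain Python for illustration only, not the code changed in this PR; the helper names `sigmoid` and `swish` are assumptions made for this example. Swish is `x * sigmoid(x)`, so unlike tanh it does not saturate for large positive inputs.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def swish(x: float) -> float:
    """Swish activation: x * sigmoid(x)."""
    return x * sigmoid(x)


# Compare the two activations on a few inputs.
for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"x={x:+.1f}  tanh={math.tanh(x):+.4f}  swish={swish(x):+.4f}")
```

For large positive `x`, tanh flattens out near 1 while swish stays close to `x`; for large negative `x`, swish decays toward 0.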

@awjuliani awjuliani requested a review from vincentpierre June 21, 2018 18:16
@xiaomaogy
Contributor

@awjuliani Why does changing the batch size from 1200 to 64 for 3DBall improve performance?

@xiaomaogy
Contributor

@awjuliani Also, shall we merge the hotfix-0 branch after this merge?

@awjuliani
Contributor Author

@xiaomaogy This was just what empirically seemed to work well. For some reason the value function estimate for 3DBall is relatively poor. My hypothesis is that decreasing the batch size gave better gradients for learning the value function. Large batch sizes are actually bad for most problems. They are mainly used when training continuous control models because it is "good" that the gradients largely wash out, so the policy doesn't get pushed too far in any one direction. I think this is likely bad for the value estimate, since it may not be able to keep up with the true value function of the environment. The reason for changing the lambda parameter is similar. Increasing it makes the algorithm rely less on the value estimate when computing the advantage, so it isn't as negatively affected by a poor estimate.

Of course, why this wasn't an issue with tanh is still a bit of a mystery to me.
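
The lambda point above can be illustrated with a small sketch of generalized advantage estimation (GAE). It assumes the advantage is computed GAE-style; the function name `gae_advantages`, its signature, and the toy numbers are hypothetical and not taken from the trainer code in this PR.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """GAE advantages for one trajectory.

    `values` must have len(rewards) + 1 entries; the last entry is the
    bootstrap value estimate for the state after the final step.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error, which depends directly on the value estimate.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # lam controls how far real rewards are propagated back in time.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


rewards = [1.0, 1.0, 1.0]
values = [0.0, 0.0, 0.0, 0.0]  # a deliberately uninformative value estimate

# lam = 0: each advantage is just the one-step TD error.
print(gae_advantages(rewards, values, gamma=1.0, lam=0.0))
# lam = 1: each advantage is the full return minus the baseline value.
print(gae_advantages(rewards, values, gamma=1.0, lam=1.0))
```

With `lam` near 0 the advantage leans on the bootstrapped value of the next state at every step, while with `lam` near 1 it is dominated by the observed rewards, so errors in the value estimate matter less.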

@xiaomaogy xiaomaogy requested review from xiaomaogy and removed request for xiaomaogy June 22, 2018 18:49
@awjuliani awjuliani merged commit f3f7205 into hotfix-0 Jun 22, 2018
@awjuliani awjuliani deleted the hotfix-swish branch June 22, 2018 19:08
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 20, 2021