Noisy Networks for Exploration #2127

Closed
@LeonShams

Description

Describe the feature and the current behavior/state.
Noisy Networks are a relatively recent advancement in Reinforcement Learning (RL). Normally we use epsilon (probability of random action) to add noise after the forward pass of a network. Epsilon forces the agent to explore new areas and discover new rewards. This problem is that we have to decay this over time otherwise the agent is going to make random actions the entire game and never test out its skills. And if the decay curve isn't great the agent may either never learn the environment due to lack of exploration or take ages to learn due to excess exploration. The solution to this problem is noisy networks where the idea is that we add parameter noise to every dense layer in the agent's neural network to make it explore. But this parameter noise is controlled through gradient descent, so as the agent improves it will decay the parameter noise and control the exploration for us. This exploration is much more efficient because it only decreases as the agent learns, and it makes life easier for the developer since there are fewer parameters to tweak now that epsilon decay is no longer a problem.

To create a noisy layer, all we need to do is take a normal dense layer and add some learnable, weighted noise to its weights. As the agent improves, it drives the noise weights toward zero, leaving the normal weights with little or no perturbation.
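A minimal NumPy sketch of the forward pass such a layer could use, following the factorised-Gaussian formulation from the DeepMind paper (output = (mu_w + sigma_w * eps_w) x + (mu_b + sigma_b * eps_b)). The function names and shapes here are illustrative, not a proposed API:

```python
import numpy as np

def f(x):
    # Factorised-noise transform from the paper: f(x) = sgn(x) * sqrt(|x|).
    return np.sign(x) * np.sqrt(np.abs(x))

def noisy_dense(x, mu_w, sigma_w, mu_b, sigma_b, rng):
    """Forward pass of a noisy dense layer with factorised Gaussian noise.

    mu_* are the ordinary weights/biases; sigma_* scale the noise and are
    themselves trained by gradient descent, so the network can learn to
    shrink them (and hence the exploration) as the agent improves.
    """
    n_in, n_out = mu_w.shape
    eps_in = f(rng.standard_normal(n_in))
    eps_out = f(rng.standard_normal(n_out))
    eps_w = np.outer(eps_in, eps_out)   # factorised weight noise
    eps_b = eps_out                     # bias noise
    return x @ (mu_w + sigma_w * eps_w) + (mu_b + sigma_b * eps_b)

# With sigma = 0 the layer reduces to an ordinary dense layer.
rng = np.random.default_rng(0)
x = np.ones(4)
mu_w = np.full((4, 3), 0.5)
out = noisy_dense(x, mu_w, np.zeros((4, 3)), np.zeros(3), np.zeros(3), rng)
```

In a real Keras layer, `mu_*` and `sigma_*` would be trainable variables and the epsilon samples would be redrawn each forward pass.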

Relevant information

  • Are you willing to contribute it: Yes
  • Are you willing to maintain it going forward?: Yes
  • Is there a relevant academic paper?: Yes, Noisy Networks for Exploration, from DeepMind.
  • Is there already an implementation in another framework?: Yes, it has been implemented in some RL frameworks like Coach, by Intel AI, but it isn't easily accessible to the user and is intended for internal use in building their RL algorithms.
  • Was it part of tf.contrib?: No

Which API type would this fall under (layer, metric, optimizer, etc.): Layer

Who will benefit from this feature? Anyone who wants to use noisy nets for exploration in their RL projects. It is also a requirement for many popular RL algorithms like Rainbow DQNs.
