Noisy Networks for Exploration #2127

Closed
@LeonShams

Description

Describe the feature and the current behavior/state.
Noisy Networks are a relatively recent advancement in Reinforcement Learning (RL). Normally we use epsilon (probability of random action) to add noise after the forward pass of a network. Epsilon forces the agent to explore new areas and discover new rewards. This problem is that we have to decay this over time otherwise the agent is going to make random actions the entire game and never test out its skills. And if the decay curve isn't great the agent may either never learn the environment due to lack of exploration or take ages to learn due to excess exploration. The solution to this problem is noisy networks where the idea is that we add parameter noise to every dense layer in the agent's neural network to make it explore. But this parameter noise is controlled through gradient descent, so as the agent improves it will decay the parameter noise and control the exploration for us. This exploration is much more efficient because it only decreases as the agent learns, and it makes life easier for the developer since there are fewer parameters to tweak now that epsilon decay is no longer a problem.

To create a noisy layer, all we need to do is take a normal dense layer and add some learnable, weighted noise to its weights. As the agent improves, it drives the noise weights toward zero, leaving the normal weights with little or no perturbation.
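A minimal NumPy sketch of the forward pass such a layer could use, following the factorised-Gaussian formulation from the DeepMind paper (output = (mu_w + sigma_w * eps_w) x + (mu_b + sigma_b * eps_b)). The function names and shapes here are illustrative, not a proposed API:

```python
import numpy as np

def f(x):
    # Factorised-noise transform from the paper: f(x) = sgn(x) * sqrt(|x|).
    return np.sign(x) * np.sqrt(np.abs(x))

def noisy_dense(x, mu_w, sigma_w, mu_b, sigma_b, rng):
    """Forward pass of a noisy dense layer with factorised Gaussian noise.

    mu_* are the ordinary weights/biases; sigma_* scale the noise and are
    themselves trained by gradient descent, so the network can learn to
    shrink them (and hence the exploration) as the agent improves.
    """
    n_in, n_out = mu_w.shape
    eps_in = f(rng.standard_normal(n_in))
    eps_out = f(rng.standard_normal(n_out))
    eps_w = np.outer(eps_in, eps_out)   # factorised weight noise
    eps_b = eps_out                     # bias noise
    return x @ (mu_w + sigma_w * eps_w) + (mu_b + sigma_b * eps_b)

# With sigma = 0 the layer reduces to an ordinary dense layer.
rng = np.random.default_rng(0)
x = np.ones(4)
mu_w = np.full((4, 3), 0.5)
out = noisy_dense(x, mu_w, np.zeros((4, 3)), np.zeros(3), np.zeros(3), rng)
```

In a real Keras layer, `mu_*` and `sigma_*` would be trainable variables and the epsilon samples would be redrawn each forward pass.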

Relevant information

  • Are you willing to contribute it: Yes
  • Are you willing to maintain it going forward?: Yes
  • Is there a relevant academic paper?: Yes, Noisy Networks for Exploration, from DeepMind.
  • Is there already an implementation in another framework?: Yes, it has been implemented in some RL frameworks like Coach, by Intel AI, but it isn't easily accessible to the user and is intended for internal use in building their RL algorithms.
  • Was it part of tf.contrib?: No

Which API type would this fall under (layer, metric, optimizer, etc.): Layer

Who will benefit from this feature? Anyone who wants to use noisy nets for exploration in their RL projects. It is also a requirement for many popular RL algorithms like Rainbow DQNs.
