Description
This issue has been migrated from tensorflow/tensorflow#5276.
Recently I had to fight a lot with exploding gradients, and with weights turning NaN after only a few iterations, even though I was using tf.contrib.layers.variance_scaling_initializer correctly. To monitor the problem I started saving weight histograms, which looked totally messed up. I thought this was because I was using the initializer wrongly, and only after some debugging did I realise that my initialization was totally fine. The NaN values that appear in iteration n seem to corrupt the visualization of the values from previous iterations, which is very misleading behaviour. Histograms and distributions should be visualized correctly for every iteration up to the point where the values themselves become invalid.
I can easily provide tensorflow summary data to reproduce this behaviour.
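To illustrate the expected behaviour, here is a minimal sketch in plain Python. It is not TensorBoard's implementation and uses no TensorFlow API: `finite_histogram` and the `weights_by_step` data are hypothetical names and values made up for illustration. The idea is that each step is bucketed on its own, using only its finite values, so NaN or infinite weights in a later step cannot affect how earlier steps are drawn.

```python
import math


def finite_histogram(values, num_bins=10):
    """Bucket only the finite values and report how many were dropped.

    Hypothetical helper for illustration; not part of TensorFlow or TensorBoard.
    Returns (counts, edges, dropped).
    """
    finite = [v for v in values if math.isfinite(v)]
    dropped = len(values) - len(finite)
    if not finite:
        # Nothing left to plot for this step.
        return [], [], dropped
    lo, hi = min(finite), max(finite)
    if lo == hi:
        # All finite values are identical: use a single bucket.
        return [len(finite)], [lo, hi], dropped
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in finite:
        # Clamp so that the maximum value falls into the last bucket.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return counts, edges, dropped


# Made-up weights: steps 0 and 1 are healthy, step 2 has mostly gone NaN.
weights_by_step = {
    0: [-0.5, -0.1, 0.0, 0.2, 0.4],
    1: [-0.6, -0.2, 0.1, 0.3, 0.5],
    2: [float("nan"), float("nan"), 0.1, float("inf"), float("nan")],
}

for step, weights in weights_by_step.items():
    counts, edges, dropped = finite_histogram(weights, num_bins=4)
    print(step, counts, "non-finite values dropped:", dropped)
```

Because every step computes its own bucket range, the histograms for steps 0 and 1 stay the same no matter what happens in step 2, and the dropped count makes the bad step visible instead of hiding it.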