MSRA weight filler #1946
Conversation
Note that #1970 should fix the fan_in / fan_out shape issue for inner product layers.
* scale] where scale = sqrt(3 / n) where n is the fan_in, fan_out, or their
* average, depending on the variance_norm option. You should make sure the
* input blob has shape (num, a, b, c) where a * b * c = fan_in and num * b * c
* = fan_out. Note that this is currently not the case for inner product layers.
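For reference, here is a minimal sketch (not the PR's actual code) of the fill behavior the quoted documentation describes, assuming a weight blob shaped (num, a, b, c); the VarianceNorm enum and helper names below are illustrative stand-ins, not Caffe's exact API:

```cpp
#include <cmath>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for the PR's variance_norm option.
enum VarianceNorm { FAN_IN, FAN_OUT, AVERAGE };

// Uniform sample in [lo, hi); a real filler would use Caffe's RNG.
double uniform(double lo, double hi) {
  return lo + (hi - lo) * (std::rand() / (RAND_MAX + 1.0));
}

// Fills `data` uniformly in [-scale, scale] with scale = sqrt(3 / n), where n
// is fan_in, fan_out, or their average, as the quoted doc describes.
void xavier_like_fill(std::vector<double>& data,
                      int num, int a, int b, int c,
                      VarianceNorm norm) {
  const double fan_in  = static_cast<double>(a) * b * c;    // a * b * c
  const double fan_out = static_cast<double>(num) * b * c;  // num * b * c
  double n = fan_in;                        // default: normalize by fan_in
  if (norm == FAN_OUT) {
    n = fan_out;
  } else if (norm == AVERAGE) {
    n = (fan_in + fan_out) / 2.0;
  }
  const double scale = std::sqrt(3.0 / n);
  for (size_t i = 0; i < data.size(); ++i) {
    data[i] = uniform(-scale, scale);
  }
}
```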
#1970 is in so this filler is now right for InnerProduct layers too.
@nickcarlevaris thanks -- this looks good. The only potential issue is naming and attribution. I am not certain, but if I understand correctly you suggested "ReLU" as the name since this filler is intended for use with the so-named nonlinearity. It could be that this is the right choice. @longjon ?
#1940 has been merged for a month. Can these two work together to reproduce the paper's results?
This issue has been open for a long time. I hope it gets merged quickly.
Why hasn't this been merged into master? Is anything wrong?
Add MSRAFiller, an Xavier-like filler designed for use with ReLUs
Merged to master in c255709. Thanks @nickcarlevaris! I did a manual merge to re-format the commit message and add my own commit to note potentially related work. Closing since my edit threw off the GitHub merge.
Why is there no parameter to specify the \alpha defined in Equation 15?
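(For context, as a hedged reading rather than the authors' answer: the question presumably refers to the generalization in He et al. where a rectifier with negative slope $a$ changes the variance condition to $\tfrac{1}{2}(1 + a^2)\, n_l \,\mathrm{Var}[w_l] = 1$, i.e. $\mathrm{std}[w_l] = \sqrt{2 / ((1 + a^2)\, n_l)}$. The filler in this PR appears to cover only the plain-ReLU case $a = 0$, which reduces to $\sqrt{2 / n_l}$.)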
This PR adds MSRAFiller, which implements an Xavier-like filler designed for use with ReLUs instead of tanh, based on the paper: He et al, "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification," 2015.
It also adds a VarianceNorm option to FillerParameter, which allows one to normalize by fan_in, fan_out, or their average. VarianceNorm applies to both the MSRAFiller and the XavierFiller (default behavior unchanged). Tests for MSRAFiller and XavierFiller are included as well.
Replaces #1883 (updates based on that discussion and rebased against master).
As with the XavierFiller, the fan_in and fan_out dimensions are not correct for inner product layers (as pointed out by @seanbell in #1883); however, I did update the documentation to note this.
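For anyone wanting to try this out, a hedged usage sketch follows. It assumes the names this PR introduces (a "msra" filler type and a VarianceNorm enum on FillerParameter) together with Caffe's existing GetFiller factory; the exact generated enum constants may differ from what is shown here.

```cpp
#include <boost/shared_ptr.hpp>

#include "caffe/blob.hpp"
#include "caffe/filler.hpp"

int main() {
  // Request the MSRA filler, normalizing the variance by fan_in.
  // (Field and enum names assume the proto changes in this PR.)
  caffe::FillerParameter filler_param;
  filler_param.set_type("msra");
  filler_param.set_variance_norm(caffe::FillerParameter_VarianceNorm_FAN_IN);

  // A convolution-style weight blob shaped (num, channels, height, width),
  // so fan_in = channels * height * width and fan_out = num * height * width.
  caffe::Blob<float> weights(64, 3, 3, 3);

  boost::shared_ptr<caffe::Filler<float> > filler(
      caffe::GetFiller<float>(filler_param));
  filler->Fill(&weights);
  return 0;
}
```

In a net prototxt this should correspond to something like weight_filler { type: "msra" variance_norm: FAN_IN } on a layer's parameters, but check the merged caffe.proto for the exact field names.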