Make inner transform activation configurable for LSTMCell #10957
Conversation
python/mxnet/gluon/rnn/rnn_cell.py
Outdated
@@ -441,6 +441,9 @@ class LSTMCell(HybridRecurrentCell):
    params : Parameter or None
        Container for weight sharing between cells.
        Created if `None`.
    in_transform_activation_type : str
This name is too verbose. They are usually called activation and recurrent activation.
`activation` is applied to both the input transform and the next c.
Added parameters recurrent_activation and activation.
But F.Activation only supports 4 types of activation functions. Many other activation functions (some with parameters) cannot be passed as a string the way TensorFlow allows, such as elu/selu/prelu/leakyrelu/hard_sigmoid, etc.
Will take a look shortly. Maybe it's worth having a utility function that wraps all activation types in the most efficient way (e.g. F.tanh instead of F.Activation(act_type='tanh')) so that it can be reused everywhere.
python/mxnet/gluon/rnn/rnn_cell.py
Outdated
@@ -473,6 +480,9 @@ def __init__(self, hidden_size,
        self.h2h_bias = self.params.get('h2h_bias', shape=(4*hidden_size,),
                                        init=h2h_bias_initializer,
                                        allow_deferred_init=True)
        self.activation = activation
        self.recurrent_activation = recurrent_activation
Use `_activation`, `_recurrent_activation` instead.
python/mxnet/gluon/rnn/rnn_cell.py
Outdated
        F.Activation(slice_gates[1], act_type=self.recurrent_activation, name=prefix+'f')
        in_transform = F.Activation(
            slice_gates[2], act_type=self.activation, name=prefix+'c')
        out_gate = F.Activation(slice_gates[3], act_type=self.recurrent_activation, name=prefix+'o')
use _get_activation
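As a rough sketch (not the merged code), the gate computations in hybrid_forward could route through _get_activation; the underscore-prefixed attribute names are assumed from the earlier comment, and slice_gates/prefix come from the surrounding method:

# Sketch only: slice_gates and prefix are from LSTMCell.hybrid_forward;
# _activation and _recurrent_activation names are assumed from this review thread.
in_gate = self._get_activation(F, slice_gates[0], self._recurrent_activation, name=prefix+'i')
forget_gate = self._get_activation(F, slice_gates[1], self._recurrent_activation, name=prefix+'f')
in_transform = self._get_activation(F, slice_gates[2], self._activation, name=prefix+'c')
out_gate = self._get_activation(F, slice_gates[3], self._recurrent_activation, name=prefix+'o')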
@@ -255,8 +256,7 @@ def _get_activation(self, F, inputs, activation, **kwargs):
        return F.Activation(inputs, act_type=activation, **kwargs)
For string type, map the string to the most efficient operator. For example, if the string is 'tanh', instead of doing F.Activation(act_type='tanh'), do F.tanh, which doesn't require parsing the string at each call.
@mrkumar83 @szha Any updates? @szha if the original author doesn't respond, could you take this over?
@piiswrong |
        elif activation == 'sigmoid':
            return F.sigmoid(inputs, **kwargs)
        elif activation == 'relu':
            return F.relu(inputs, **kwargs)
add softsign
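For illustration, a minimal sketch of the string-to-operator dispatch with softsign included (written here as a standalone helper for brevity; the real code is a method on the cell and may differ):

# Sketch only: map common activation strings to dedicated operators so the
# string is not re-parsed by F.Activation on every call.
def _get_activation(F, inputs, activation, **kwargs):
    if activation == 'tanh':
        return F.tanh(inputs, **kwargs)
    elif activation == 'sigmoid':
        return F.sigmoid(inputs, **kwargs)
    elif activation == 'relu':
        return F.relu(inputs, **kwargs)
    elif activation == 'softsign':
        return F.softsign(inputs, **kwargs)
    # Fall back to the generic operator for any other supported string.
    return F.Activation(inputs, act_type=activation, **kwargs)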
* Make inner activation gate configurable for LSTMCell
* Adding pr feedback
* Adding a recurrent_activation and activation similar to Keras
* Fixing all pylint issues in the file
* Adding initial pr feedback
* Adding cr feedback
* Adding softsign support
Description
Some papers recommend using sigmoid for the inner activation gate for an LSTM.
Other frameworks such as TensorFlow allow this:
https://www.tensorflow.org/api_docs/python/tf/contrib/rnn/BasicLSTMCell
where they have an activation parameter.
Wanted to provide something similar in MXNet.
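For example, a hypothetical use of the new parameters (final names assumed from the review discussion above, not the merged API):

import mxnet as mx
from mxnet import gluon

# Hypothetical usage; parameter names taken from the review discussion.
cell = gluon.rnn.LSTMCell(hidden_size=64, activation='tanh', recurrent_activation='sigmoid')
cell.initialize()
inputs = mx.nd.random.uniform(shape=(8, 32))   # (batch_size, input_size)
states = cell.begin_state(batch_size=8)        # [h, c]
output, new_states = cell(inputs, states)
print(output.shape)                            # (8, 64)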
Checklist
Essentials
Changes