
[OP] broadcast_mask for rnn #3016

Merged: 1 commit merged into apache:master on Aug 15, 2016

Conversation

@antinucleon (Contributor)

No description provided.

@antinucleon antinucleon changed the title broadcast_mask for rnn [OP] broadcast_mask for rnn Aug 13, 2016
@antinucleon (Contributor Author)

@pluskid @freddycct

New LSTM unit example:

from collections import namedtuple
import mxnet as mx

# the state namedtuple used by the mxnet LSTM examples
LSTMState = namedtuple("LSTMState", ["c", "h"])

def lstm(num_hidden, indata, mask, prev_state, param, seqidx, layeridx, dropout=0.):
    """LSTM unit symbol with per-sample masking of the output states"""
    i2h = mx.sym.FullyConnected(data=indata,
                                weight=param.i2h_weight,
                                bias=param.i2h_bias,
                                num_hidden=num_hidden * 4,
                                name="t%d_l%d_i2h" % (seqidx, layeridx))
    h2h = mx.sym.FullyConnected(data=prev_state.h,
                                weight=param.h2h_weight,
                                bias=param.h2h_bias,
                                num_hidden=num_hidden * 4,
                                name="t%d_l%d_h2h" % (seqidx, layeridx))
    gates = i2h + h2h
    slice_gates = mx.sym.SliceChannel(gates, num_outputs=4,
                                      name="t%d_l%d_slice" % (seqidx, layeridx))
    in_gate = mx.sym.Activation(slice_gates[0], act_type="sigmoid")
    in_transform = mx.sym.Activation(slice_gates[1], act_type="tanh")
    forget_gate = mx.sym.Activation(slice_gates[2], act_type="sigmoid")
    out_gate = mx.sym.Activation(slice_gates[3], act_type="sigmoid")
    next_c = (forget_gate * prev_state.c) + (in_gate * in_transform)
    next_h = out_gate * mx.sym.Activation(next_c, act_type="tanh")
    # dropout on the hidden state h (skip the op entirely when p == 0)
    if dropout > 0.:
        next_h = mx.sym.Dropout(next_h, p=dropout)
    # zero out the states for padded samples (mask is 0 at padding, 1 otherwise)
    next_c = mx.sym.element_mask(next_c, mask)
    next_h = mx.sym.element_mask(next_h, mask)
    return LSTMState(c=next_c, h=next_h)

Do you think this is enough?
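
For context on what element_mask does here: each sample's state is multiplied by that sample's scalar mask value, so padded samples carry zeroed states (and zeroed gradients) through the step. A minimal NumPy sketch of the intended semantics; the shapes are assumptions based on how mask is sliced per timestep in the unroll code further down:

import numpy as np

# Assumed shapes for one timestep:
#   state: (batch, num_hidden)  -- next_c or next_h
#   mask:  (batch,)             -- 1.0 for a real token, 0.0 for padding
state = np.random.randn(4, 8)
mask = np.array([1., 1., 0., 1.])  # the third sample is padding at this step

# Broadcast the per-sample mask across the hidden dimension; multiplying by 0
# zeroes both the state and the gradient that flows back through it
masked = state * mask[:, None]

assert (masked[2] == 0).all()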

@freddycct (Contributor)

@antinucleon cool, you fixed it fast. I just pulled these changes; I will check them soon and let you know...
Thanks!

@freddycct (Contributor)

@antinucleon I am trying it out today, and possibly tomorrow too. How do I pass in the mask variables? Do I feed them from my DataIter?

@antinucleon (Contributor Author)

Yes

@freddycct (Contributor)

@antinucleon I am not sure how to feed it from the DataIter. Is it through provide_data or provide_label? The errors I am getting are not very informative...

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-41-47e7eca4e178> in <module>()
      3     eval_metric = mx.metric.np(perplexity, use_ignore=True, ignore_label=num_labels),
      4     batch_end_callback = [ mx.callback.Speedometer(batch_size, frequent=10) ],
----> 5     epoch_end_callback = [ mx.callback.do_checkpoint( '%s/%s' % (params_dir, expt_name) ) ]
      6 )

C:\Users\chuaf\AppData\Local\Continuum\Miniconda3\lib\site-packages\mxnet-0.7.0-py3.5.egg\mxnet\model.py in fit(self, X, y, eval_data, eval_metric, epoch_end_callback, batch_end_callback, kvstore, logger, work_load_list, monitor, eval_batch_end_callback)
    744 
    745         arg_names, param_names, aux_names = \
--> 746                 self._init_params(dict(data.provide_data+data.provide_label))
    747 
    748         # setup metric

C:\Users\chuaf\AppData\Local\Continuum\Miniconda3\lib\site-packages\mxnet-0.7.0-py3.5.egg\mxnet\model.py in _init_params(self, input_shapes, overwrite)
    485         """Initialize weight parameters and auxiliary states"""
    486         arg_shapes, _, aux_shapes = self.symbol.infer_shape(**input_shapes)
--> 487         assert(arg_shapes is not None)
    488 
    489         arg_names = self.symbol.list_arguments()

AssertionError: 
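
The assertion itself is uninformative, but it fires when infer_shape cannot resolve every input: model.fit only knows the shapes listed in provide_data and provide_label, so if 'mask' appears in neither, shape inference fails. A hypothetical minimal check; the shapes here are illustrative, not taken from the code above:

import mxnet as mx

data = mx.sym.Variable('data')
mask = mx.sym.Variable('mask')
out = mx.sym.element_mask(data, mask)

# With a shape entry for every free input, inference succeeds. Dropping the
# mask=... entry (which is what happens when 'mask' is absent from
# provide_data) leaves the argument shapes unresolved and trips the
# assertion in model._init_params above.
arg_shapes, out_shapes, aux_shapes = out.infer_shape(data=(32, 256), mask=(32,))
print(arg_shapes)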

@freddycct (Contributor)

This is how I use your lstm in my lstm_unroll:

    data  = mx.sym.Variable('data')
    label = mx.sym.Variable('label')
    mask  = mx.sym.Variable('mask')

    # (batch, time, vec) so axis 1 is the time step

    embed = mx.sym.Embedding(
        data=data, input_dim=num_labels,
        weight=embed_weight, output_dim=num_hidden, name='embed'
    )
    wordvec = mx.sym.SliceChannel(data=embed, num_outputs=enc_len + dec_len, squeeze_axis=1)
    masks   = mx.sym.SliceChannel(data=mask,  num_outputs=enc_len + dec_len, squeeze_axis=1)

    hidden_all = []
    for seqidx in range(enc_len + dec_len):
        hidden = wordvec[seqidx]
        mask_in = masks[seqidx]

        # stack LSTM
        for i in range(num_lstm_layer):
            dp = 0.0 if i == 0 else dropout

            # encoder cells for the first enc_len steps, decoder cells after
            next_state = lstm(
                num_hidden,
                indata     = hidden,
                mask       = mask_in,
                prev_state = last_states[i],
                param      = enc_param_cells[i] if seqidx < enc_len else dec_param_cells[i],
                seqidx     = seqidx,
                layeridx   = i,
                dropout    = dp
            )
            # carry the new state into the next layer and the next timestep
            hidden = next_state.h
            last_states[i] = next_state

        hidden_all.append(hidden)

@antinucleon (Contributor Author)

I am writing a data iter sample now, please wait a moment...

@antinucleon (Contributor Author)

@freddycct

Here is a rough working example that runs without raising an error, but it needs more work to polish and to verify correctness.

https://gist.github.com/antinucleon/c4ff26032f3a97f6aaf89680dfabe291

@freddycct (Contributor)

@antinucleon Thanks, I will take a look tomorrow...

@antinucleon antinucleon merged commit d08d87f into apache:master Aug 15, 2016
@freddycct (Contributor)

Basically, mask is a vector with the same length and shape as data, but with 0 at padded positions and 1 at non-padded inputs, and it is passed through the DataIter using provide_data.

I am currently modifying my DataIter for this, hang on...
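
One way to do that with a stock iterator is sketched below; the array names, shapes, and toy data are illustrative, not taken from the gist linked above:

import numpy as np
import mxnet as mx

batch_size, seq_len, num_samples = 32, 20, 1000

# Toy padded batch: token id 0 is the PAD symbol
lengths = np.random.randint(5, seq_len + 1, size=num_samples)
data = np.zeros((num_samples, seq_len), dtype='float32')
for i, n in enumerate(lengths):
    data[i, :n] = np.random.randint(1, 100, size=n)
mask = (data != 0).astype('float32')  # 1 for real tokens, 0 for padding
label = np.roll(data, -1, axis=1)     # toy next-token labels

# Passing dicts puts both 'data' and 'mask' into provide_data, so model.fit
# can infer the mask's shape and bind it by name to mx.sym.Variable('mask')
# in the unrolled symbol
train_iter = mx.io.NDArrayIter(data={'data': data, 'mask': mask},
                               label={'label': label},
                               batch_size=batch_size)

print(train_iter.provide_data)  # includes ('data', ...) and ('mask', ...)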

@pluskid (Contributor) commented Aug 16, 2016

Great! This is fast!

@antinucleon (Contributor Author)

@freddycct I found the bug in the rough prototype iterator. Please let me know your experiment results.

@freddycct (Contributor) commented Aug 16, 2016

@antinucleon OK, the masking layer works fine for me after calling model.fit. I also noticed that the Embedding vectors representing the PAD symbol are not changing, which means the gradient is not backpropagated to the embedding layer for padded inputs, and that is great.

Next step, I need to test the RNN inference part, so hang on, I am continuing to test... but I hope this gives you some confidence that your code is working.

@freddycct (Contributor)

@antinucleon It works. My sequence-to-sequence model is working, thanks!
