This repository was archived by the owner on Nov 22, 2022. It is now read-only.

removed bilstm dependence on seq_lengths #776

Closed

Conversation

shreydesai
Contributor

Summary:
If seq_lengths is not provided as a parameter to BiLSTM and pack_sequence is disabled, then the BiLSTM representation should not use seq_lengths to initialize anything. This diff removes the dependence on seq_lengths in those scenarios by using the batch dimension of embedded_tokens to initialize the zero hidden state.

This seemingly minor change makes a dramatic difference when exporting models to Caffe2, because seq_lengths will no longer be part of the model graph. This yields latency and memory savings during inference, albeit small ones.
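The change can be sketched as follows. This is a minimal, hypothetical reconstruction (not the actual PyText code): it assumes a batch-first layout and shows how the zero hidden state can be sized from embedded_tokens itself, so that seq_lengths never enters the graph when packing is disabled.

```python
import torch
import torch.nn as nn

class BiLSTM(nn.Module):
    """Hypothetical sketch of the BiLSTM described in this PR."""

    def __init__(self, embed_dim: int, hidden_dim: int, num_layers: int = 1):
        super().__init__()
        self.lstm = nn.LSTM(
            embed_dim,
            hidden_dim,
            num_layers,
            batch_first=True,
            bidirectional=True,
        )

    def forward(self, embedded_tokens, seq_lengths=None, pack_sequence=False):
        # Size the zero hidden state from the batch dimension of the input
        # tensor rather than from seq_lengths, so the exported graph has no
        # dependence on seq_lengths in this code path.
        batch_size = embedded_tokens.size(0)
        num_directions = 2  # bidirectional
        state_shape = (
            self.lstm.num_layers * num_directions,
            batch_size,
            self.lstm.hidden_size,
        )
        h0 = torch.zeros(state_shape, dtype=embedded_tokens.dtype)
        c0 = torch.zeros(state_shape, dtype=embedded_tokens.dtype)

        if pack_sequence and seq_lengths is not None:
            # seq_lengths is only consulted when packing is enabled.
            packed = nn.utils.rnn.pack_padded_sequence(
                embedded_tokens, seq_lengths, batch_first=True,
                enforce_sorted=False,
            )
            out, _ = self.lstm(packed, (h0, c0))
            out, _ = nn.utils.rnn.pad_packed_sequence(out, batch_first=True)
        else:
            out, _ = self.lstm(embedded_tokens, (h0, c0))
        return out
```

With embed_dim=8 and hidden_dim=16, a (3, 5, 8) input produces a (3, 5, 32) output (forward and backward states concatenated), with no reference to seq_lengths anywhere in the unpacked path.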

Differential Revision: D16229675

@facebook-github-bot added the CLA Signed label on Jul 12, 2019
Summary:
Pull Request resolved: facebookresearch#776


fbshipit-source-id: 9b64bad0391934b2819b26c97ca145e53a6da238
@facebook-github-bot
Contributor

This pull request has been merged in 8a1a455.
