
Bug when stacking LSTM #8255

Open
@kzay

Description


System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow.js):
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 11
  • TensorFlow.js installed from (npm or script link): NPM
  • TensorFlow.js version (use command below): 4.18.0

Describe the current behavior

When creating a model that stacks two LSTM layers, training fails with an error.

Describe the expected behavior

  • Training should complete without errors; the input data can be dummy values, as the problem reproduces regardless.
  • When the second LSTM is removed, everything works.

Error during model training: Argument tensors passed to stack must be a `Tensor[]` or `TensorLike[]`

__________________________________________________________________________________________
Layer (type)                Input Shape               Output shape              Param #
==========================================================================================
lstm_1 (LSTM)               [[null,1,8]]              [null,1,100]              43600
__________________________________________________________________________________________
dropout_1 (Dropout)         [[null,1,100]]            [null,1,100]              0
__________________________________________________________________________________________
bidirectional_lstm (Bidirec [[null,1,100]]            [null,100]                160800
__________________________________________________________________________________________
dropout_2 (Dropout)         [[null,100]]              [null,100]                0
__________________________________________________________________________________________
dense_1 (Dense)             [[null,100]]              [null,1]                  101
==========================================================================================
Total params: 204501
Trainable params: 204501
Non-trainable params: 0
__________________________________________________________________________________________
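As a sanity check, the parameter counts in the summary above are internally consistent. A quick sketch in plain JavaScript, assuming the standard LSTM parameter formula 4 * ((inputDim + units) * units + units) (four gates, each with input kernel, recurrent kernel, and bias), reproduces all of them:

```javascript
// LSTM parameter count: 4 gates, each with an input kernel (inputDim x units),
// a recurrent kernel (units x units), and a bias (units).
const lstmParams = (inputDim, units) => 4 * ((inputDim + units) * units + units);

const lstm1 = lstmParams(8, 100);        // 43600, matches lstm_1
const bidi  = 2 * lstmParams(100, 100);  // 160800, forward + backward copies of lstm_2
const dense = 100 * 1 + 1;               // 101, weights + bias of dense_1
const total = lstm1 + bidi + dense;      // 204501, matches "Total params"

console.log(lstm1, bidi, dense, total);
```

This confirms the model is wired the way the layer configuration intends, so the failure is not a shape/parameter mismatch in the model definition itself.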

Standalone code to reproduce the issue
Provide a reproducible test case that is the bare minimum necessary to generate
the problem. If possible, please share a link to Colab/CodePen/any notebook.

const model = tf.sequential();
model.add(
    tf.layers.lstm({
        units: 100,
        inputShape: [1, 8], // 1 time step, 8 features
        returnSequences: true,
        kernelInitializer: "glorotUniform", // For the input kernel
        recurrentInitializer: "orthogonal", // Especially good for RNNs
        name: "lstm_1",
    })
);

model.add(tf.layers.dropout({ rate: 0.3, name: "dropout_1" }));
model.add(
    tf.layers.bidirectional({
        layer: tf.layers.lstm({
            units: 100,
            returnSequences: false,
            kernelInitializer: "glorotUniform",
            recurrentInitializer: "orthogonal",
            name: "lstm_2",
        }),
        name: "bidirectional_lstm",
        mergeMode: "ave",
    })
);
model.add(tf.layers.dropout({ rate: 0.3, name: "dropout_2" }));

model.add(
    tf.layers.dense({
        units: 1,
        activation: "sigmoid",
        kernelInitializer: "glorotUniform",
        name: "dense_1",
    })
);
model.compile({
    optimizer: "adam",
    loss: "binaryCrossentropy",
    metrics: ["accuracy"],
});
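For context on the `mergeMode: "ave"` setting: the summary shows the bidirectional layer emitting [null,100] rather than [null,200], which is consistent with averaging the forward and backward outputs element-wise instead of concatenating them. A minimal sketch of that merge on plain arrays (assumed semantics for illustration, not TensorFlow.js API code):

```javascript
// Element-wise average of forward and backward outputs ("ave") vs
// concatenation ("concat"), shown for a single output vector of width 3.
// Values chosen to be exactly representable in binary floating point.
const forward  = [0.5, 1.0, 0.25];
const backward = [0.5, 0.0, 0.75];

const ave = forward.map((v, i) => (v + backward[i]) / 2); // width stays 3
const concat = forward.concat(backward);                  // width doubles to 6

console.log(ave);    // same width as each direction, matching [null,100]
console.log(concat); // "concat" would instead have produced [null,200]
```

This is why the dense layer downstream sees 100 inputs; switching `mergeMode` to `"concat"` would change that to 200 and require no other changes, which may be useful when isolating whether the merge mode is involved in the failure.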

Other info / logs Include any logs or source code that would be helpful to
diagnose the problem. If including tracebacks, please include the full
traceback. Large logs and files should be attached.
