apply_chunking_to_forward should only be the same in the chunking dimension #8349

Closed
@pedrocolon93

Description

Environment info

  • transformers version: 3.4.0
  • Platform: All
  • Python version:
  • PyTorch version (GPU?):
  • Tensorflow version (GPU?):
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help

Information

Model I am using (Bert, XLNet ...): XLNet

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

The tasks I am working on is:

  • an official GLUE/SQUaD task: (give the name)
  • my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

  1. Pass two tensors to apply_chunking_to_forward that have the same batch size and sequence length but a different size in another dimension, and the method raises an exception.
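The failure can be reproduced without the library: the current assertion compares the full shapes, so any mismatch outside the chunking dimension trips it. A minimal pure-Python sketch of the current behavior (plain shape tuples stand in for the tensors; the function name current_check and the example shapes are illustrative, not from the library):

```python
def current_check(shapes):
    # Mirrors the existing assertion in apply_chunking_to_forward:
    # every input tensor must have *exactly* the same shape.
    assert len(shapes) > 0, "{} has to be a tuple/list of tensors".format(shapes)
    tensor_shape = shapes[0]
    assert all(
        s == tensor_shape for s in shapes
    ), "All input tensors have to be of the same shape"

# Same batch size and sequence length, different hidden size:
try:
    current_check([(512, 2, 768), (512, 2, 300)])
except AssertionError as e:
    print("raises:", e)  # full-shape comparison fails on the last dimension
```

Identical shapes, e.g. [(512, 2, 768), (512, 2, 768)], pass the check; any difference in any dimension raises.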

Expected behavior

apply_chunking_to_forward should only require that the input tensors match in the chunking dimension. The current check is:

    assert len(input_tensors) > 0, "{} has to be a tuple/list of tensors".format(input_tensors)
    tensor_shape = input_tensors[0].shape
    assert all(
        input_tensor.shape == tensor_shape for input_tensor in input_tensors
    ), "All input tenors have to be of the same shape"

Should be:

    tensor_shape = input_tensors[0].shape[chunk_dim]
    assert all(
        input_tensor.shape[chunk_dim] == tensor_shape for input_tensor in input_tensors
    ), "All input tensors have to be the same size in the chunk dimension"

Here, if there are two input tensors with the shapes
[512, 2, 768] and [512, 2, 300], the method throws an exception, when it should only require equality along the chunk dimension (in this case 2).
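With the proposed change, only the sizes along chunk_dim need to agree. A sketch of the corrected check, again using plain shape tuples (the function name proposed_check is illustrative; chunk_dim=1 selects the dimension of size 2 from the shapes above):

```python
def proposed_check(shapes, chunk_dim):
    # Only the chunking dimension has to agree across inputs;
    # other dimensions (e.g. hidden size) are free to differ.
    assert len(shapes) > 0, "{} has to be a tuple/list of tensors".format(shapes)
    chunk_size = shapes[0][chunk_dim]
    assert all(
        s[chunk_dim] == chunk_size for s in shapes
    ), "All input tensors have to be the same size in the chunk dimension"

# [512, 2, 768] and [512, 2, 300] differ only in the last dimension,
# so chunking along dim 0 or dim 1 is allowed:
proposed_check([(512, 2, 768), (512, 2, 300)], chunk_dim=1)  # no exception
```

Chunking along dim 2 would still raise, since 768 != 300 there, which is exactly the behavior the assertion should enforce.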
