System Info
transformers 4.52.3
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
The utility function transformers.video_utils.group_videos_by_shape
fails on videos that share the same frame size (height and width) but differ in number of frames.
Example:
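import torch
from transformers.video_utils import group_videos_by_shape
# Layout: (num_frames, channels, height, width).
# Both videos have 336x336 frames, but 4 vs. 5 frames.
video_1 = torch.zeros((4, 3, 336, 336))
video_2 = torch.zeros((5, 3, 336, 336))
# Fails: the two videos land in the same group despite different lengths.
grouped_videos, grouped_videos_index = group_videos_by_shape([video_1, video_2])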
Discovered in vllm-project/vllm#18678
Error log: https://buildkite.com/vllm/fastcheck/builds/25100/steps?jid=0197076f-fbbf-45c4-968f-6d6f154f4af9
Expected behavior
The videos should be grouped by their full shape, including the number of frames, not just by shape[-2::] (height and width).
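To illustrate the expected grouping, here is a minimal sketch that keys each video by its complete shape. The helper name group_by_full_shape and its return layout are hypothetical and not the transformers implementation. To keep the snippet dependency-free it uses stand-in objects that only expose a .shape attribute, where a real call would pass torch tensors.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_full_shape(videos):
    """Group videos by their complete shape (hypothetical helper).

    Each video only needs a `.shape` attribute, e.g. a torch.Tensor.
    Returns (grouped, index):
      grouped: full shape -> list of videos with exactly that shape
      index:   original position -> (full shape, position inside the group)
    """
    grouped = defaultdict(list)
    index = {}
    for i, video in enumerate(videos):
        # Use the whole shape as the key, not only the last two dimensions.
        shape = tuple(video.shape)
        index[i] = (shape, len(grouped[shape]))
        grouped[shape].append(video)
    return dict(grouped), index


# Stand-ins for tensors laid out as (num_frames, channels, height, width).
video_1 = SimpleNamespace(shape=(4, 3, 336, 336))
video_2 = SimpleNamespace(shape=(5, 3, 336, 336))
video_3 = SimpleNamespace(shape=(4, 3, 336, 336))

grouped, index = group_by_full_shape([video_1, video_2, video_3])
# video_1 and video_3 share a group; video_2 gets its own group.
```

With this keying, every group contains only videos of identical shape, so stacking each group into a single batch tensor would no longer mix videos of different lengths.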