The Python `VideoDecoder` needs to know how many frames are in a stream. It asserts this at startup:
torchcodec/src/torchcodec/decoders/_video_decoder.py
Lines 387 to 390 in dd44f57
It needs to know this because the number of frames is the length of the decoder as a sequence:
torchcodec/src/torchcodec/decoders/_video_decoder.py
Lines 122 to 123 in dd44f57
Some videos are missing the number of frames in their metadata. For exact mode, this is not a problem because we compute the number of frames when we scan the stream. For approximate mode, we currently cannot instantiate a decoder for such videos. Approximate mode also requires the FPS - that's how we figure out indices. So for a video that is missing its number of frames, if it has its FPS and time duration, we should be able to compute the number of frames as the duration multiplied by the FPS.
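To make the arithmetic concrete, here is a minimal sketch of that calculation. The function name `estimate_num_frames` and its parameter names are hypothetical and not part of torchcodec. Rounding to the nearest integer is also an assumption, and the result is only an estimate for variable-frame-rate videos.

```python
from typing import Optional


def estimate_num_frames(
    duration_seconds: Optional[float], average_fps: Optional[float]
) -> Optional[int]:
    """Estimate a frame count from duration and FPS (hypothetical helper)."""
    # Either value may be missing from the container metadata.
    if duration_seconds is None or average_fps is None:
        return None
    # Non-positive values cannot yield a meaningful frame count.
    if duration_seconds <= 0 or average_fps <= 0:
        return None
    # Assumption: round to the nearest whole frame.
    return round(duration_seconds * average_fps)
```

Returning `None` when either input is unavailable lets the caller keep failing loudly, as it does today, instead of silently using a bogus length.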
In terms of implementation, we should hide all of this inside of the metadata class rather than changing any logic in `VideoDecoder`. We already make `num_frames` a property and figure out which is the best value to return:
torchcodec/src/torchcodec/_core/_metadata.py
Lines 125 to 133 in dd44f57
We should also check whether `num_frames_from_header` is `None`, and if it is, return the number of frames calculated from the FPS and duration. Making this change is straightforward, but we'll need testing to make sure we don't run into segfaults.
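A rough sketch of what the fallback could look like, not the actual torchcodec code. The class `SketchVideoStreamMetadata` is a hypothetical stand-in for the metadata class. Only `num_frames` and `num_frames_from_header` are taken from this issue; the other field names and the priority order (scanned value, then header, then calculation) are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchVideoStreamMetadata:
    """Hypothetical stand-in for the real metadata class."""

    # Assumed field: frame count found by scanning the stream (exact mode).
    num_frames_from_content: Optional[int] = None
    # Frame count reported by the container header, if any.
    num_frames_from_header: Optional[int] = None
    # Assumed fields: duration and FPS as exposed by the metadata.
    duration_seconds: Optional[float] = None
    average_fps: Optional[float] = None

    @property
    def num_frames(self) -> Optional[int]:
        # Prefer the scanned count, then the header count.
        if self.num_frames_from_content is not None:
            return self.num_frames_from_content
        if self.num_frames_from_header is not None:
            return self.num_frames_from_header
        # New fallback: derive the count from duration and FPS.
        if self.duration_seconds is not None and self.average_fps is not None:
            return round(self.duration_seconds * self.average_fps)
        return None
```

Because the fallback lives in the property, `VideoDecoder` would keep reading `num_frames` unchanged, and a video with neither a frame count nor both duration and FPS would still yield `None`.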