Approximate mode video decoding should tolerate one of duration or num_frames missing

The Python `VideoDecoder` needs to know how many frames are in a stream. It checks this at construction time and raises an error if the count is unknown:
https://github.com/pytorch/torchcodec/blob/dd44f57180070572eee9e7d35d38f9c03569f6f9/src/torchcodec/decoders/_video_decoder.py#L387-L390
It needs to know this because the number of frames is the length of the decoder as a sequence:
https://github.com/pytorch/torchcodec/blob/dd44f57180070572eee9e7d35d38f9c03569f6f9/src/torchcodec/decoders/_video_decoder.py#L122-L123
Some videos are missing the number of frames in their metadata. In exact mode, this is not a problem because we compute the number of frames when we scan the stream. In approximate mode, we currently cannot instantiate a decoder for such videos. Approximate mode also requires the FPS, since that is how we figure out frame indices. So for a video that is missing its number of frames but has both its FPS and its duration, we should be able to compute the number of frames as duration times FPS.
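To make the arithmetic concrete, here is a minimal sketch of that estimate. The function name `estimate_num_frames` and its parameter names are hypothetical and not part of torchcodec; rounding to the nearest integer is also an assumption, since the issue does not say how a fractional result should be handled.

```python
from typing import Optional


def estimate_num_frames(
    duration_seconds: Optional[float], average_fps: Optional[float]
) -> Optional[int]:
    """Estimate a frame count from duration and FPS (hypothetical helper).

    Returns None when either input is missing or not positive, so callers
    can still report that the number of frames is unknown.
    """
    if duration_seconds is None or average_fps is None:
        return None
    if duration_seconds <= 0 or average_fps <= 0:
        return None
    # Assumption: round to the nearest whole frame.
    return round(duration_seconds * average_fps)


print(estimate_num_frames(10.0, 29.97))
print(estimate_num_frames(None, 29.97))
```

Returning `None` rather than raising keeps the existing "number of frames is unknown" check in `VideoDecoder` as the single place where the error is reported.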

In terms of implementation, we should hide all of this inside of the metadata class rather than changing any logic in `VideoDecoder`. We already make `num_frames` a property and figure out which is the best value to return:
https://github.com/pytorch/torchcodec/blob/dd44f57180070572eee9e7d35d38f9c03569f6f9/src/torchcodec/_core/_metadata.py#L125-L133
We should also check whether `num_frames_from_header` is `None` and, if it is, return the value computed from the duration and FPS. The change itself is straightforward, but we'll need testing to make sure we don't run into segfaults.

	def num_frames(self) -> Optional[int]:
	    """Number of frames in the stream. This corresponds to
	    ``num_frames_from_content`` if a :term:`scan` was made, otherwise it
	    corresponds to ``num_frames_from_header``.
	    """
	    if self.num_frames_from_content is not None:
	        return self.num_frames_from_content
	    else:
	        return self.num_frames_from_header
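One way the proposed fallback could look inside the metadata class is sketched below. This is a standalone illustration, not the actual torchcodec class: `StreamMetadataSketch`, `duration_seconds` and `average_fps` are hypothetical names chosen for the example, and only `num_frames_from_content` and `num_frames_from_header` come from the snippet above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StreamMetadataSketch:
    """Hypothetical stand-in for the real metadata class."""

    num_frames_from_content: Optional[int] = None
    num_frames_from_header: Optional[int] = None
    # Assumed field names; the real class may name these differently.
    duration_seconds: Optional[float] = None
    average_fps: Optional[float] = None

    @property
    def num_frames(self) -> Optional[int]:
        # 1. Prefer the count obtained from a scan.
        if self.num_frames_from_content is not None:
            return self.num_frames_from_content
        # 2. Otherwise use the count stored in the header.
        if self.num_frames_from_header is not None:
            return self.num_frames_from_header
        # 3. Proposed fallback: derive the count from duration and FPS.
        if self.duration_seconds is not None and self.average_fps is not None:
            return round(self.duration_seconds * self.average_fps)
        # Nothing to go on: the count stays unknown.
        return None


meta = StreamMetadataSketch(duration_seconds=4.0, average_fps=30.0)
print(meta.num_frames)
```

Because the fallback lives in the property, the existing `None` check in `VideoDecoder` would not need to change under this sketch.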


Approximate mode video decoding should tolerate one of duration or num_frames missing #727


	if metadata.num_frames is None:
	    raise ValueError(
	        "The number of frames is unknown. " + ERROR_REPORTING_INSTRUCTIONS
	    )

	def __len__(self) -> int:
	    return self._num_frames
