TorchCodec is a Python package that aims to provide fast, useful APIs for decoding video frames into PyTorch tensors.
Here's a condensed summary of what you can do with TorchCodec. For a more detailed example, check out our documentation!
from torchcodec.decoders import SimpleVideoDecoder
decoder = SimpleVideoDecoder("path/to/video.mp4")
decoder.metadata
# VideoStreamMetadata:
# num_frames: 250
# duration_seconds: 10.0
# bit_rate: 31315.0
# codec: h264
# average_fps: 25.0
# ... (truncated output)
len(decoder) # == decoder.metadata.num_frames!
# 250
decoder.metadata.average_fps # Note: instantaneous fps can be higher or lower
# 25.0
# Simple Indexing API
decoder[0] # uint8 tensor of shape [C, H, W]
decoder[0 : -1 : 20] # uint8 stacked tensor of shape [N, C, H, W]
# Iterate over frames:
for frame in decoder:
    pass
# Indexing, with PTS and duration info
decoder.get_frame_at(len(decoder) - 1)
# Frame:
# data (shape): torch.Size([3, 400, 640])
# pts_seconds: 9.960000038146973
# duration_seconds: 0.03999999910593033
decoder.get_frames_at(start=10, stop=30, step=5)
# FrameBatch:
# data (shape): torch.Size([4, 3, 400, 640])
# pts_seconds: tensor([0.4000, 0.6000, 0.8000, 1.0000])
# duration_seconds: tensor([0.0400, 0.0400, 0.0400, 0.0400])
# Time-based indexing with PTS and duration info
decoder.get_frame_displayed_at(pts_seconds=2)
# Frame:
# data (shape): torch.Size([3, 400, 640])
# pts_seconds: 2.0
# duration_seconds: 0.03999999910593033
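To make the time-based lookup above more concrete, here is a minimal sketch of how a timestamp can be mapped to a frame index, assuming each frame is on screen during the half-open interval [pts_seconds, pts_seconds + duration_seconds). It is plain Python for illustration only, not TorchCodec's implementation; the function name `frame_index_displayed_at` and the synthetic timestamps are hypothetical.

```python
from bisect import bisect_right


def frame_index_displayed_at(pts, durations, t):
    """Return the index of the frame on screen at time t (in seconds).

    Assumes pts is sorted ascending and that frame i is displayed during
    [pts[i], pts[i] + durations[i]).
    """
    if not pts:
        raise ValueError("no frames")
    # Index of the last frame whose presentation timestamp is <= t.
    i = bisect_right(pts, t) - 1
    if i < 0 or t >= pts[i] + durations[i]:
        raise IndexError(f"no frame is displayed at t={t}")
    return i


# Synthetic timestamps for a 10 s, 25 fps video (250 frames).
fps = 25
pts = [n / fps for n in range(250)]
durations = [1 / fps] * 250

print(frame_index_displayed_at(pts, durations, 2.0))   # frame starting exactly at 2.0 s
print(frame_index_displayed_at(pts, durations, 2.03))  # still inside the same frame
```

A binary search keeps the lookup logarithmic in the number of frames, which matters when timestamps are irregular and the index cannot simply be computed as t * fps.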
You can use the following snippet to generate a video with FFmpeg and try out TorchCodec:
fontfile=/usr/share/fonts/dejavu-sans-mono-fonts/DejaVuSansMono-Bold.ttf
output_video_file=/tmp/output_video.mp4
ffmpeg -f lavfi -i \
color=size=640x400:duration=10:rate=25:color=blue \
-vf "drawtext=fontfile=${fontfile}:fontsize=30:fontcolor=white:x=(w-text_w)/2:y=(h-text_h)/2:text='Frame %{frame_num}'" \
${output_video_file}
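If you would rather generate the test video from Python, the same FFmpeg invocation can be assembled as an argument list. This is a sketch under assumptions: `build_ffmpeg_command` is a hypothetical helper, not part of TorchCodec, and the defaults simply mirror the shell snippet above (640x400, 10 seconds at 25 fps, which yields 250 frames).

```python
import shlex


def build_ffmpeg_command(output_video_file, fontfile, size="640x400", duration=10, rate=25):
    """Build the argument list for the FFmpeg invocation shown above."""
    source = f"color=size={size}:duration={duration}:rate={rate}:color=blue"
    drawtext = (
        f"drawtext=fontfile={fontfile}:fontsize=30:fontcolor=white:"
        # Not an f-string: %{frame_num} is expanded by FFmpeg, not by Python.
        "x=(w-text_w)/2:y=(h-text_h)/2:text='Frame %{frame_num}'"
    )
    return ["ffmpeg", "-f", "lavfi", "-i", source, "-vf", drawtext, output_video_file]


cmd = build_ffmpeg_command(
    "/tmp/output_video.mp4",
    "/usr/share/fonts/dejavu-sans-mono-fonts/DejaVuSansMono-Bold.ttf",
)
# Print the command for inspection. To actually run it (requires ffmpeg on
# PATH), pass the list to subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```

Passing a list rather than a single shell string avoids a layer of shell quoting around the filter expression. The font path is the one from the snippet above and may differ on your system.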
We'll be providing wheels in the coming days so that you can install torchcodec using pip. For now, you can build from source. You will need the following dependencies:
- A C++ compiler and linker. These are typically already available on a baseline Linux installation.
- cmake
- pkg-config
- FFmpeg
- PyTorch nightly
Start by installing PyTorch following the official instructions.
Then, the easiest way to install the rest of the dependencies is to run:
conda install cmake pkg-config ffmpeg -c conda-forge
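Before building, it can help to confirm that the build tools are visible on your PATH. The following is a small sketch using only the standard library; `find_missing_tools` is a hypothetical helper, and the executable names are assumptions (the C++ compiler in particular may be installed as g++, clang++, or c++ depending on the system, so it is left out of the list here).

```python
import shutil


def find_missing_tools(tools, which=shutil.which):
    """Return the subset of tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


# Executable names assumed to match the packages installed above.
required = ["cmake", "pkg-config", "ffmpeg"]
missing = find_missing_tools(required)
print("missing tools:", missing or "none")
```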
To clone and install the repo, run:
git clone git@github.com:pytorch/torchcodec.git
# Or, using https instead of ssh: git clone https://github.com/pytorch/torchcodec.git
cd torchcodec
pip install -e ".[dev]" --no-build-isolation -vv
TorchCodec supports all major FFmpeg versions in [4, 7].
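To check whether an installed FFmpeg falls in that range, you could parse the first line printed by `ffmpeg -version`. The sketch below is illustrative: the helper names are hypothetical, the sample string is made up for the example, and the regular expression assumes a release-style version such as "6.1.1" or "n6.0". Builds that report something else yield None.

```python
import re

SUPPORTED_MAJOR_VERSIONS = range(4, 8)  # major versions 4, 5, 6, 7


def parse_ffmpeg_major_version(version_output):
    """Extract the major version from the first line of `ffmpeg -version`.

    Returns None when the line carries no numeric release version.
    """
    first_line = version_output.splitlines()[0] if version_output else ""
    match = re.match(r"ffmpeg version n?(\d+)\.", first_line)
    return int(match.group(1)) if match else None


def is_supported(version_output):
    """Return True if the reported major version is within [4, 7]."""
    major = parse_ffmpeg_major_version(version_output)
    return major is not None and major in SUPPORTED_MAJOR_VERSIONS


# Hypothetical sample line; on a real system, obtain the text with
# subprocess.run(["ffmpeg", "-version"], capture_output=True, text=True).stdout
sample = "ffmpeg version 6.1.1 Copyright (c) 2000-2023 the FFmpeg developers"
print(parse_ffmpeg_major_version(sample), is_supported(sample))
```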
We are actively working on the following features:
- Ship wheels for Linux, so that Linux users can pip install torchcodec.
- Ship wheels for macOS, so that macOS users can pip install torchcodec.
- GPU decoding
- Audio decoding
Let us know if you have any feature requests by opening an issue!
We welcome contributions to TorchCodec! Please see our contributing guide for more details.
TorchCodec is released under the BSD 3-Clause license.