Right now we don't support audio decoding. We can implement a function that converts an `AVFrame` to an audio tensor here: https://github.com/pytorch/torchcodec/blob/9e3a2196d7361e86f02f5b12430477c3e872898c/src/torchcodec/decoders/_core/VideoDecoder.cpp#L773
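
To make the idea concrete, here is a rough sketch of the core copy step, not a proposal for the final API. It assumes the decoded frame is in FFmpeg's planar float format (`AV_SAMPLE_FMT_FLTP`), where each channel has its own buffer of `nb_samples` floats. To keep it compilable without FFmpeg or libtorch, `PlanarFloatFrame`, `AudioBuffer`, and `convertFrameToAudioBuffer` are all hypothetical stand-ins: the first mimics the relevant `AVFrame` fields (`extended_data`, `nb_samples`, channel count), the second mimics a float tensor of shape `[numChannels, numSamples]`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the AVFrame fields this sketch needs.
// Assumption: planar float audio, one buffer per channel.
struct PlanarFloatFrame {
  int numChannels = 0;
  int numSamples = 0;                // samples per channel
  std::vector<const float*> planes;  // one pointer per channel
};

// Hypothetical stand-in for a float tensor of shape
// [numChannels, numSamples], stored row-major.
struct AudioBuffer {
  int numChannels = 0;
  int numSamples = 0;
  std::vector<float> data;
};

// Copies each channel plane into one contiguous row of the output.
AudioBuffer convertFrameToAudioBuffer(const PlanarFloatFrame& frame) {
  assert(frame.numChannels >= 0 && frame.numSamples >= 0);
  assert(frame.planes.size() == static_cast<std::size_t>(frame.numChannels));

  AudioBuffer out;
  out.numChannels = frame.numChannels;
  out.numSamples = frame.numSamples;
  out.data.resize(static_cast<std::size_t>(frame.numChannels) *
                  static_cast<std::size_t>(frame.numSamples));

  for (int c = 0; c < frame.numChannels; ++c) {
    const float* plane = frame.planes[static_cast<std::size_t>(c)];
    // Row c starts at offset c * numSamples in the flat output.
    std::copy(plane, plane + frame.numSamples,
              out.data.begin() +
                  static_cast<std::ptrdiff_t>(c) * frame.numSamples);
  }
  return out;
}
```

In the real decoder the output would presumably be a tensor rather than a `std::vector`, and frames in other sample formats (interleaved, or integer samples) would need to be converted first, for example with libswresample.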