List of feature requests received so far for StreamReader/Writer #3139

mthrok · 2023-03-02T19:30:14Z

Here is the list of feature requests for StreamReader/Writer that I have received so far.
Feel free to add to it.

PTS support in StreamWriter (Support overwriting PTS in StreamWriter #3135)
When processing video/audio with StreamReader/Writer, the timestamp information (PTS) is lost. We need a way to provide frame-level PTS to StreamWriter.
Support encoding options in StreamWriter (Add EncodingConfig #3179)
Bitrate, GOP size, etc.
Audio passthrough
When performing batch video processing (such as super resolution), the audio can be left untouched. If StreamReader can return packet data without decoding it, and StreamWriter can re-mux that data, the video processing becomes more efficient.
Custom YUV to RGB conversion
Currently, when using HW acceleration, only YUV outputs are supported. Filters like scale_cuda and scale_npp are supported via "Support CUDA frame in FilterGraph #3183", but they don't provide YUV->RGB conversion either. We can implement a custom CUDA kernel, like the one in Nvidia's CUDA examples.
Filter complex support in StreamWriter (Add filter and filter_complex to StreamWriter #3063)
- Prerequisites:
  - Enable CUDA filter graph #3159 -> Support CUDA frame in FilterGraph #3183
  - Filter support in StreamWriter (Add additional filter graph option to StreamWriter #3194)
~~Reduce memory usage~~
See Reduce GPU memory consumption #3165
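
For the custom YUV to RGB conversion item above, the per-pixel arithmetic that such a kernel would perform can be sketched in plain Python. This is only an illustration, not the kernel from Nvidia's example and not torchaudio code: it assumes full-range BT.601 coefficients (the JFIF convention), whereas real video is often limited-range and/or BT.709, so the constants would need to match the source. The function name `yuv_to_rgb_bt601_full` is hypothetical.

```python
def _clamp_u8(x: float) -> int:
    """Round to the nearest integer and clamp into the 8-bit range [0, 255]."""
    return max(0, min(255, int(round(x))))


def yuv_to_rgb_bt601_full(y: int, cb: int, cr: int) -> tuple:
    """Convert one 8-bit YCbCr pixel to RGB.

    Assumption: full-range BT.601 (JFIF) coefficients. Limited-range or
    BT.709 sources need different offsets and constants.
    """
    # Chroma samples are stored with an offset of 128.
    d = cb - 128
    e = cr - 128
    r = y + 1.402 * e
    g = y - 0.344136 * d - 0.714136 * e
    b = y + 1.772 * d
    return _clamp_u8(r), _clamp_u8(g), _clamp_u8(b)


if __name__ == "__main__":
    # Neutral chroma (128, 128) leaves luma unchanged in all three channels.
    print(yuv_to_rgb_bt601_full(128, 128, 128))
```

A CUDA version would run the same arithmetic once per thread, reading Y at full resolution and the subsampled chroma plane (for example the interleaved UV plane of NV12) at half resolution.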

Other ideas

Decoder/Encoder caching
Currently, StreamReader/Writer creates decoder/encoder objects for each input file. In large-scale video decoding, if the input formats are known to be the same, we might be able to reuse decoders/encoders.
- Re-use HW device/frames context #3160
Apply filter function
FFmpeg has a lot of filtering functions. Similar to sox_effects, we should be able to apply these filters to Tensors.
See stab for Apply filter function #3161
Apply codecs function
This should be doable in the Python layer, but having a function to apply codecs would be handy. We should replace the existing sox-based apply_codec function and extend it to video/images.
Packet loss emulations
By dropping some packets in the encoder/decoder, one can degrade the media. This could be used as a form of augmentation. The following is a PoC from my prototype using the Gilbert-Elliott packet loss model.

video.mp4
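
For readers unfamiliar with the Gilbert-Elliott model mentioned above, a minimal sketch of it follows. This is not the prototype that produced the video. It is a generic two-state Markov chain ("good" and "bad" states, each with its own loss probability) that yields a keep/drop mask, which could then be applied to a packet sequence. All names and parameters (`gilbert_elliott_mask`, `p_gb`, `p_bg`, `loss_good`, `loss_bad`, `drop_packets`) are hypothetical.

```python
import random


def gilbert_elliott_mask(n_packets, p_gb, p_bg,
                         loss_good=0.0, loss_bad=1.0, seed=None):
    """Return a list of booleans, True meaning the packet is kept.

    p_gb: probability of moving from the good state to the bad state.
    p_bg: probability of moving from the bad state back to the good state.
    loss_good / loss_bad: per-packet loss probability in each state.
    The chain starts in the good state (an assumption of this sketch).
    """
    rng = random.Random(seed)
    bad = False
    mask = []
    for _ in range(n_packets):
        # Decide the fate of the current packet from the current state.
        loss_prob = loss_bad if bad else loss_good
        mask.append(rng.random() >= loss_prob)
        # Then decide whether the channel changes state for the next packet.
        flip_prob = p_bg if bad else p_gb
        if rng.random() < flip_prob:
            bad = not bad
    return mask


def drop_packets(packets, mask):
    """Keep only the packets whose mask entry is True."""
    return [pkt for pkt, keep in zip(packets, mask) if keep]


if __name__ == "__main__":
    packets = [f"pkt{i}" for i in range(20)]
    mask = gilbert_elliott_mask(len(packets), p_gb=0.1, p_bg=0.3, seed=0)
    print(drop_packets(packets, mask))
```

Because losses cluster while the chain sits in the bad state, this produces bursty degradation rather than the uniformly scattered drops of an independent per-packet coin flip.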


nateanl added the triaged label Mar 7, 2023

mthrok added the module: IO label Mar 9, 2023

gtebbutt mentioned this issue Jun 17, 2024

NV12/YUV->RGB colour accuracy and CUDA #3799

Open
