Here is the list of feature requests for StreamReader/Writer I have received so far.
Feel free to add more.
PTS support in StreamWriter (see Support overwriting PTS in StreamWriter #3135)
When processing video or audio with StreamReader/Writer, the timestamp information (PTS) is lost. We need a way to provide frame-level PTS to the Writer.
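To make the frame-level PTS idea concrete, here is a minimal sketch of how PTS values relate to frame indices. It is not StreamWriter API; the helper names `seconds_to_pts` and `frame_pts` are hypothetical, and the 90 kHz time base and 30000/1001 frame rate are only assumed example values.

```python
from fractions import Fraction


def seconds_to_pts(seconds, time_base):
    """Convert a timestamp in seconds to PTS ticks for the given time base."""
    return round(Fraction(seconds) / time_base)


def frame_pts(num_frames, frame_rate, time_base, start_pts=0):
    """Return PTS for evenly spaced frames (frame i is shown at i / frame_rate seconds)."""
    return [
        start_pts + seconds_to_pts(Fraction(i) / frame_rate, time_base)
        for i in range(num_frames)
    ]


# Assumed example values: 90 kHz time base, NTSC-style frame rate.
time_base = Fraction(1, 90000)
frame_rate = Fraction(30000, 1001)
pts = frame_pts(4, frame_rate, time_base)
```

Exact rational arithmetic avoids the drift that accumulates when timestamps are computed with floats over long streams.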
Audio passthrough
When performing batch video processing (such as super resolution), the audio can be left untouched. If StreamReader can return packet data without decoding it and StreamWriter can re-mux that data, video processing becomes more efficient.
Custom YUV to RGB conversion
Currently, when using HW acceleration, only YUV outputs are supported. Filters like scale_cuda and scale_npp are supported via Support CUDA frame in FilterGraph #3183, but they don't provide YUV->RGB conversion either. We can implement a custom CUDA kernel, like the one in Nvidia's CUDA examples.
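For reference, the per-pixel arithmetic such a kernel would perform can be sketched in plain Python. This assumes BT.601 limited-range ("video range") input, which is only one of several possibilities; the actual matrix depends on the stream's color space and range. The function name `yuv_to_rgb_bt601` is hypothetical.

```python
def _clamp(x):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(x))))


def yuv_to_rgb_bt601(y, u, v):
    """Convert one 8-bit YUV pixel to RGB (assumes BT.601, limited range)."""
    c = 1.164 * (y - 16)   # luma, rescaled from [16, 235]
    d = u - 128            # blue-difference chroma, centered
    e = v - 128            # red-difference chroma, centered
    r = c + 1.596 * e
    g = c - 0.392 * d - 0.813 * e
    b = c + 2.017 * d
    return _clamp(r), _clamp(g), _clamp(b)


black = yuv_to_rgb_bt601(16, 128, 128)
white = yuv_to_rgb_bt601(235, 128, 128)
```

A GPU version would run the same arithmetic once per pixel, with the chroma planes upsampled according to the pixel format (for example NV12).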
Decoder/Encoder caching
Currently, StreamReader/Writer creates decoder/encoder objects for each input file. In large-scale video decoding, if the input formats are known to be the same, we might be able to reuse decoders/encoders.
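One possible shape for this is a cache keyed by the stream parameters. The sketch below is purely illustrative: `DecoderCache`, its key fields, and the factory callable are all hypothetical and do not correspond to existing torchaudio or FFmpeg objects.

```python
class DecoderCache:
    """Reuse decoder objects for inputs that share the same stream parameters."""

    def __init__(self, factory):
        # `factory` is a hypothetical callable that builds a decoder
        # from (codec, width, height, pix_fmt).
        self._factory = factory
        self._cache = {}

    def get(self, codec, width, height, pix_fmt):
        key = (codec, width, height, pix_fmt)
        if key not in self._cache:
            # Only construct a decoder the first time this format is seen.
            self._cache[key] = self._factory(*key)
        return self._cache[key]


created = []


def make_decoder(codec, width, height, pix_fmt):
    # Stand-in for real decoder construction; records each creation.
    created.append((codec, width, height, pix_fmt))
    return object()


cache = DecoderCache(make_decoder)
first = cache.get("h264", 1920, 1080, "yuv420p")
second = cache.get("h264", 1920, 1080, "yuv420p")
other = cache.get("h264", 1280, 720, "yuv420p")
```

A real implementation would presumably also need to flush the decoder's internal state between inputs, which this sketch does not model.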
Apply filter function
FFmpeg has a lot of filtering functions. Similar to sox_effects, we should be able to apply these filters to Tensors.
See the stub in Apply filter function #3161
Apply codecs function
This should be doable in the Python layer, but a function to apply codecs would be handy. We should replace the existing sox-based apply_codec function and extend it to video/images.
Packet loss emulations
By dropping some packets in the encoder/decoder, one can degrade the media. This could be used as a form of augmentation. The following is a PoC from my prototype using the Gilbert-Elliott packet loss model:
video.mp4
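For readers unfamiliar with the Gilbert-Elliott model mentioned above, here is a minimal sketch of it: a two-state Markov chain (good/bad) where each state has its own loss probability. This is not the prototype's code; the function names and parameter names are assumptions chosen for illustration.

```python
import random


def gilbert_elliott_mask(num_packets, p_gb, p_bg,
                         loss_good=0.0, loss_bad=1.0, seed=None):
    """Return a list of booleans, True where the packet is kept.

    p_gb: probability of moving from the good state to the bad state.
    p_bg: probability of moving from the bad state back to the good state.
    loss_good / loss_bad: per-packet loss probability in each state.
    """
    rng = random.Random(seed)
    bad = False  # start in the good state
    mask = []
    for _ in range(num_packets):
        loss_prob = loss_bad if bad else loss_good
        mask.append(rng.random() >= loss_prob)
        # State transition for the next packet.
        if bad:
            if rng.random() < p_bg:
                bad = False
        elif rng.random() < p_gb:
            bad = True
    return mask


def drop_packets(packets, mask):
    """Keep only the packets whose mask entry is True."""
    return [pkt for pkt, keep in zip(packets, mask) if keep]


mask = gilbert_elliott_mask(100, p_gb=0.05, p_bg=0.3, seed=0)
survivors = drop_packets(list(range(100)), mask)
```

Because losses cluster while the chain stays in the bad state, the model produces bursty loss patterns rather than independent drops.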
Encoder options: bitrate, GOP size, etc.
Reduce memory usage
See Reduce GPU memory consumption #3165
Other ideas