List of feature requests received so far for StreamReader/Writer #3139

Open
7 of 8 tasks
mthrok opened this issue Mar 2, 2023 · 0 comments

mthrok (Collaborator) commented Mar 2, 2023

Here is the list of feature requests for StreamReader/Writer that I have received so far.
Feel free to add more.

  1. PTS support in StreamWriter (Support overwriting PTS in StreamWriter #3135)
    When processing video or audio with StreamReader/Writer, the timestamp information (PTS) is lost. We need a way to provide frame-level PTS to StreamWriter.
  2. Support encoding options in StreamWriter (Add EncodingConfig #3179)
    Bitrate, GOP size, etc.
  3. Audio passthrough
    When performing batch video processing (such as super resolution), the audio can be left untouched. If StreamReader could return packet data without decoding it, and StreamWriter could re-mux that data, the video processing would become more efficient.
  4. Custom YUV to RGB conversion
    Currently, when using HW acceleration, only YUV output is supported. Filters like scale_cuda and scale_npp are supported via Support CUDA frame in FilterGraph #3183, but they do not provide YUV->RGB conversion either. We could implement a custom CUDA kernel, similar to the one in NVIDIA's CUDA samples.
  5. filter complex support in StreamWriter (Add filter and filter_complex to StreamWriter #3063)
  6. Reduce memory usage
    See Reduce GPU memory consumption #3165
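
For item 4, the arithmetic that such a kernel would perform per pixel can be sketched in plain Python. This is only an illustration, not the proposed implementation: it assumes BT.601 limited-range ("video range") YUV, whereas the actual color matrix depends on the stream's color space and range, and the function name `yuv_to_rgb_bt601` is hypothetical.

```python
def _clamp_u8(x: float) -> int:
    """Round to the nearest integer and clamp to the 8-bit range [0, 255]."""
    return max(0, min(255, round(x)))


def yuv_to_rgb_bt601(y: int, u: int, v: int) -> tuple:
    """Convert one 8-bit YUV (YCbCr) pixel to 8-bit RGB.

    Assumption: BT.601 coefficients with limited range
    (Y in [16, 235], U/V in [16, 240]). Other standards such as
    BT.709, or full-range input, need different constants.
    """
    c = 1.164 * (y - 16)  # scaled luma
    d = u - 128           # centered blue-difference chroma
    e = v - 128           # centered red-difference chroma
    r = c + 1.596 * e
    g = c - 0.392 * d - 0.813 * e
    b = c + 2.017 * d
    return _clamp_u8(r), _clamp_u8(g), _clamp_u8(b)


# Video-range black and white map to the RGB extremes.
black = yuv_to_rgb_bt601(16, 128, 128)
white = yuv_to_rgb_bt601(235, 128, 128)
```

A CUDA version would run the same computation with one thread per pixel (or per chroma sample for subsampled formats like NV12) and would also have to handle chroma upsampling, which this sketch leaves out.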

Other ideas

  • Decoder/Encoder caching
    Currently, StreamReader/Writer creates decoder/encoder objects for each input file. In large-scale video decoding, if the input formats are known to be the same, we might be able to reuse decoders/encoders.
  • Apply filter function
    FFmpeg has a lot of filtering functions. Similar to sox_effects, we should be able to apply these filters to Tensors.
    See the stub in Apply filter function #3161
  • Apply codecs function
    This should be doable in the Python layer, but having a function to apply codecs would be handy. We should replace the existing sox-based apply_codec function and extend it to video/images.
  • Packet loss emulations
    By dropping some packets in the encoder/decoder, one can degrade the media. This could be used as a form of augmentation. The following is a PoC from my prototype using the Gilbert-Elliott packet loss model:
    video.mp4
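
For readers unfamiliar with the Gilbert-Elliott model mentioned above, here is a minimal sketch of how a loss pattern could be generated. It is not the prototype's code: the function names, parameter names, and the choice to decide loss before the state transition are all assumptions made for illustration. The model is a two-state Markov chain (a "good" state and a "bad" state), each with its own loss probability, which produces bursty losses rather than independent ones.

```python
import random


def gilbert_elliott_mask(n_packets, p, r, loss_good=0.0, loss_bad=1.0, seed=None):
    """Return a list of booleans, True where a packet is considered lost.

    p         : probability of moving from the good state to the bad state
    r         : probability of moving from the bad state back to the good state
    loss_good : loss probability while in the good state
    loss_bad  : loss probability while in the bad state
    The chain starts in the good state (an assumption of this sketch).
    """
    rng = random.Random(seed)
    bad = False
    mask = []
    for _ in range(n_packets):
        # Decide the fate of this packet from the current state.
        loss_prob = loss_bad if bad else loss_good
        mask.append(rng.random() < loss_prob)
        # Then update the state for the next packet.
        if bad:
            if rng.random() < r:
                bad = False
        elif rng.random() < p:
            bad = True
    return mask


def drop_packets(packets, mask):
    """Keep only the packets whose mask entry is False (not lost)."""
    return [pkt for pkt, lost in zip(packets, mask) if not lost]


# Example: a bursty loss pattern over 20 hypothetical packets.
mask = gilbert_elliott_mask(20, p=0.1, r=0.5, seed=0)
kept = drop_packets(list(range(20)), mask)
```

Applying this to real media would require packet-level access in StreamReader/Writer, which is the kind of capability the audio passthrough request also relies on.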