Add filter and filter_complex to StreamWriter #3063

Open
maysteinfeld opened this issue Feb 15, 2023 · 7 comments

@maysteinfeld

maysteinfeld commented Feb 15, 2023

🚀 The feature

Add support for ffmpeg filters (-filter, -filter_complex) to StreamWriter and StreamReader, following the ffmpeg filter documentation: https://ffmpeg.org/ffmpeg-filters.html

It would be good to add an argument to add_video_stream and add_audio_stream for setting ffmpeg filters, so that the full ffmpeg functionality is available together with hardware acceleration.
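
For reference, the kind of ffmpeg pipeline this request has in mind can be sketched as a plain command line. The snippet below only builds the argument list for an overlay with -filter_complex; the helper name `build_overlay_command`, the file names, and the overlay position are hypothetical and chosen for illustration.

```python
import shlex


def build_overlay_command(main_path, overlay_path, output_path, x=10, y=10):
    """Build an ffmpeg command that overlays the second input on the first."""
    # [0:v] and [1:v] are the video streams of the first and second input.
    # The filter output is labeled [out] and selected with -map.
    filter_graph = f"[0:v][1:v]overlay={x}:{y}[out]"
    return [
        "ffmpeg", "-y",
        "-i", main_path,
        "-i", overlay_path,
        "-filter_complex", filter_graph,
        "-map", "[out]",
        output_path,
    ]


cmd = build_overlay_command("main.mp4", "logo.png", "out.mp4")
# Print a shell-quoted version that could be pasted into a terminal.
print(shlex.join(cmd))
```

Because only the [out] label is mapped, this sketch writes video only; audio would need its own -map option.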

Motivation, pitch

ffmpeg filters are used for many basic tasks, such as adding effects via an overlay or changing the background color.
Using the StreamWriter and then applying the filter in a separate step loses all the acceleration gains.
Adding the ability to set filters in the StreamWriter would solve this issue and let users build advanced ffmpeg pipelines while still benefiting from hardware acceleration.

Thanks

Alternatives

No response

Additional context

No response

maysteinfeld changed the title from "Add filter_complex to StreamWriter" to "Add filter and filter_complex to StreamWriter" on Feb 15, 2023

@mthrok
Collaborator

mthrok commented Feb 16, 2023

This is an interesting one.

During the development of StreamReader/StreamWriter, support for filter_complex complicated the interface, so I excluded it, thinking that, theoretically, one can perform pixel-level transformations manually in PyTorch.

This is technically challenging.

At the low level, the question is how to integrate the filter graph, which is a mapping from multiple AVFrame*s to one AVFrame*.

At the surface level, the question is what a good interface for specifying multiple input tensors would look like (i.e. what is a good API?).

StreamReader

        ┌► AVFrame ──► Tensor
source ─┤
        └► AVFrame ──► Tensor

StreamWriter

Tensor ──► AVFrame ─┐
                    ├─► destination
Tensor ──► AVFrame ─┘

FilterComplex

AVFrame ─┐   ┌──────┐
         ├─► │filter│ ──► AVFrame
AVFrame ─┘   └──────┘

Let's say we want to achieve the following pattern, where we pass two tensors, perform an overlay, and encode the resulting frame.

StreamWriter

Tensor ──► AVFrame ─┐   ┌──────┐
                    ├─► │filter│ ──► AVFrame
Tensor ──► AVFrame ─┘   └──────┘

and we want to do something like

s = StreamWriter(...)
s.add_video_stream(...)  # stream0
s.add_video_stream(...)  # stream1
s.DEFINE_OVERLAY(stream0, stream1)

What we are missing is

  • a way to tell StreamWriter that the streams defined by add_video_stream should not be connected directly to an encoder
  • a way to tell StreamWriter to define a new stream from already defined streams.

Another idea is to have the filtering op as a separate class, like sox_effects. This is already doable for simple filters, but support for complex filters has to be added at the C++ level. This approach will also incur more data copies than necessary at the boundaries between AVFrame and Tensor.

FilterComplex

Tensor ──► AVFrame ─┐   ┌──────┐
                    ├─► │filter│ ──► AVFrame ──► Tensor
Tensor ──► AVFrame ─┘   └──────┘

There is also a feature request to allow audio pass-through from StreamReader to StreamWriter without decoding/encoding, which would avoid unnecessary Tensor/AVFrame conversion. That could be applied here, but it is still in the exploration phase.

@mthrok
Collaborator

mthrok commented Feb 16, 2023

Another question is whether overlay and other filters support CUDA frames.

Update: It seems they do 😮 https://github.com/FFmpeg/FFmpeg/blob/aeceefa6220ccb8eac625f78c6fa90d048ccd2de/libavfilter/vf_overlay_cuda.c#L568

If the scope is limited to overlay, the main operation seems to be just the two lines linked here, so it should be easy to achieve the same effect in PyTorch.

https://github.com/FFmpeg/FFmpeg/blob/aeceefa6220ccb8eac625f78c6fa90d048ccd2de/libavfilter/vf_overlay_cuda.cu#L45-L50
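
A minimal sketch of that idea in PyTorch, assuming a plain per-pixel alpha blend of the overlay onto a region of the main frame. The function name `overlay`, its parameters, and the (C, H, W) float layout are illustrative assumptions, not part of torchaudio, and this is not a line-by-line port of the CUDA kernel.

```python
import torch


def overlay(main, over, alpha, x=0, y=0):
    """Alpha-blend `over` onto `main` with its top-left corner at (x, y).

    main:  (C, H, W) float tensor
    over:  (C, h, w) float tensor that must fit inside `main` at (x, y)
    alpha: (h, w) or (1, h, w) float tensor with values in [0, 1]
    """
    _, h, w = over.shape
    out = main.clone()  # leave the input frame untouched
    region = out[:, y:y + h, x:x + w]
    # The blend is evaluated into a new tensor before being written back.
    out[:, y:y + h, x:x + w] = alpha * over + (1.0 - alpha) * region
    return out


main = torch.zeros(3, 4, 4)
over = torch.ones(3, 2, 2)
alpha = torch.full((2, 2), 0.5)
out = overlay(main, over, alpha, x=1, y=2)
```

If all three tensors live on the same CUDA device, the same function runs on the GPU without copying frames back to the host.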

@mthrok
Collaborator

mthrok commented Feb 16, 2023

Also, could any of you help design and implement this?
My bandwidth is very limited, so even though I find this interesting work, I don't know if I can work on it.

@maysteinfeld
Author

Hi @mthrok,
thanks for your help and fast answer, I really appreciate it.
I don't have a budget to work on that for now, and it's not the main focus of our task.
But I really hope it will be implemented in the future.
I also wanted to ask whether you know when the nightly StreamWriter changes will be promoted to the stable version.
Thanks

@mthrok
Collaborator

mthrok commented Apr 4, 2023

Update: We have added filter support to StreamWriter in #3194. The two remaining items to complete this feature are

  1. Support CUDA filters in StreamWriter
  2. Design multiple input streams

Item 1 should be possible by attaching HWFramesContext to the filter graph in StreamWriter.
Item 2 still needs design.

@maysteinfeld
Author

Great, thanks. In which version will it be released? @mthrok

@xiaohui-zhang
Contributor

It'll be in the next release (in around 3 months or so). It's available in the nightly build right now.
