Rendering zero copy AVFrames decoded via Vulkan #272

Open
gitoss opened this issue Jun 18, 2024 · 10 comments

Comments

@gitoss

gitoss commented Jun 18, 2024

This issue is very similar to the NVIDIA-only ticket #237, i.e. it's the same combination of HDR source and Vulkan decoding.

I'm on an AMD Ryzen 7540U w/ RDNA3 740M, driver 23.11.1 (newer Adrenalin versions have a horrible memory leak bug).

ffmpeg -hwaccel vulkan -an -i %INPUT% -vf "libplacebo=tonemapping=hable:format=nv12,hwdownload,format=nv12" -f null - -benchmark

This is significantly SLOWER on newer ffmpeg versions than on old ones. I tried to track down when the change occurred: it was between 2023-04-30 and 2023-05-31, because it works up to 'ffmpeg version N-111869-g7aa71ab5c0-20230831'. https://github.com/BtbN/FFmpeg-Builds/releases

Vulkan decoding (i.e. zero copy) reaches 2x the fps of software decoding. Hardware decoding with dxva2/d3d11va w/ hwupload reaches only 1.5x the fps of software decoding.
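To make a comparison like this reproducible across two builds, the summary line printed by `-benchmark` can be parsed and the wall-clock times compared. The following is only a minimal sketch: it assumes the summary line has the form `bench: utime=...s stime=...s rtime=...s`, and the helper names `parse_bench` and `slowdown` are hypothetical, not part of ffmpeg.

```python
import re

# Assumed format of the summary line printed by ffmpeg's -benchmark option:
# "bench: utime=<sec>s stime=<sec>s rtime=<sec>s"
BENCH_RE = re.compile(r"bench: utime=([\d.]+)s stime=([\d.]+)s rtime=([\d.]+)s")


def parse_bench(stderr_text):
    """Return (utime, stime, rtime) in seconds from ffmpeg stderr, or None if absent."""
    match = BENCH_RE.search(stderr_text)
    if match is None:
        return None
    return tuple(float(group) for group in match.groups())


def slowdown(old_rtime, new_rtime):
    """Ratio of new to old wall-clock time; a value above 1.0 means the new build is slower."""
    return new_rtime / old_rtime
```

Running the identical command with each build, capturing stderr, and feeding both `rtime` values to `slowdown` gives a single number that can be quoted in a report.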

Btw, adding '-hwaccel_output_format vulkan' is only accepted up to these older ffmpeg versions; on newer versions like 'ffmpeg version N-111869-g7aa71ab5c0-20230831' these errors occur:

[libplacebo @ 000000001d8ee4c0] Masking sampleable from wrapped texture because the corresponding format 'rx10' does not support PL_FMT_CAP_SAMPLEABLE
[libplacebo @ 000000001d8ee4c0] Masking blit_src from wrapped texture because the corresponding format 'rx10' does not support PL_FMT_CAP_BLITTABLE
[libplacebo @ 000000001d8ee4c0] Masking sampleable from wrapped texture because the corresponding format 'rxgx10' does not support PL_FMT_CAP_SAMPLEABLE
[libplacebo @ 000000001d8ee4c0] Masking blit_src from wrapped texture because the corresponding format 'rxgx10' does not support PL_FMT_CAP_BLITTABLE
[libplacebo @ 000000001d8ee4c0] Masking sampleable from wrapped texture because the corresponding format 'rx10' does not support PL_FMT_CAP_SAMPLEABLE
[libplacebo @ 000000001d8ee4c0] Masking blit_src from wrapped texture because the corresponding format 'rx10' does not support PL_FMT_CAP_BLITTABLE
[libplacebo @ 000000001d8ee4c0] Masking sampleable from wrapped texture because the corresponding format 'rxgx10' does not support PL_FMT_CAP_SAMPLEABLE
[libplacebo @ 000000001d8ee4c0] Masking blit_src from wrapped texture because the corresponding format 'rxgx10' does not support PL_FMT_CAP_BLITTABLE
[libplacebo @ 000000001d8ee4c0] Validation failed: (image->planes[i]).texture->params.sampleable (src/renderer.c:2704)
[libplacebo @ 000000001d8ee4c0] Backtrace:
[libplacebo @ 000000001d8ee4c0] #0 0x7ff66c7a7b9f in FT_Set_Default_Log_Handler+0xaa9df (ffmpeg.exe+0x14e7b9f) (0x1414e7b9f)
[libplacebo @ 000000001d8ee4c0] #1 0x7ff66c7ab6c8 in pl_render_image+0xa8 (ffmpeg.exe+0x14eb6c8) (0x1414eb6c8)
[libplacebo @ 000000001d8ee4c0] #2 0x7ff66c7acb63 in pl_render_image_mix+0x1383 (ffmpeg.exe+0x14ecb63) (0x1414ecb63)
[libplacebo @ 000000001d8ee4c0] #3 0x7ff66b42173c (ffmpeg.exe+0x16173c) (0x14016173c)
[libplacebo @ 000000001d8ee4c0] #4 0x7ff66b42269a (ffmpeg.exe+0x16269a) (0x14016269a)
[libplacebo @ 000000001d8ee4c0] #5 0x7ff66b32888b (ffmpeg.exe+0x6888b) (0x14006888b)
[libplacebo @ 000000001d8ee4c0] #6 0x7ff66b32d62f (ffmpeg.exe+0x6d62f) (0x14006d62f)
[libplacebo @ 000000001d8ee4c0] #7 0x7ff66b2d338f (ffmpeg.exe+0x1338f) (0x14001338f)
[libplacebo @ 000000001d8ee4c0] #8 0x7ff66b2ec158 (ffmpeg.exe+0x2c158) (0x14002c158)
[libplacebo @ 000000001d8ee4c0] #9 0x7ff66cf924ca in FT_Get_PS_Font_Value+0x8229a (ffmpeg.exe+0x1cd24ca) (0x141cd24ca)
[libplacebo @ 000000001d8ee4c0] #10 0x7ff83eefe633 in beginthreadex+0x133 (C:\Windows\System32\msvcrt.dll+0x3e633) (0x11013e633)
[libplacebo @ 000000001d8ee4c0] #11 0x7ff83eefe70b in endthreadex+0xab (C:\Windows\System32\msvcrt.dll+0x3e70b) (0x11013e70b)
[libplacebo @ 000000001d8ee4c0] #12 0x7ff83e19257c in BaseThreadInitThunk+0x1c (C:\Windows\System32\KERNEL32.DLL+0x1257c) (0x18001257c)
[libplacebo @ 000000001d8ee4c0] #13 0x7ff83ff8aa47 in RtlUserThreadStart+0x27 (C:\Windows\SYSTEM32\ntdll.dll+0x5aa47) (0x18005aa47)
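
The repeated "Masking ..." lines in a log like this can be condensed into a per-format summary, which makes it easier to see which wrapped formats lack which capabilities. A minimal sketch, assuming the message wording stays exactly as in the log above; `missing_caps` is a hypothetical helper, not a libplacebo function.

```python
import re
from collections import defaultdict

# Matches libplacebo warnings of the form seen in the log:
# "Masking <usage> from wrapped texture because the corresponding
#  format '<name>' does not support PL_FMT_CAP_<CAP>"
MASK_RE = re.compile(
    r"Masking (\w+) from wrapped texture because the corresponding "
    r"format '(\w+)' does not support (PL_FMT_CAP_\w+)"
)


def missing_caps(log_text):
    """Map each format name to the sorted list of capabilities the log reports as unsupported."""
    caps = defaultdict(set)
    for _usage, fmt, cap in MASK_RE.findall(log_text):
        caps[fmt].add(cap)
    return {fmt: sorted(found) for fmt, found in caps.items()}
```

Applied to the log above, this reports the two formats 'rx10' and 'rxgx10', each lacking PL_FMT_CAP_BLITTABLE and PL_FMT_CAP_SAMPLEABLE.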

@haasn
Owner

haasn commented Jun 18, 2024

cc @cyanreg is there a way to control whether or not to use planar formats?

@gitoss
Author

gitoss commented Jun 18, 2024

For what it's worth, I've even tried to get dxva2 and d3d11va hw decoding to work w/o a CPU path to libplacebo; both fail (-extra_hw_frames doesn't help). The new d3d12va, which is supposed to be more zero-copy ready, is completely broken on my system w/ current ffmpeg.

ffmpeg -init_hw_device "vulkan=vk:0" -filter_hw_device vk -hwaccel d3d11va -hwaccel_output_format d3d11 -an -i %INPUT% -vf "hwmap=derive_device=vk,format=vulkan,libplacebo=tonemapping=hable:format=nv12,hwdownload,format=nv12" -f null - -benchmark

It seems the only way to get zero copy with libplacebo is Vulkan decoding.

@cyanreg
Contributor

cyanreg commented Jun 18, 2024

I saw a lot of text, but no issue stated. What is wrong?

@haasn -init_hw_device "vulkan=vk:0,disable_multiplane=1"
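
For anyone wanting to try this device option together with the command from the opening post, the pieces might be combined roughly as below. This is an untested sketch under stated assumptions: that `-hwaccel_device vk` and `-filter_hw_device vk` pick up the device named `vk` created by `-init_hw_device`, and that the filter chain from the opening post is reused unchanged. The helper `vulkan_tonemap_cmd` is hypothetical.

```python
def vulkan_tonemap_cmd(input_path, disable_multiplane=True, ffmpeg="ffmpeg"):
    """Build an ffmpeg argument list for Vulkan decode plus libplacebo tonemapping."""
    # Device string as suggested in the thread; the option is appended only on request.
    device = "vulkan=vk:0"
    if disable_multiplane:
        device += ",disable_multiplane=1"
    return [
        ffmpeg,
        "-init_hw_device", device,
        "-filter_hw_device", "vk",          # assumption: filters should use the same device
        "-hwaccel", "vulkan",
        "-hwaccel_device", "vk",            # assumption: refers to the named device above
        "-hwaccel_output_format", "vulkan",
        "-an",
        "-i", input_path,
        # Filter chain copied from the opening post.
        "-vf", "libplacebo=tonemapping=hable:format=nv12,hwdownload,format=nv12",
        "-benchmark",
        "-f", "null", "-",
    ]
```

Passing the list to `subprocess.run` once with `disable_multiplane=True` and once with `False` would allow the two configurations to be compared on the same input.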

@gitoss
Author

gitoss commented Jun 20, 2024

> I saw a lot of text, but no issue stated. What is wrong?
>
> @haasn -init_hw_device "vulkan=vk:0,disable_multiplane=1"

"This is significantly SLOWER on newer ffmpeg versions than on old ones."

@cyanreg
Contributor

cyanreg commented Jun 20, 2024

Vulkan decoding wasn't even merged for the command line you posted to work.

@gitoss
Author

gitoss commented Jun 20, 2024

> Vulkan decoding wasn't even merged for the command line you posted to work.

Could you please elaborate?

The command line I specified is simplified from what I use for libplacebo tonemapping, but I compared it in a current and an old ffmpeg version (versions in the OP), and there's a significant speed difference.

This seems to be something not only I experienced, judging by the issues here from around the time ffmpeg's Vulkan support moved to 1.3.

@streetpea

streetpea commented Sep 3, 2024

@gitoss are you saying there is just a black screen when you use the AMD driver on Windows with HDR enabled and Vulkan (i.e., -hwaccel_output_format vulkan on newer ffmpeg versions such as 7.0)? That's what happens for a user of my program here: streetpea/chiaki-ng#393 (comment), which uses libplacebo Vulkan rendering and ffmpeg 7.0. Interestingly, the AMD-based Steam Deck works fine with the Linux driver using Vulkan hw decode and HDR with ffmpeg 7.0.

@gitoss
Author

gitoss commented Sep 4, 2024

> @gitoss are you saying there is just a black screen in the case when you use the amd driver with Windows and HDR enabled using Vulkan (i.e., -hwaccel_output_format vulkan on newer ffmpeg versions such as 7.0)?

I didn't use HDR, just SDR encoding. Both the old and the new versions work and produce the same output, but the newer versions are slower, which is significant considering libplacebo processing is already very slow.

@streetpea

I see, @gitoss. I thought you were talking about HDR because the issue title includes HDR. Maybe you should change it to say SDR, as I'm seeing a problem with HDR where it displays only a black screen using Vulkan hw decoding on Windows, but I guess that should be a different issue.

gitoss changed the title from "Rendering zero copy HDR AVFrames decoded via Vulkan" to "Rendering zero copy AVFrames decoded via Vulkan" on Sep 5, 2024
@gitoss
Author

gitoss commented Sep 5, 2024

Sorry for the confusion. I simply "reopened" a closed former issue by copying its title 1:1. I've changed the title.
