Skip to content

win32: Replace fprintf with proper VA_TRACE function #695

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 9, 2023

Conversation

sivileri
Copy link
Contributor

@sivileri sivileri commented Mar 8, 2023

Closes #694

Tested with environment variable LIBVA_TRACE=1. Trace test when running vainfo:

[33752.134567][ctx       none]==========va_TraceInitialize
[33752.134574][ctx       none]==========	VA-API vendor string: Mesa Gallium driver 23.1.0-devel for D3D12 (NVIDIA GeForce RTX 4090)
[33752.134575][ctx       none]=========vaInitialize ret = VA_STATUS_SUCCESS, success (no error) 
[33752.134679][ctx       none]=========vaQueryConfigProfiles ret = VA_STATUS_SUCCESS, success (no error) 
[33752.134696][ctx       none]=========vaQueryConfigEntrypoints ret = VA_STATUS_SUCCESS, success (no error) 
[33752.134700][ctx       none]=========vaQueryConfigEntrypoints ret = VA_STATUS_SUCCESS, success (no error) 
[33752.134704][ctx       none]=========vaQueryConfigEntrypoints ret = VA_STATUS_SUCCESS, success (no error) 
[33752.134712][ctx       none]=========vaQueryConfigEntrypoints ret = VA_STATUS_SUCCESS, success (no error) 
[33752.134716][ctx       none]=========vaQueryConfigEntrypoints ret = VA_STATUS_SUCCESS, success (no error) 
[33752.134719][ctx       none]=========vaQueryConfigEntrypoints ret = VA_STATUS_SUCCESS, success (no error) 
[33752.134720][ctx       none]=========vaQueryConfigEntrypoints ret = VA_STATUS_SUCCESS, success (no error) 
[33752.134725][ctx       none]=========vaQueryConfigEntrypoints ret = VA_STATUS_SUCCESS, success (no error) 
[33752.134726][ctx       none]=========vaQueryConfigEntrypoints ret = VA_STATUS_SUCCESS, success (no error) 
[33752.160574][ctx       none]==========va_TraceTerminate
[33752.160603][ctx       none]=========vaTerminate ret = VA_STATUS_SUCCESS, success (no error) 


@sivileri sivileri force-pushed the win32_debug_message_trace branch 2 times, most recently from 821eae9 to 0970432 Compare March 8, 2023 18:43
@sivileri
Copy link
Contributor Author

@evelikov
Copy link
Contributor

None of the other four (or six if we also count nvctl and fglrx) backends have any debug or info messages. There are some error messages, which are very hard to reach in reality.

@sivileri do we need the fprint/trace within the win32 driver in the first place? This PR adds additional API/ABI which varies across platforms - which is rarely a good sign.

@sivileri
Copy link
Contributor Author

None of the other four (or six if we also count nvctl and fglrx) backends have any debug or info messages. There are some error messages, which are very hard to reach in reality.

@sivileri do we need the fprint/trace within the win32 driver in the first place? This PR adds additional API/ABI which varies across platforms - which is rarely a good sign.

I was following what this existing backend is doing as reference to get the va_trace* functions available from the backend, based on the discussion on #694 to add a trace message instead of just printing.

VA_TRACE_LOG(va_TracePutSurface, dpy, surface, (void *)draw, srcx, srcy, srcw, srch,

Knowing which adapter LUID and associated driver was loaded is important in my opinion for tracing/debugging purposes, and looked like using the tracing library was the preferred method in this project to log events like such, but I'd be okay with printing instead if that's cleaner. To solve #694 then I'd surround the printing in the backend with #if DEBUG guards so it's not so noisy for apps like gstreamer.

@evelikov
Copy link
Contributor

That is tracing at the API level, since it is a platform specific API (which is bad on it's own right). The x11 backend does not print "found driver X", so the win32 shouldn't do either?

@sivileri
Copy link
Contributor Author

I still think it's important to have trace information about the adapter/driver in the traces, but agree maybe there's a better way to do this that's not backend specific and doesn't add cross library binary dependencies. vaQueryVendorString should provide this information.

I modified the approach of this PR to solve the issue, dropping the win32 backend print messages and now logging vaQueryVendorString (if available) from the driver, in va_TraceInitialize

@sivileri sivileri force-pushed the win32_debug_message_trace branch from 0970432 to 1bb80c9 Compare March 17, 2023 15:18

const char* vendor_string = vaQueryVendorString(dpy);
if (vendor_string)
va_TraceMsg(trace_ctx, "==========\tVA-API vendor string: %s\n", vendor_string);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this information is good to have in the trace, I still consider what you attempted to print in va_win32.c is also quite valid for debug. While adding backend specific stuff to va_trace and expose that in ABI is not desirable, I would suggest to consider exposing single backend agnostic function from va_trace to be able to print arbitrary trace messages from the backends. Something like vaTracePrintf().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dvrogozh Yes, thanks for pointing that out. While having the vendor string should provide the adapter/vendor/driver name, it's not as exact as the adapter LUID provided by the caller app to vaGetDisplayWin32 and the associated adapter driver DLL name. On multi-adapter or multi-VA-driver platforms having this info is very useful to know how to attempt a repro, etc from a trace. I agree having such generic function all backends can use looks like a less ABI invasive approach and could help also printing extra messages in the backends in addition to the vendor string, for example the existing ones in win32 currently with fprintf.

@evelikov would a vaTracePrintf() backend-agnostic function address the concerns you mentioned about different API/ABI across platforms ?

Copy link
Contributor

@evelikov evelikov Mar 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On multi-adapter or multi-VA-driver platforms having ...

How common are these - is that something we have today? If not perhaps we can defer this until it's needed?

Stepping back: we had multi adapter/platform setups on Linux for quite some time, where vainfo was ultimately used to gather base information as reporting tool. Is that not useful on Windows? Should be step back and rethink about something that scales?

Note that adding more internal ABI between the libva components has back-fired in the past. Where users will have mixed version of the libraries and they will get either missing symbols or silent corruption... Which is one of the reasons why I'm apprehensive about extending vaTrace as a whole.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's pretty common to have an integrated GPU and a discrete GPU (like in laptops for example), and apps can select which GPU to use explicitly with VAAPI by passing the LUID* argument to vaGetDisplayWin32 (kind of like passing different /dev/dri/XYZ for multi GPU in DRM), multi-driver is also supported now and pending to be enhanced with #684.

vainfo can give info and details about a specific adapter/backend when you run it with --device and --display arguments, but you need to know which GPU/driver you want to query info about. Unfortunately that doesn't give you info on which adapter apps selected explicitly when calling vaGetDisplayWin32 in their backends with non-null LUID* arguments.

The goal of adding this info in libva itself in this PR is so the user knows "this app used GPU adapter X and driver Y when initializing VAAPI". Then that info can be used to debug further (ie. with vainfo).

Thanks for clarifying, I now see the issue with having mismatched library versions (they're usually packaged separately into libva-drm, libva-x11, etc and the user could get a mismatched state, right?).

If we stick with not changing the ABI to avoid that, perhaps this combination would work to reach a consensus? It doesn't extend any ABI and provides the info needed to further debug with other tools from the trace/debug output.

  • Tracing vaQueryString in va_TraceInitialize
  • Tracing the driver_name variable from vaInitialize also in va_TraceInitialize
  • Replacing the fprintf with OutputDebugStringA in the Win32 backend with to avoid undesired verbosity but allowing to have the info about the LUID and the registry driver value when running under a debugger.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VAAPI backend and libva.so libraries are coming together in one bundle if they are installed after build from sources or via package manager which handles dependencies and updates libva.so if one of backends has changed. It's highly unlikely that something will go wrong with ABI between backends and libva.so. I don't say you can't create situation that everything will go wrong, but still unlikely from my perspective. ABI between libva.so and actual driver is another matter - that's indeed where we see issues.

I also think that having tracing support for backends is quite missing. Basically each time something goes wrong with backends you either need to use debugger and start adding prints to understand where error is. Having a trace with steps performed by backend will be quite useful.

Extending ABI for each particular backend however is problematic. Having single agnostic trace entry point seems useful. I vote for this solution.

Copy link
Contributor

@evelikov evelikov Mar 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sivileri I'm roughly aware that multi-GPU setups are thing. The question was are there actual users using the non-default device? Do we have any other drivers (but vaon12) on the horizon?

@dvrogozh I was thinking the same thing - "unlikely from my perspective" ... only to look at the games ecosystem. Where vendors will pull some open-source libraries alongside their game, patch it (sometimes changing ABI) or even static link (while still exporting the symbols), ultimately causing symbol collision.

IMHO to properly unwrap the multi-device topic, VA should grow an API similar to EGLDevice (today, VA effectively mimics EGL) and even check the Vulkan PhysicalDevice API. That will allow for more flexible and consistent, experience both for VA API users as well as tracing/debugging of said code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@evelikov yes, apps like ffmpeg/gstreamer/vainfo will use explicit LUID to select the adapter when initializing vaGetDisplayWin32, and gives the option to the user to select which "device index" in a multi-adapter platform, even with only vaon12 adding the logging for the actual adapter LUID would be great (as vaon12 supports any d3d12 adapter and later loads the vendor specific driver).

Thanks for clarifying, looks then like the ABI issue doesn't come from libva packaging/shipping itself but how some apps are shipping it alongside their packages. I agree that's something to have in mind to keep the ABI changes to a minimum and not add anything platform specific, but for particular changes like this one, as @dvrogozh mentioned above, then I think it'll be useful to have a generic tracing export function instead of having to print debug outside of VACore.

@sivileri sivileri force-pushed the win32_debug_message_trace branch 2 times, most recently from ef94737 to cea4ddf Compare May 30, 2023 12:27
Copy link
Contributor

@evelikov evelikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest version prints the vaQueryVendorString() string during tracing and wraps the noisy fprintfs in a DEBUG.

This is a good stepping stone and we can improve further in the future. Thanks o/

@sivileri
Copy link
Contributor Author

sivileri commented Jun 8, 2023

@XinfengZhang could we please merge this one?

Copy link
Contributor

@dvrogozh dvrogozh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

sivileri added 2 commits June 9, 2023 15:41
Signed-off-by: Sil Vilerino <sivileri@microsoft.com>
Closes intel#694

Signed-off-by: Sil Vilerino <sivileri@microsoft.com>
@XinfengZhang XinfengZhang force-pushed the win32_debug_message_trace branch from cea4ddf to 098ea4f Compare June 9, 2023 07:41
@XinfengZhang XinfengZhang merged commit 1c58941 into intel:master Jun 9, 2023
@sivileri sivileri deleted the win32_debug_message_trace branch March 27, 2024 12:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

vaon12: VA runtime prints noisy messages
4 participants