Extended Presentation API Investigation #2869

Open
cwfitzgerald opened this issue Jul 10, 2022 · 14 comments
Labels
api: dx12 Issues with DX12 or DXGI api: metal Issues with Metal api: vulkan Issues with Vulkan area: wsi Issues with swapchain management or windowing type: enhancement New feature or request

Comments

@cwfitzgerald
Member

cwfitzgerald commented Jul 10, 2022

Context

I'm working on frame pacing and we need some help from the api. The difficulty in designing this api is that each WSI has different pieces of information and gives them to us in different ways.

Supersedes #682
Supersedes #2650

Investigation

We have the following major WSIs to think about:

  • IDXGISwapchain (Windows 7+ - D3D)
  • IPresentationManager (Windows 11+ - D3D)
  • CAMetalLayer (Mac - Metal)
  • VK_GOOGLE_display_timing (Vulkan - Android)
  • VK_KHR_present_wait (Vulkan - Nvidia)
  • VK_KHR_incremental_present (Mainly Mesa/Android)
  • VK_KHR_swapchain (All Vulkan)

And we have the following primitives:

  • Get Present Start/End Time
  • Wait for Present Finish
  • Present with Damage
  • Schedule Present Time
  • Primary Monitor Frequency
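
| WSI | Present Time | Wait for Present | Present with Damage | Scheduled Present | Monitor Frequency |
| --- | --- | --- | --- | --- | --- |
| IDXGISwapchain | 🆗 (1a) | 🆗 (2) | ✅ (3) | 🆗 (4) | |
| IPresentationManager | ✅ (1b) | | | | |
| CAMetalLayer | ✅ (1c) | | | | ✅ (5) |
| VK_GOOGLE_display_timing | | | | | |
| VK_KHR_present_wait | | | | | |
| VK_KHR_incremental_present | | | | | |
| VK_KHR_swapchain | | | | | |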

Notes:
1a. Presentation times need to be queried actively, it doesn't get told to us.
1b. Presentation times are given through an event queue.
1c. Presentation times are given through callbacks.
2. Can only wait for 1-3 frames ago, not a particular frame.
3. Windows 8+/Windows 7 Platform Update
4. You can schedule presentation for N vblanks from now.
5. Via NSScreen - need to figure out how to get NSScreen from metal layer.

Because of the diversity of the platforms, I think this will inherently be a leaky abstraction - this is okay - we shouldn't try to hide platform differences, just make it as easy to use as possible.

As such I have put together the following api.

Api Suggestion

Feature

First is to add a single Feature.

const EXTENDED_PRESENTATION_FEATURES = ...;

Presentation Features

Add an extended presentation capabilities bitflag that is queryable from the surface. I am separating this from regular features because they are more useful as default-on. Having the single feature means that users have to consciously enable it, but without needing to individually modulate them.

fn Surface::get_extended_presentation_features(&self, &Adapter) -> ExtendedPresentationFeatures;

bitflags! {
    // Names bikeshedable
    struct ExtendedPresentationFeatures {
        const PRESENT_STATISTICS = 1 << 0;
        const MONITOR_STATISTICS = 1 << 1;
        const WAIT_FOR_PRESENTATION = 1 << 2;
        const PRESENT_DAMAGE_REGION = 1 << 3;
        const PRESENT_DAMAGE_SCROLL = 1 << 4;
        const PRESENT_TIME = 1 << 5;
        const PRESENT_VBLANK_COUNT = 1 << 6;
    }
}
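
To make the intent of the capability flags concrete, here is a minimal sketch of how an application might pick a pacing strategy from the queried flags. The hand-rolled flag type stands in for the `bitflags!` declaration above so the example has no dependencies; `PacingStrategy`, `choose_pacing`, `union` and `empty` are hypothetical names for illustration only, not part of the proposal.

```rust
// Hypothetical stand-in for the proposed `bitflags!` type, written by hand
// so the sketch compiles without external crates.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ExtendedPresentationFeatures(u32);

#[allow(dead_code)]
impl ExtendedPresentationFeatures {
    const PRESENT_STATISTICS: Self = Self(1 << 0);
    const MONITOR_STATISTICS: Self = Self(1 << 1);
    const WAIT_FOR_PRESENTATION: Self = Self(1 << 2);
    const PRESENT_TIME: Self = Self(1 << 5);
    const PRESENT_VBLANK_COUNT: Self = Self(1 << 6);

    fn empty() -> Self {
        Self(0)
    }

    fn union(self, other: Self) -> Self {
        Self(self.0 | other.0)
    }

    fn contains(self, other: Self) -> bool {
        self.0 & other.0 == other.0
    }
}

// Hypothetical application-side decision, not part of the proposed api.
#[derive(Debug, PartialEq, Eq)]
enum PacingStrategy {
    ScheduledTime,
    ScheduledVblank,
    StatisticsOnly,
    Unpaced,
}

// Prefer the most precise scheduling primitive the surface reports.
fn choose_pacing(features: ExtendedPresentationFeatures) -> PacingStrategy {
    if features.contains(ExtendedPresentationFeatures::PRESENT_TIME) {
        PacingStrategy::ScheduledTime
    } else if features.contains(ExtendedPresentationFeatures::PRESENT_VBLANK_COUNT) {
        PacingStrategy::ScheduledVblank
    } else if features.contains(ExtendedPresentationFeatures::PRESENT_STATISTICS) {
        PacingStrategy::StatisticsOnly
    } else {
        PacingStrategy::Unpaced
    }
}
```

The fallback order here is only one reasonable choice; an application could just as well prefer statistics-driven pacing even when scheduling is available.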

Presentation Signature

The presentation signature will be changed to the following.

fn Surface::present(desc: PresentationDescriptor<'a>);

#[derive(Default)] // Normal presentations will be PresentationDescriptor::default()
struct PresentationDescriptor<'a> {
    // Must be zero-length if PRESENT_DAMAGE_REGION is not set
    rects: &'a [Rect],
    // Must be None if PRESENT_DAMAGE_SCROLL is not set
    scroll: Option<PresentationScroll>,
    // Must be NoDelay if neither PRESENT_TIME nor PRESENT_VBLANK_COUNT is set
    presentation_delay: PresentationDelay,
}

struct PresentationScroll {
    source_rect: Rect,
    offset: Vec2,
}

struct Rect {
    offset: Vec2,
    size: Vec2,
}

enum PresentationDelay {
    // Queue the frame immediately.
    NoDelay,
    // Queue the frame for N vblanks from now (must be between 1 and 4). Needs PRESENT_VBLANK_COUNT.
    ScheduleVblank(u8),
    // Queue the frame for presentation at the given time. Needs PRESENT_TIME.
    ScheduleTime(PresentationTimestamp),
}
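
The "must be ..." comments on the descriptor amount to a validation step. A rough sketch of what that check could look like follows, assuming integer `Vec2` components and plain `u32` feature bits; the `validate` function and the `DescriptorError` type are hypothetical and only illustrate the rules stated in the comments.

```rust
// Feature bits mirroring the proposal; plain constants keep the sketch dependency-free.
const PRESENT_DAMAGE_REGION: u32 = 1 << 3;
const PRESENT_DAMAGE_SCROLL: u32 = 1 << 4;
const PRESENT_TIME: u32 = 1 << 5;
const PRESENT_VBLANK_COUNT: u32 = 1 << 6;

// Assumption: integer pixel coordinates. The proposal does not define Vec2.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct Vec2 { x: i32, y: i32 }

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct Rect { offset: Vec2, size: Vec2 }

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct PresentationScroll { source_rect: Rect, offset: Vec2 }

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct PresentationTimestamp(u64);

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, Default, PartialEq)]
enum PresentationDelay {
    #[default]
    NoDelay,
    ScheduleVblank(u8),
    ScheduleTime(PresentationTimestamp),
}

#[derive(Default)]
struct PresentationDescriptor<'a> {
    rects: &'a [Rect],
    scroll: Option<PresentationScroll>,
    presentation_delay: PresentationDelay,
}

// Hypothetical error type for illustration.
#[derive(Debug, PartialEq)]
enum DescriptorError {
    DamageRegionUnsupported,
    ScrollUnsupported,
    VblankSchedulingUnsupported,
    VblankCountOutOfRange(u8),
    TimeSchedulingUnsupported,
}

// Checks each descriptor field against the features reported for the surface.
fn validate(desc: &PresentationDescriptor<'_>, features: u32) -> Result<(), DescriptorError> {
    let has = |bit: u32| features & bit != 0;
    if !desc.rects.is_empty() && !has(PRESENT_DAMAGE_REGION) {
        return Err(DescriptorError::DamageRegionUnsupported);
    }
    if desc.scroll.is_some() && !has(PRESENT_DAMAGE_SCROLL) {
        return Err(DescriptorError::ScrollUnsupported);
    }
    match desc.presentation_delay {
        PresentationDelay::NoDelay => Ok(()),
        PresentationDelay::ScheduleVblank(_) if !has(PRESENT_VBLANK_COUNT) => {
            Err(DescriptorError::VblankSchedulingUnsupported)
        }
        PresentationDelay::ScheduleVblank(n) if !(1..=4).contains(&n) => {
            Err(DescriptorError::VblankCountOutOfRange(n))
        }
        PresentationDelay::ScheduleVblank(_) => Ok(()),
        PresentationDelay::ScheduleTime(_) if !has(PRESENT_TIME) => {
            Err(DescriptorError::TimeSchedulingUnsupported)
        }
        PresentationDelay::ScheduleTime(_) => Ok(()),
    }
}
```

A default descriptor passes with no features enabled, which matches the idea that normal presentations use `PresentationDescriptor::default()`.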

Presentation Timestamp

Because different apis use different timestamps - we need a way of correlating these timestamps with various other clocks. The clocks used are as follows on each WSI:

| WSI | Clock |
| --- | --- |
| IDXGISwapchain | QueryPerformanceCounter |
| IPresentationManager | QueryInterruptTimePrecise |
| CAMetalLayer | mach_absolute_time |
| VK_GOOGLE_display_timing | clock_gettime(CLOCK_MONOTONIC) |

Add the following function to the surface.

fn Surface::correlate_presentation_timestamp<F, T>(&self, &Adapter, F) -> (PresentationTimestamp, T) where F: FnOnce() -> T;

// Unit: nanoseconds
struct PresentationTimestamp(pub u64);

Which will let people write the following code to correlate instants and presentation timestamps. We need this because Instants need to be treated as completely opaque as the clock they use can change at any time. In most cases these are actually the same clock, but this is what we get.

let (present_timestamp, now) = surface.correlate_presentation_timestamp(&adapter, Instant::now);
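
Once a correlated pair is in hand, other presentation timestamps can be mapped onto the `Instant` timeline by applying the nanosecond offset from the pair. This is only a sketch of that arithmetic: `ClockCorrelation` and `to_instant` are hypothetical helper names, and it assumes both clocks tick at the same rate between correlation points.

```rust
use std::time::{Duration, Instant};

// Unit: nanoseconds, as in the proposal.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PresentationTimestamp(pub u64);

// Hypothetical helper holding one correlated (timestamp, instant) pair.
struct ClockCorrelation {
    timestamp: PresentationTimestamp,
    instant: Instant,
}

impl ClockCorrelation {
    // Maps a presentation timestamp to an Instant by offsetting from the
    // correlated pair. Returns None if the result is not representable.
    fn to_instant(&self, ts: PresentationTimestamp) -> Option<Instant> {
        if ts.0 >= self.timestamp.0 {
            let ahead = Duration::from_nanos(ts.0 - self.timestamp.0);
            self.instant.checked_add(ahead)
        } else {
            let behind = Duration::from_nanos(self.timestamp.0 - ts.0);
            self.instant.checked_sub(behind)
        }
    }
}
```

Since the underlying clocks can drift or change, it is probably wise to re-correlate periodically rather than keep one pair forever.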

Presentation Statistics

Because of the difference in how all the apis query stats, we need to abstract this carefully. We use a query-based "presentation statistics queue".

  • CAMetalLayer: Callbacks will save the time into a queue, which is emptied every time it is queried.
  • IPresentationManager: Calling the query function drains the statistics queue.
  • IDXGI: Query calls GetFrameStatistics and returns a single value.
  • VK_GOOGLE_display_timing: Calls vkGetPastPresentationTimingGOOGLE which drains the queue.
fn Surface::query_presentation_statistics(&self, &Device) -> Vec<PresentationStatistics>;

struct PresentationStatistics {
    presentation_start: PresentationTimestamp,
    // Only available on IPresentationManager
    presentation_end: Option<PresentationTimestamp>,
    // Only available on VK_GOOGLE_display_timing
    earliest_present_time: Option<PresentationTimestamp>,
    // Only available on VK_GOOGLE_display_timing
    presentation_margin: Option<PresentationTimestamp>,
    composition_type: CompositionType,
}

enum CompositionType {
    // CAMetalLayer is always Composed
    Composed,
    Independent,
    // Vulkan, DXGI is always unknown
    Unknown,
}
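
As an example of what the drained statistics can feed, here is a small sketch that estimates the refresh interval from consecutive `presentation_start` values. The function name is hypothetical, and taking the median of the deltas is my own choice to stay robust against an occasional missed frame, not something the proposal prescribes.

```rust
// Unit: nanoseconds, as in the proposal.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PresentationTimestamp(pub u64);

// Estimates the refresh interval in nanoseconds from presentation start
// times ordered oldest to newest. Returns None with fewer than two usable samples.
fn estimate_refresh_interval_ns(starts: &[PresentationTimestamp]) -> Option<u64> {
    let mut deltas: Vec<u64> = starts
        .windows(2)
        // Skip pairs that go backwards in time instead of underflowing.
        .filter_map(|pair| pair[1].0.checked_sub(pair[0].0))
        // Ignore duplicate timestamps.
        .filter(|&delta| delta > 0)
        .collect();
    if deltas.is_empty() {
        return None;
    }
    deltas.sort_unstable();
    // Upper median: a single doubled interval from a dropped frame is ignored.
    Some(deltas[deltas.len() / 2])
}
```

On IDXGI, where each query returns a single value, the caller would have to accumulate samples across frames before calling something like this.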

Presentation Wait

First add the following member to SurfaceConfiguration:

// Requires WAIT_FOR_PRESENTATION and must be between 1 and 2.
maximum_latency: Option<u8>

This either sets the swapchain frame count to value + 1, calls SetMaximumFrameLatency with the given value, or uses a wait-for-present in the acquire method to throttle rendering so that it behaves like a swapchain with value + 1 frames.
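
For the swapchain-frame-count variant, the "value + 1" mapping could look roughly like the sketch below. The function name and the `default_frame_count` parameter are hypothetical; the proposal does not say what happens when `maximum_latency` is `None`, so falling back to a backend default is an assumption.

```rust
// Maps the proposed `maximum_latency` setting to a swapchain frame count.
// `default_frame_count` is an assumed backend default used when no limit is requested.
fn swapchain_frame_count(
    maximum_latency: Option<u8>,
    default_frame_count: u32,
) -> Result<u32, String> {
    match maximum_latency {
        None => Ok(default_frame_count),
        // The proposal restricts the value to 1 or 2.
        Some(latency @ 1..=2) => Ok(u32::from(latency) + 1),
        Some(other) => Err(format!("maximum_latency must be 1 or 2, got {other}")),
    }
}
```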

Monitor Information

Getting exact frequencies of monitors is important for pacing - they can be derived from presentation stats, but an explicit api is more precise if it is available.

fn Surface::query_monitor_statistics(&self, &Device) -> MonitorStatistics;

struct MonitorStatistics {
    // In nanoseconds
    min_refresh_interval: u64,
    max_refresh_interval: u64,
    // Only available on CAMetalLayer
    display_update_granularity: u64,
}
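
To show how these numbers might be consumed, here is a sketch that turns a desired frame interval into one the display can actually show. It assumes that equal min and max intervals mean a fixed-refresh display and that differing values mean a variable-refresh range; that interpretation, the `paced_interval_ns` name, and the omission of `display_update_granularity` are all my assumptions.

```rust
// Trimmed copy of the proposed struct; `display_update_granularity` is left out.
struct MonitorStatistics {
    // In nanoseconds
    min_refresh_interval: u64,
    max_refresh_interval: u64,
}

// Picks a presentation interval the display can honor for a desired frame interval.
fn paced_interval_ns(stats: &MonitorStatistics, desired_ns: u64) -> u64 {
    let lo = stats.min_refresh_interval.min(stats.max_refresh_interval);
    let hi = stats.min_refresh_interval.max(stats.max_refresh_interval);
    if lo == hi {
        // Fixed refresh: round up to a whole number of refreshes (at least one).
        let interval = lo.max(1);
        let refreshes = ((desired_ns + interval - 1) / interval).max(1);
        refreshes * interval
    } else {
        // Variable refresh: any interval inside the supported range works.
        desired_ns.clamp(lo, hi)
    }
}
```

For a fixed 60 Hz display (16,666,667 ns), a desired 20 ms frame time rounds up to two refreshes, while on a variable-refresh range the same request would pass through unchanged if it falls inside the range.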

Conclusion

This is obviously one hell of an api change, and this doesn't have to happen all at once, but this investigation should give us the place to discuss the changes and make sure it provides the information needed.

@cwfitzgerald cwfitzgerald added type: enhancement New feature or request api: dx12 Issues with DX12 or DXGI api: metal Issues with Metal api: dx11 api: vulkan Issues with Vulkan area: wsi Issues with swapchain management or windowing labels Jul 10, 2022
@i509VCB
Contributor

i509VCB commented Jul 10, 2022

For EGL, the WSI can do present with damage if the EGL_KHR_swap_buffers_with_damage extension is supported.

@superdump
Contributor

superdump commented Jul 11, 2022

Random thoughts incoming:

  • Does WSI mean windowing system integration? It's unfortunately not the most searchable abbreviation.
  • For PRESENT_STATISTICS I feel that the PRESENT is needed to differentiate from MONITOR_STATISTICS. However, I feel it should be at least PRESENTATION_STATISTICS.
  • MONITOR_STATISTICS reads as if 'monitor' means 'keep track of' as in verb noun. I first thought DISPLAY_STATISTICS but that has the same problem. SCREEN_STATISTICS is maybe a bit less problematic even though grammatically it could go either way. I feel like I want DISPLAY_REFRESH_STATISTICS to be the outcome. :)
  • There are scoll typos in places. Where does the 'scroll' term come from? Scroll-lock comes to mind but that's an old cathode ray tube (CRT) feature. From the structs it looks like it's for requesting which part of the screen would be actually updated with what portion of the provided frame. Is that correct?
  • It's naïvely surprising to me that Vulkan does not generally support presentation times / scheduling presentation when the others do
  • It feels like there must be a way on Windows to get information about display refresh rates and timings...
  • What is the difference between presentation_start and earliest_present_time?
  • What does presentation_margin mean?

Taking a step back and thinking about how one would want to use this - the presentation and display refresh statistics provide information that can be used to make some kind of estimation/prediction for frame pacing, and the presentation descriptor then allows making an attempt at controlling presentation of a given frame. I'd have to think through it more thoroughly to be able to figure out whether it's sufficient and ergonomic.

@cwfitzgerald
Member Author

It feels like there must be a way on Windows to get information about display refresh rates and timings...

@superdump There is, it's a bit complicated, but should be implementable within winit. You can get very specific timing info about all your monitors. Now that winit exposes micro-hertz refresh it should be usable for pacing. We just also need to expose the precision of the hz measurement.

@badicsalex

I'm not sure what the status here is, but I'd love to implement the VK_GOOGLE_display_timing version, if you can give some guidance.

@cwfitzgerald
Member Author

@badicsalex Sorry this totally got lost in the information firehose. None of this (outside of getting cpu-side presentation timestamps) is implemented yet and we'd love help! Come on our matrix and chat, that'd probably be the easiest way to sync up.

@badicsalex

@cwfitzgerald thanks for the answer. We've investigated the issue in detail since then, and it seems that VK_GOOGLE_display_timing wouldn't give us much over simply measuring the acquire times of a simple FIFO mode, so we didn't pursue that angle any further.

@jimblandy
Member

I talked a bit with @DJMcNab at RustNL last week, and he said that partial present functionality was important to the Xilem team.

@marcpabst

I'm still interested in this too, just haven't gotten around to properly looking at this. I have a fork somewhere that implements a crude way of getting presentation information on Apple hardware, but I think the main problem here is finding out how to integrate data from the different APIs into wgpu's data model. Having precise statistics about when presentation happened is super crucial for my use case and now that the rest of my project is somewhat shaping up, I might have another look.

On a side note, I'd also be interested in running a handler as soon as possible after frame presentation, but I'm not sure if that is even possible outside of Apple/Metal.

@MarijnS95
Contributor

Does anyone know if IPresentationManager (from CompositionSwapchain) can be used with D3D12? All examples I've found thus far are for D3D11, and DXGI already seems to match in functionality.

Additionally, can we add DirectComposition to the list? The only thing it can do is query presentation statistics (DCOMPOSITION_FRAME_STATISTICS with most fields mapping to DXGI_FRAME_STATISTICS except for nextEstimatedFrameTime) and wait for the "compositor clock" or commit completion.

Scheduling animations is possible but I haven't found a function to schedule presents from a swapchain yet; perhaps because that can/should be done through the API of choice instead (i.e. directly on IDXGISwapChain).

@cwfitzgerald
Member Author

Does anyone know if IPresentationManager (from CompositionSwapchain) can be used with D3D12?

Not directly. I'm using it with D3D12, and I'm using D3D11 to make the buffers and do a blit into the final image.

Additionally, can we add DirectComposition to the list? The only thing it can do is query presentation statistics (DCOMPOSITION_FRAME_STATISTICS with most fields mapping to DXGI_FRAME_STATISTICS except for nextEstimatedFrameTime) and wait for the "compositor clock" or commit completion.

Sure. We need to support dcomp anyway to get transparent windows on windows. IDXGIOutput also has WaitForVsync, but that's only available if you're on the adapter powering the monitor (or can figure out what monitor you're on)

Scheduling animations is possible but I haven't found a function to schedule presents from a swapchain yet

You can from IPresentationManager https://learn.microsoft.com/en-us/windows/win32/api/presentation/nf-presentation-ipresentationmanager-settargettime and https://learn.microsoft.com/en-us/windows/win32/api/presentation/nf-presentation-ipresentationmanager-setpreferredpresentduration. Seems to work well, and SetPreferredPresentDuration seems to interact in the expected way with VRR.

I have personally given up on trying to figure out when the next present will happen as there are way too many factors, and instead focused on presenting a smooth series of frames and with corresponding present times.

@MarijnS95
Contributor

Not directly. I'm using it with D3D12, and I'm using D3D11 to make the buffers and do a blit into the final image.

Right, have you also tried calling CreatePresentationFactory() with ID3D12Device (and ID3D12Queue for consistency with another API) etc? They all return "no interface" for me, so it's likely unsupported without resorting to D3D11 as you're doing...

Additionally, can we add DirectComposition to the list? The only thing it can do is query presentation statistics (DCOMPOSITION_FRAME_STATISTICS with most fields mapping to DXGI_FRAME_STATISTICS except for nextEstimatedFrameTime) and wait for the "compositor clock" or commit completion.

Sure. We need to support dcomp anyway to get transparent windows on windows. IDXGIOutput also has WaitForVsync, but that's only available if you're on the adapter powering the monitor (or can figure out what monitor you're on)

I'll rely on you to edit that into your own post :)

Scheduling animations is possible but I haven't found a function to schedule presents from a swapchain yet

You can from IPresentationManager https://learn.microsoft.com/en-us/windows/win32/api/presentation/nf-presentation-ipresentationmanager-settargettime and https://learn.microsoft.com/en-us/windows/win32/api/presentation/nf-presentation-ipresentationmanager-setpreferredpresentduration. Seems to work well, and SetPreferredPresentDuration seems to interact in the expected way with VRR.

Sorry for not being clear - I meant by using DirectComposition exclusively (since CompositionSwapchain doesn't interact with D3D12). But in any case, it seems these APIs are intended to be used in conjunction, i.e. I think IPresentationManager::CreatePresentationSurface() is supposed to take a HANDLE from DCompositionCreateSurfaceHandle()?

I have personally given up on trying to figure out when the next present will happen as there are way too many factors, and instead focused on presenting a smooth series of frames and with corresponding present times.

And DCOMPOSITION_FRAME_STATISTICS::nextEstimatedFrameTime cannot help you with that?

@DJMcNab
Contributor

DJMcNab commented Sep 13, 2024

Just to throw another wrench into this, since recentish Android API levels, you are somewhat expected to use ASurfaceTransaction for this kind of thing. That gives capabilities not otherwise exposed, including edit: translations and other compositor integration

However, this API I believe is incompatible with using an underlying Vulkan Swapchain object.
I think it could be compatible with the wgpu Surface API, if wgpu managed its own textures.

Currently, the things stopping experimenting with this externally are:

  1. Being able to enable the Vulkan device features (although this might be possible using the from_hal API)
  2. Getting a fence (or possibly binary semaphore, but the semantics are horribly unclear) from submission.

(Edited to add more content as my control key got stuck and submitted without intention)

@MarijnS95
Contributor

MarijnS95 commented Sep 21, 2024

That gives capabilities not otherwise exposed, including

@DJMcNab this paragraph still looks unfinished (even though the gist of all the features you get are derived by following the link).

But yes, ASurfaceControl (only working for the "root surface" of NativeActivity since Android 15) with ASurfaceTransaction and VK_ANDROID_external_memory_android_hardware_buffer basically turn this into "Bring Your Own Swapchain".

I've asked upstream (as a final drive-by question in https://issuetracker.google.com/issues/320706287) whether we can get access to the buffer creation and (de)queueing mechanisms that are already used internally by Android's Surface/Swapchain implementation, to not have to allocate these ourselves (either on Android or Vulkan, and importing them into the other).

@DJMcNab
Contributor

DJMcNab commented Sep 21, 2024

That would probably be helpful; this whole thing is rather messy.
