PresentMon is a tool to capture and analyze ETW events related to swap chain presentation on Windows. It can be used to trace key performance metrics for graphics applications (e.g., CPU and Display frame durations and latencies) and works across different graphics APIs, different hardware configurations, and for both desktop and UWP applications.
While PresentMon itself is focused on lightweight collection and analysis, there are several other programs that build on its functionality and/or helps visualize the resulting data. For example, see
- CapFrameX
- FrameView
- OCAT
- PIX (used as part of its system monitor UI)
Copyright (C) 2017-2022 Intel Corporation
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Binaries for main release versions of PresentMon are provided on GitHub:
See CONTRIBUTING for information on how to request features, report issues, or contribute code changes.
Capture Target Options | |
---|---|
-captureall |
Record all processes (default). |
-process_name name |
Record only processes with the provided exe name. This argument can be repeated to capture multiple processes. |
-exclude name |
Don't record processes with the provided exe name. This argument can be repeated to exclude multiple processes. |
-process_id id |
Record only the process specified by ID. |
-etl_file path |
Consume events from an ETW log file instead of running processes. |
Output Options | |
---|---|
-output_file path |
Write CSV output to the provided path. |
-output_stdout |
Write CSV output to STDOUT. |
-multi_csv |
Create a separate CSV file for each captured process. |
-no_csv |
Do not create any output file. |
-no_top |
Don't display active swap chains in the console |
-qpc_time |
Output present time as a performance counter value. |
-qpc_time_s |
Output present time as a performance counter value converted to seconds. |
Recording Options | |
---|---|
-hotkey key |
Use provided key to start and stop recording, writing to a unique CSV file each time. 'key' is of the form MODIFIER+KEY, e.g., "alt+shift+f11". |
-delay seconds |
Wait for provided time before starting to record. If using -hotkey, the delay occurs each time recording is started. |
-timed seconds |
Stop recording after the provided amount of time. |
-exclude_dropped |
Exclude dropped presents from the csv output. |
-scroll_indicator |
Enable scroll lock while recording. |
-no_track_display |
Disable tracking through GPU and display. |
-track_debug |
Adds additional data to output not relevant to normal usage. |
Execution Options | |
---|---|
-session_name name |
Use the provided name to start a new realtime ETW session, instead of the default "PresentMon". This can be used to start multiple realtime captures at the same time (using distinct, case-insensitive names). A realtime PresentMon capture cannot start if there are any existing sessions with the same name. |
-stop_existing_session |
If a trace session with the same name is already running, stop the existing session (to allow this one to proceed). |
-terminate_existing |
Terminate any existing PresentMon realtime trace sessions, then exit. Use with -session_name to target particular sessions. |
-restart_as_admin |
If not running with elevated privilege, restart and request to be run as administrator. (See discussion above). |
-terminate_on_proc_exit |
Terminate PresentMon when all the target processes have exited. |
-terminate_after_timed |
When using -timed , terminate PresentMon after the timed capture completes. |
Beta Options | |
---|---|
-date_time |
Output present time as a date and time with nanosecond precision. |
-track_gpu |
Tracks the duration of each process' GPU work performed between presents. Not supported on Win7. |
-track_gpu_video |
Track the video encode/decode portion of GPU work separately from other engines. Not supported on Win7. |
-track_input |
Tracks the time of keyboard/mouse clicks that were used by each frame. |
-track_mixed_reality |
Capture Windows Mixed Reality data to a CSV file with "_WMR" suffix. |
By default, PresentMon creates a CSV file named "PresentMon-<Time>.csv", where "<Time>" is the creation time in ISO 8601 format. To specify your own output location, use the -output_file PATH
command line argument.
If -multi_csv
is used, then one CSV is created for each process captured and "-<ProcessName>-<ProcessId>" is appended to the file name.
If -hotkey
is used, then one CSV is created for each time recording is started and "-<Index>" is appended to the file name.
Column Header | Data Description |
---|---|
Application | The name of the process that called Present(). |
ProcessID | The process ID of the process that called Present(). |
SwapChainAddress | The address of the swap chain that was presented into. |
Runtime | The runtime used to present (e.g., D3D9 or DXGI). |
SyncInterval | The sync interval provided by the application in the Present() call. This value may be modified later by the driver, e.g., based on control panel overrides. |
PresentFlags | Flags used in the Present() call. |
PresentMode | The presentation mode used by the system for this Present(). See the table below for more details. This column is not available when -no_track_display is used. |
AllowsTearing | Whether tearing is possible (1) or not (0). This column is not available when -no_track_display is used. |
TimeInSeconds | The time of the Present() call, in seconds, relative to when the PresentMon started recording. |
QPCTime | The time of the Present() call, as a performance counter value. This column is only available when -qpc_time or -qpc_time_s are used. When -qpc_time_s is used, the value is converted to seconds by dividing by the counter frequency. |
msInPresentAPI | The time spent inside the Present() call, in milliseconds. |
msUntilRenderComplete | The time between the Present() call and when GPU work for this frame completed, in milliseconds. This column is not available when -no_track_display is used. |
msUntilDisplayed | The time between the Present() call and when the frame was displayed, in milliseconds. This column is not available when -no_track_display is used. |
Dropped | Whether the frame was dropped (1) or displayed (0). Note, if dropped, msUntilDisplayed will be 0. |
msBetweenPresents | The time between this Present() call and the previous one, in milliseconds. |
msBetweenDisplayChange | How long the previous frame was displayed before this Present() was displayed, in milliseconds. This column is not available when -no_track_display is used. |
WasBatched | Whether the frame was submitted by the driver on a different thread than the app (1) or not (0). This column is only available when -track_debug is used. |
DwmNotified | Whether the desktop compositor was notified about the frame (1) or not (0). This column is only available when -track_debug is used. |
Using -track_gpu
or -track_gpu_video
will add the following columns:
Column Header | Data Description |
---|---|
msUntilRenderStart | The time between the Present() call and when GPU work for this frame started, in milliseconds. Note that rendering for a frame can start before the Present() call, so this value can be negative. |
msGPUActive, msGPUVideoActive | The total duration the GPU was working on this frame, in milliseconds. Time is counted whenever at least one engine is executing work from the target process. When -track_gpu_video is used, then the msGPUVideoActive column is added showing the duration of work on the GPU's video encode and/or decode engines and, in this case, the video encode/decode work is not included in msGPUActive. |
Using -track_input
will add the following columns:
Column Header | Data Description |
---|---|
msSinceInput | The time between the Present() call and the earliest keyboard or mouse interaction that contributed to this frame. For frames where msSinceInput is non-zero, msSinceInput + msUntilDisplayed can be used as a measure of the latency between user input and the display of the resulting rendered frame. Note, however, that this is just the software-visible subset of the full input-to-photon latency and doesn't include:• time spent processing input in the keyboard/controller hardware or drivers (typically a fixed additional overhead), • time spent processing the output in the display hardware or drivers (typically a fixed additional overhead), and • a combination of display blanking interval and scan time (which varies, depending on timing and tearing). |
The following values are used in the PresentMode column:
PresentMode | Description |
---|---|
Hardware: Legacy Flip | Indicates the app took ownership of the screen, and is swapping the displayed surface every frame. |
Hardware: Legacy Copy to front buffer | Indicates the app took ownership of the screen, and is copying new contents to an already-on-screen surface every frame. |
Hardware: Independent Flip | Indicates the app does not have ownership of the screen, but is still swapping the displayed surface every frame. |
Composed: Flip | Indicates the app is windowed, is using "flip model" swapchains, and is sharing its surfaces with DWM to be composed. |
Hardware Composed: Independent Flip | Indicates the app is using "flip model" swapchains, and has been granted a hardware overlay plane. |
Composed: Copy with GPU GDI | Indicates the app is windowed, and is copying contents into a surface that's shared with GDI. |
Composed: Copy with CPU GDI | Indicates the app is windowed, and is copying contents into a dedicated DirectX window surface. GDI contents are stored separately, and are composed together with DX contents by the DWM. |
For more information on the performance implications of these, see:
- https://www.youtube.com/watch?v=E3wTajGZOsA
- https://software.intel.com/content/www/us/en/develop/articles/sample-application-for-direct3d-12-flip-model-swap-chains.html
Note: Windows Mixed Reality support is in beta, with limited OS support and maintenance.
If -track_mixed_reality
is used, a second CSV file will be generated with "_WMR" appended to the filename with the following columns:
Column Header | Data Description |
---|---|
Application | Process name (if known) |
ProcessID | Process ID |
DwmProcessID | Compositor Process ID |
TimeInSeconds | Time since PresentMon recording started |
msBetweenLsrs | Time between this Lsr CPU start and the previous one |
AppMissed | Whether Lsr is reprojecting a new (0) or old (1) App frame (App GPU work must complete before Lsr CPU start) |
LsrMissed | Whether Lsr displayed a new frame (0) or not (1+) at the intended V-Sync (Count V-Syncs with no display change) |
msAppPoseLatency | Time between App's pose sample and the intended mid-photon frame display |
msLsrPoseLatency | Time between Lsr's pose sample and the intended mid-photon frame display |
msActualLsrPoseLatency | Time between Lsr's pose sample and mid-photon frame display |
msTimeUntilVsync | Time between Lsr CPU start and the intended V-Sync |
msLsrThreadWakeupToGpuEnd | Time between Lsr CPU start and GPU work completion |
msLsrThreadWakeupError | Time between intended Lsr CPU start and Lsr CPU start |
msLsrPreemption | Time spent preempting the GPU with Lsr GPU work |
msLsrExecution | Time spent executing the Lsr GPU work |
msCopyPreemption | Time spent preempting the GPU with Lsr GPU cross-adapter copy work (if required) |
msCopyExecution | Time spent executing the Lsr GPU cross-adapter copy work (if required) |
msGpuEndToVsync | Time between Lsr GPU work completion and V-Sync |
msBetweenAppPresents | Time between App's present and the previous one. |
msAppPresentToLsr | Time between App's present and Lsr CPU start. This column is not available when -no_track_display is used. |
HolographicFrameID | App's Holographic Frame ID. This column is only available when -track_debug is used. |
msSourceReleaseFromRenderingToLsrAcquire | Time between composition end and Lsr acquire. This column is only available when -track_debug is used. |
msAppCpuRenderFrame | Time between App's CreateNextFrame() API call and PresentWithCurrentPrediction() API call. This column is only available when -track_debug is used. |
msAppMisprediction | Time between App's intended pose time and the intended mid-photon frame display. This column is only available when -track_debug is used. |
msLsrCpuRenderFrame | Time between Lsr CPU render start and GPU work submit. This column is only available when -track_debug is used. |
msLsrThreadWakeupToCpuRenderFrameStart | Time between Lsr CPU start and CPU render start. This column is only available when -track_debug is used. |
msCpuRenderFrameStartToHeadPoseCallbackStart | Time between Lsr CPU render start and pose sample. This column is only available when -track_debug is used. |
msGetHeadPose | Time between Lsr pose sample start and pose sample end. This column is only available when -track_debug is used. |
msHeadPoseCallbackStopToInputLatch | Time between Lsr pose sample end and input latch. This column is only available when -track_debug is used. |
msInputLatchToGpuSubmission | Time between Lsr input latch and GPU work submit. This column is only available when -track_debug is used. |
See GitHub Issues for a current list of reported issues.
PresentMon needs to be run by a user who is a member of the "Performance Log Users" user group, or to be run with administrator privilege. If neither of these are true, you will get an error "failed to start trace session (access denied)".
To add a user to the "Performance Log Users" user group:
- Run
compmgmt.msc
as administrator. - In the "Computer Management" window, expand "System Tools", expand "Local Users and Groups", and then click "Groups".
- Double-click "Performance Log Users", and then click "Add".
- In the "Enter the object names to select" text box, type the name of the user account or group account that you want to add, and then click "OK".
- Sign out and log back in for the changes to take effect.
If PresentMon is not run with administrator privilege, it will not have complete process information for processes running on different user accounts. Such processes will be listed in the console and CSV as "<error>", and they cannot be targeted by name (-process_name
).
Applications that do not use D3D9 or DXGI APIs for presenting frames (e.g., as is typical with OpenGL or Vulkan applications) will report the following:
- Runtime = Other
- SwapChainAddress = 0
- msInPresentAPI = 0
In this case, TimeInSeconds will represent the first time the present is observed in the kernel, as opposed to the runtime, and therefore will be sometime after the application presented the frame (typically ~0.5ms). Since msUntilRenderComplete and msUntilDisplayed are deltas from TimeInSeconds, they will be correspondingly smaller then they would have been if measured from application present. msBetweenDisplayChange will still be correct, and msBetweenPresents should be correct on average.
When using -track_gpu
on a system that uses Hardware-Accelerated GPU
Scheduling (HWS), the GPU execution metrics are less accurate than when HWS is
disabled resulting in msUntilRenderStart, msUntilRenderComplete,
msGPUActive, and msGPUVideoActive measurements that are later/larger than
they should be. For example, in a GPU-bound scenario the frame's msGPUActive
may be reported ~0.5ms larger than the true GPU work duration, though the
specific amount of the inaccuracy will be workload- and GPU-dependent.
An improved solution is WIP.
Some users have observed system stability issues when forcibly shutting down PresentMon on Windows 7. If you are having similar issues, they can be avoided by using Ctrl+C in the PresentMon window to shut it down.