update to CPU fraction auto-tuning algorithm #19
base: main
Added a check to exclude operations with failed DSA calls from consideration by the CPU fraction auto-tuning algorithm.
Added documentation on common usage modes for DTO.
Signed-off-by: Sydir, Jerry <jerry.sydir@intel.com>
Pull Request Overview
This PR refines the CPU fraction auto-tuning algorithm by filtering out failed DSA operations from its metrics, bumps the default minimum buffer size to 64 KB, and expands the README with common DTO usage scenarios.
- Exclude failed DSA completions from auto-tuning calculations.
- Change `DTO_DEFAULT_MIN_SIZE` from 16384 to 65536.
- Add documented examples of latency, power, cycle, and cache-pollution usage modes.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| dto.c | Early-return on non-successful DSA completions in `dsa_wait_and_adjust` and update min size. |
| README.md | New section detailing common DTO usage modes and their recommended environment variable settings. |
Comments suppressed due to low confidence (1)
dto.c:435
- Consider adding a unit or integration test to verify that failed DSA completions are properly excluded from the auto-tuning metrics.
if(*comp != DSA_COMP_SUCCESS) {
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
// operations that have failed (mostly due to page fault) return very quickly and cause the algorithm
// to think that the DSA operation was faster than it really was. We exclude them from the calculation.
if(*comp != DSA_COMP_SUCCESS) {
    return;
minor: just fix the space
done
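To illustrate why the early return matters, here is a minimal sketch of a running-average tuner that skips failed completions. It is not the code in dto.c: the `sketch_tuner` struct, the `SKETCH_COMP_*` values, and both function names are hypothetical and exist only for this example. The point it shows is that a fast-failing operation (for instance one that stops on a page fault) would otherwise pull the average wait time down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status values for this sketch; only the name
 * DSA_COMP_SUCCESS appears in the PR, not these numbers. */
enum { SKETCH_COMP_SUCCESS = 1, SKETCH_COMP_PAGE_FAULT = 3 };

/* Hypothetical tuning state: running totals of observed DSA wait time. */
struct sketch_tuner {
    uint64_t total_wait_ns;
    uint64_t samples;
};

/* Record one completion. Failed completions are ignored so that their
 * short wait times do not bias the average. Returns 1 if counted. */
static int sketch_tuner_record(struct sketch_tuner *t, uint8_t status,
                               uint64_t wait_ns)
{
    assert(t != NULL);
    if (status != SKETCH_COMP_SUCCESS) {
        return 0; /* excluded from the calculation */
    }
    t->total_wait_ns += wait_ns;
    t->samples++;
    return 1;
}

/* Average wait over the counted samples, or 0 when nothing was counted. */
static uint64_t sketch_tuner_avg_wait_ns(const struct sketch_tuner *t)
{
    assert(t != NULL);
    return t->samples ? t->total_wait_ns / t->samples : 0;
}
```

With two successful waits of 1000 ns and 3000 ns and one failed completion that returned after 10 ns, the sketch reports an average of 2000 ns. Counting the failure would have given 1336 ns, which is the kind of underestimate the comment in the diff describes.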
@@ -44,7 +44,7 @@
  */
 #define MAX_WQS 32
 #define MAX_NUMA_NODES 32
-#define DTO_DEFAULT_MIN_SIZE 16384
+#define DTO_DEFAULT_MIN_SIZE 65536
I set it to 32K. Should we pick 48K as the middle? If not, I am fine with 64K.
Let's set it to 64K. There are cases where even 64K is marginal.
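As a rough illustration of what the threshold discussed in this thread controls, the sketch below routes a request to the CPU or to DSA based on its length. It assumes the minimum size acts as a simple lower bound for offload and that a request exactly at the threshold is offloaded; neither detail is confirmed by this PR. `SKETCH_DEFAULT_MIN_SIZE`, `enum sketch_path`, and `sketch_choose_path` are hypothetical names, not DTO identifiers.

```c
#include <stddef.h>

/* Mirrors the new DTO_DEFAULT_MIN_SIZE value from this PR (64K). */
#define SKETCH_DEFAULT_MIN_SIZE ((size_t)65536)

/* Hypothetical execution paths for this sketch. */
enum sketch_path { SKETCH_PATH_CPU, SKETCH_PATH_DSA };

/* Assumption: requests shorter than min_size stay on the CPU, and
 * requests of at least min_size are candidates for DSA offload. */
static enum sketch_path sketch_choose_path(size_t len, size_t min_size)
{
    return len >= min_size ? SKETCH_PATH_DSA : SKETCH_PATH_CPU;
}
```

Under these assumptions a 16 KB or 32 KB copy, which the smaller thresholds mentioned in the thread would have offloaded, now stays on the CPU, while a 64 KB copy is still eligible for DSA.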
Signed-off-by: Sydir, Jerry <jerry.sydir@intel.com>
Added a check to exclude operations with failed DSA calls from consideration by the CPU fraction auto-tuning algorithm.
Added documentation on common usage modes for DTO.
Modified the default minimum size to 64K.