-
Notifications
You must be signed in to change notification settings - Fork 15
Fix atomic_add validation by correcting expected values and enhancing test coverage #177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Co-authored-by: mawad-amd <112003944+mawad-amd@users.noreply.github.com>
…dd test return capability Co-authored-by: mawad-amd <112003944+mawad-amd@users.noreply.github.com>
Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details. Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
Co-authored-by: mawad-amd <112003944+mawad-amd@users.noreply.github.com>
Co-authored-by: mawad-amd <112003944+mawad-amd@users.noreply.github.com>
Signed-off-by: Muhammad Awad <MuhammadAbdelghaffar.Awad@amd.com>
Signed-off-by: Muhammad Awad <MuhammadAbdelghaffar.Awad@amd.com>
@copilot, I fixed the bugs in the test. Update the PR description now and don’t make any code changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes validation failures in the atomic_add benchmark by correcting expected test values and improving test coverage. The changes address issues with tensor value mismatches, JIT function inspection errors, and missing buffer reset logic.
Key changes:
- Simplified test expectations to focus on functional correctness rather than complex accumulated values
- Added proper dtype mapping utilities for torch data types
- Updated validation logic to check source buffer instead of removed result buffer
- Enhanced test coverage with separate bandwidth and correctness test functions
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
tests/examples/test_atomic_add_bench.py | New comprehensive test suite with separate bandwidth and correctness validation |
examples/common/utils.py | Added dtype mapping utilities for converting between torch dtypes and string representations |
examples/04_atomic_add/atomic_add_bench.py | Fixed validation logic, removed result_buffer parameter, and updated expected values |
Updated the PR description to reflect the current working state after the bug fixes. |
Problem
The
04_atomic_add
validation was failing due to incorrect validation logic:Wrong validation target: The validation was checking
result_buffer
which was never written to by the kernel, sinceatomic_add
returns the previous value but the kernel didn't store it anywhere.Incorrect expected values: The validation expected wrong values that didn't match the actual atomic operation behavior.
Root Cause
The atomic add operation modifies the target buffer by adding the specified value and returns the previous value at each memory location. The original implementation had:
result_buffer
instead of modifiedsource_buffer
Solution
1. Fixed Validation Target and Logic
Updated validation to check
source_buffer
(which gets modified by atomic_add) with correct expected values:2. Enhanced Test Coverage
Separated performance and correctness testing with comprehensive test suite:
3. Improved Error Handling
Added proper validation error reporting and early exit on failure:
4. Clean Return Interface
Updated to return both bandwidth and buffer state for testing:
Testing
test_atomic_bandwidth()
verifies positive bandwidth measurements across different data types and configurationstest_atomic_correctness()
validates that atomic operations produce expected uniform valuestorch_dtype_to_str()
from common utilities for consistent datatype handlingExpected Behavior
When validation is enabled (
--validate
flag):source_buffer
against expected uniform valuesThe fix correctly addresses the validation logic while maintaining the original benchmark behavior and providing comprehensive test coverage for both performance and correctness.
Original prompt
💬 Share your feedback on Copilot coding agent for the chance to win a $200 gift card! Click here to start the survey.