@simonpintarelli
Member

Cherry picked from #154

dbsanfte and others added 2 commits October 20, 2025 11:13
The validation used an absolute error tolerance (1e-8), which is too strict
for large matrix multiplication results (magnitude ~1e4). This caused
spurious failures: COSMA computed correct results, but validation rejected them.
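
To illustrate why an absolute tolerance of 1e-8 cannot hold at this magnitude, here is a minimal sketch (not taken from the COSMA sources) that computes the gap between adjacent float32 values near 1e4. The helper names `to_float32` and `float32_spacing` are hypothetical and exist only for this illustration.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def float32_spacing(x: float) -> float:
    """Gap between float32(x) and the next larger float32.

    Assumes x is positive and finite; incrementing the raw bit pattern
    then yields the next representable float32.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    next_val = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return next_val - to_float32(x)


spacing = float32_spacing(1.0e4)
print(spacing)         # 0.0009765625 (2**-10)
print(spacing > 1e-8)  # True
```

Since neighbouring float32 values near 1e4 are roughly 1e-3 apart, any rounding difference between two correct results at this magnitude is orders of magnitude larger than 1e-8, so an absolute check rejects them.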

Changes:
- Switch from absolute error to relative error for validation
- Use 1e-5 tolerance for float32 (appropriate for single precision)
- Use 1e-8 tolerance for float64 (appropriate for double precision)
- Handle small values near zero with absolute error fallback
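
The changes listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: the function name `within_tolerance`, the `NEAR_ZERO` threshold, and reusing the relative tolerance as the absolute bound in the fallback are all hypothetical choices. Only the two tolerance values (1e-5 and 1e-8) come from the commit message.

```python
# Tolerances from the commit message; the dictionary itself is illustrative.
REL_TOL = {"float32": 1e-5, "float64": 1e-8}

# Hypothetical cutoff below which a reference value is treated as "near zero".
NEAR_ZERO = 1e-10


def within_tolerance(expected: float, actual: float, dtype: str) -> bool:
    """Return True if `actual` matches `expected` for the given precision.

    Uses relative error, falling back to absolute error when the reference
    value is too close to zero for a relative error to be meaningful.
    """
    tol = REL_TOL[dtype]
    err = abs(actual - expected)
    if abs(expected) < NEAR_ZERO:
        # Absolute-error fallback (assumed to reuse the same tolerance).
        return err <= tol
    return err / abs(expected) <= tol


# A result of magnitude ~1e4 that differs by about one float32 rounding step:
expected, actual = 1.0e4, 1.0e4 + 1e-3
print(abs(actual - expected) <= 1e-8)                 # False: old absolute check
print(within_tolerance(expected, actual, "float32"))  # True: relative check
```

With a check of this shape, the deviation is judged against the size of the value being compared, so the same tolerance works for entries of magnitude 1 and of magnitude 1e4.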

This fixes issue #153, where the K-split strategy was incorrectly reported
as producing 93.6% errors when actual relative errors were < 1e-6.

Tested with:
- 32x896x896 float32: now passes (was 93.8% false errors)
- 32x10000x896 float32: now passes (was 93.6% false errors)
- 32x32x32 float64: still passes (regression test)
@simonpintarelli
Member Author

cscs-ci run gh200
