warning msg/documentation on the tf32 related system flags and usage #6754

@wyli

(Follow-up of #6525.) My larger concern is that other operations in MONAI will also be affected by the TF32 issue, since every operation that uses cuda.matmul is affected. This may lead to significant reproducibility issues.

My proposal is to add something like
https://github.com/Lightning-AI/lightning/pull/16037/files#diff-909e246d6c36514f952ae5023bd9fbcc3e8f2c6a0837ebf81d7dc96790b5f938R190-R210
to the related classes/functions in MONAI. MONAI would then print a warning when the flag is True. I am not sure when it is best to print the warning, maybe during import? The warning could perhaps be suppressed when the flag has been set explicitly by the user, but that seems technically challenging.

I also propose adding a section to the documentation that explains how to use TF32 properly.
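For discussion, the warning proposed above could be sketched roughly as follows. This is only an illustration, not MONAI's or Lightning's implementation: the function names `tf32_messages` and `warn_if_tf32_enabled` are hypothetical, and the message wording is made up. It assumes the relevant switches are PyTorch's `torch.backends.cuda.matmul.allow_tf32` and `torch.backends.cudnn.allow_tf32`, plus NVIDIA's `NVIDIA_TF32_OVERRIDE` environment variable, and it does not try to detect whether the user set a flag explicitly. TF32 only takes effect on Ampere or newer GPUs, which this sketch also does not check.

```python
import os
import warnings


def tf32_messages(matmul_allowed, cudnn_allowed, nvidia_override=None):
    """Return warning messages for the given TF32 settings (hypothetical helper).

    Kept free of any torch import so the decision logic can be tested on its own.
    """
    if nvidia_override == "0":
        # NVIDIA_TF32_OVERRIDE=0 disables TF32 in NVIDIA libraries
        # regardless of what the framework flags say.
        return []
    messages = []
    if matmul_allowed:
        messages.append(
            "torch.backends.cuda.matmul.allow_tf32 is True; "
            "float32 matmuls may run in TF32 and lose precision."
        )
    if cudnn_allowed:
        messages.append(
            "torch.backends.cudnn.allow_tf32 is True; "
            "float32 cuDNN convolutions may run in TF32 and lose precision."
        )
    return messages


def warn_if_tf32_enabled():
    """Read the current PyTorch flags and emit one warning per enabled flag."""
    try:
        import torch  # imported lazily so the sketch still loads without torch
    except ImportError:
        return []
    messages = tf32_messages(
        torch.backends.cuda.matmul.allow_tf32,
        torch.backends.cudnn.allow_tf32,
        os.environ.get("NVIDIA_TF32_OVERRIDE"),
    )
    for message in messages:
        warnings.warn(message, stacklevel=2)
    return messages
```

For the documentation part, the simplest advice is probably that users who need full float32 precision set both `torch.backends.cuda.matmul.allow_tf32 = False` and `torch.backends.cudnn.allow_tf32 = False` before running a model, or export `NVIDIA_TF32_OVERRIDE=0` before starting the process.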

Originally posted by @qingpeng9802 in #6525 (comment)
