Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate failure of ElementwiseReluModule and ResNet18StaticModule TOSA tests #1231

Closed
ashay opened this issue Aug 16, 2022 · 10 comments
Closed
Assignees

Comments

@ashay
Copy link
Collaborator

ashay commented Aug 16, 2022

Both ElementwiseReluModule and ResNet18StaticModule TOSA tests fail after updating LLVM. See CI log for details.

    FAIL - "ElementwiseReluModule_basic"
        @ trace item #0 - call to "forward"
        @ output of call to "forward"
        ERROR: value (Tensor with shape=[4, 2], dtype=torch.float32, min=+3.403e+38, max=+3.403e+38, mean=+3.403e+38) is not close to golden value (Tensor with shape=[4, 2], dtype=torch.float32, min=+0.0, max=+0.3964, mean=+0.09984)

    FAIL - "ResNet18StaticModule_basic"
        @ trace item #0 - call to "forward"
        @ output of call to "forward"
        ERROR: value (Tensor with shape=[1, 1000], dtype=torch.float32, min=+nan, max=+nan, mean=+nan) is not close to golden value (Tensor with shape=[1, 1000], dtype=torch.float32, min=-1.815, max=+2.328, mean=+0.03187)
@ashay ashay self-assigned this Aug 16, 2022
ashay added a commit that referenced this issue Aug 16, 2022
Summary of changes:
 - Tensor dialect now sets `emitAccessorPrefix` to prefixed, thus
   requring updates to methods that retrieve arguments
   [https://reviews.llvm.org/D131361]
 - Update MHLO to build with LLVM commit hash 2dde4ba
 - Replace `AbsOp` with `AbsFOp` [https://reviews.llvm.org/D131325]
 - Replace deprecated `getValue()` with `value()`
   [https://reviews.llvm.org/D131349]
 - Remove `AnalysisState::defaultInitialize()`
   [https://reviews.llvm.org/D131746]
 - Update MHLO MLIR tests to use the updated assembly format
 - Disabled two failing TOSA tests (Github Issue link:
   #1231)
@sjarus
Copy link
Collaborator

sjarus commented Aug 16, 2022

Without looking at anything in detail, this sounds like something might have changed in TosaToLinAlg/LinAlgNamed/Arith that generates functionally wrong output, since the TOSA content is unchanged here.

@silvasean
Copy link
Contributor

@sjarus do you have someone on your side that can investigate this?

@sjarus
Copy link
Collaborator

sjarus commented Aug 16, 2022

Adding @eric-k256 and @u99127, as well as @jpienaar who has been reviewing recent TosaToXXX PRs . I think this might just be a git bisect away from catching which TosaToXX change broke the functional correctness of these models.

I'm traveling over the next 24hrs so I don't have the ability to look right now unfortunately.

@jpienaar
Copy link
Member

So this failed after the last LLVM update? Could you add the before and after rev here? (ReluN was removed 4 days ago upstream, and llvm/llvm-project@444b4fd seems likely candidate fix given replacement)

@eric-k256
Copy link
Collaborator

I can confirm that llvm/llvm-project@444b4fd fixes the problem.
I'm not sure what you do in torch-mlir. Does this wait until the next LLVM update, or should we try to move LLVM forward again?

@silvasean
Copy link
Contributor

This seems reasonable to wait until the next update -- the XFAILs will pass and we can update them and close this bug.

@eric-k256
Copy link
Collaborator

Sounds good to me.

@ashay
Copy link
Collaborator Author

ashay commented Aug 17, 2022

Thanks all for diagnosing this so quickly! Looking forward to next Monday when I can close this issue without doing any work ;)

@sjarus
Copy link
Collaborator

sjarus commented Aug 17, 2022

Thank you for your assistance, @jpienaar and @eric-k256 !

@ashay
Copy link
Collaborator Author

ashay commented Aug 22, 2022

Resolved in LLVM tag update PR #1263.

@ashay ashay closed this as completed Aug 22, 2022
ashay added a commit that referenced this issue Aug 30, 2022
 - Update MHLO commit to build with LLVM commit hash 00d648b
 - Update TorchToMhlo code to work with Stablehlo
 - Re-enabled two failing TOSA tests, thus resolving Github Issue #1231
AmosLewis pushed a commit to AmosLewis/torch-mlir that referenced this issue Sep 2, 2022
 - Update MHLO commit to build with LLVM commit hash 00d648b
 - Update TorchToMhlo code to work with Stablehlo
 - Re-enabled two failing TOSA tests, thus resolving Github Issue llvm#1231
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants