Adding GeLU operator (used in Gpt2) #397
Conversation
- Added the operations as found here:
  - https://github.com/pytorch/pytorch/blob/acdd462b1a070790799ce4623ce8ecc83e197e81/torch/_decomp/decompositions.py#L217
  - https://github.com/pytorch/pytorch/blob/acdd462b1a070790799ce4623ce8ecc83e197e81/caffe2/operators/gelu_op.cu#L84
  - https://pytorch.org/docs/stable/generated/torch.nn.GELU.html
- Added both CUDA and CPU versions.
- Added a `simplify` helper method, because the results are slightly different between CUDA and CPU (this is relatively common IMO).
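For reference, the tanh-based GeLU approximation used in the linked PyTorch decompositions can be sketched as a standalone function. This is an illustrative sketch only, not the PR's actual code; the function name and scalar `f64` signature are assumptions for the example.

```rust
/// Tanh approximation of GeLU:
/// 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
fn gelu(x: f64) -> f64 {
    let k = (2.0 / std::f64::consts::PI).sqrt();
    0.5 * x * (1.0 + (k * (x + 0.044715 * x.powi(3))).tanh())
}

fn main() {
    // Same sample inputs as the test in the PR.
    for x in [-2.0, -1.0, 0.0, 1.0, 2.0] {
        println!("gelu({x}) = {}", gelu(x));
    }
}
```

Note that GeLU is smooth everywhere, unlike ReLU, so both the forward and backward passes are well-defined at zero.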
Looks great! Small nits and then good to go. We can use this PR as a model for how to contribute new unary ops 🔥
src/tensor_ops/gelu/mod.rs
let x = dev.tensor([-2.0, -1.0, 0.0, 1.0, 2.0]);
let r = x.trace().gelu();
assert_eq!(
    simplify(&r.array()),
Yeah, I've definitely seen slightly different results between CPU/CUDA. Can you use `crate::tests::assert_close()`, which some of the other tests use? It checks equality within a certain tolerance.
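A tolerance-based comparison along these lines can be sketched as below. This is a hypothetical sketch for illustration; the crate's actual `assert_close` helper may have a different signature and tolerance handling.

```rust
/// Assert that two slices are element-wise equal within `tolerance`.
/// Illustrative only; not the crate's real implementation.
fn assert_close(lhs: &[f64], rhs: &[f64], tolerance: f64) {
    assert_eq!(lhs.len(), rhs.len(), "length mismatch");
    for (a, b) in lhs.iter().zip(rhs.iter()) {
        assert!(
            (a - b).abs() <= tolerance,
            "{a} and {b} differ by more than {tolerance}"
        );
    }
}

fn main() {
    // CPU/CUDA rounding differences smaller than the tolerance pass.
    assert_close(&[0.8413], &[0.8412], 1e-3);
}
```

The advantage over `assert_eq!` is that last-bit floating-point differences between backends no longer fail the test.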
Ahh yeah, looks like there's a small failure in GitHub Actions. I think moving to `assert_close` should resolve that (probably don't need `simplify`).
The last failing test is the example, I think, where I cannot use it.
Co-authored-by: Corey Lowman <coreylowman@users.noreply.github.com>
Why was the last test cancelled?
No idea, but the rest of them passed, so looks good to me!
Ah yeah, for doctests you don't need to call any assertions about data; they are more about just showing usage. You can just remove those lines.
Looks great from my end!
Co-authored-by: Corey Lowman <coreylowman@users.noreply.github.com>
I'll let you merge in then ;) cheers!
Nice contribution!