Adding GeLU operator (used in Gpt2) #397
Conversation
- Added the operations as found here:
  - https://github.com/pytorch/pytorch/blob/acdd462b1a070790799ce4623ce8ecc83e197e81/torch/_decomp/decompositions.py#L217
  - https://github.com/pytorch/pytorch/blob/acdd462b1a070790799ce4623ce8ecc83e197e81/caffe2/operators/gelu_op.cu#L84
  - https://pytorch.org/docs/stable/generated/torch.nn.GELU.html
- Added both CUDA and CPU versions.
- Added a `simplify` helper method, because the results are slightly different between CUDA and CPU (this is relatively common IMO).
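For reference, the tanh-based GeLU approximation used in the linked PyTorch decompositions can be sketched as a standalone function. This is an illustrative sketch only, not the PR's actual code; the function name and scalar `f64` signature are assumptions for the example.

```rust
/// Tanh approximation of GeLU:
/// 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
fn gelu(x: f64) -> f64 {
    let k = (2.0 / std::f64::consts::PI).sqrt();
    0.5 * x * (1.0 + (k * (x + 0.044715 * x.powi(3))).tanh())
}

fn main() {
    // Same sample inputs as the test in the PR.
    for x in [-2.0, -1.0, 0.0, 1.0, 2.0] {
        println!("gelu({x}) = {}", gelu(x));
    }
}
```

Note that GeLU is smooth everywhere, unlike ReLU, so both the forward and backward passes are well-defined at zero.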
Looks great! Small nits and then good to go. We can use this PR as a model for how to contribute new unary ops 🔥
src/tensor_ops/gelu/mod.rs
let x = dev.tensor([-2.0, -1.0, 0.0, 1.0, 2.0]);
let r = x.trace().gelu();
assert_eq!(
    simplify(&r.array()),
Yeah, I've definitely seen slightly different results between CPU/CUDA. Can you use `crate::tests::assert_close()`, which some of the other tests use? It checks equality within a certain tolerance.
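A tolerance-based comparison along these lines can be sketched as below. This is a hypothetical sketch for illustration; the crate's actual `assert_close` helper may have a different signature and tolerance handling.

```rust
/// Assert that two slices are element-wise equal within `tolerance`.
/// Illustrative only; not the crate's real implementation.
fn assert_close(lhs: &[f64], rhs: &[f64], tolerance: f64) {
    assert_eq!(lhs.len(), rhs.len(), "length mismatch");
    for (a, b) in lhs.iter().zip(rhs.iter()) {
        assert!(
            (a - b).abs() <= tolerance,
            "{a} and {b} differ by more than {tolerance}"
        );
    }
}

fn main() {
    // CPU/CUDA rounding differences smaller than the tolerance pass.
    assert_close(&[0.8413], &[0.8412], 1e-3);
}
```

The advantage over `assert_eq!` is that last-bit floating-point differences between backends no longer fail the test.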
Ahh yeah, looks like there's a small failure in GitHub Actions. I think moving to `assert_close` should resolve that (probably don't need `simplify`).
The last failing test is the example, I think, where I cannot use it.
Co-authored-by: Corey Lowman <coreylowman@users.noreply.github.com>
Why was the last test cancelled?
No idea, but the rest of them passed, so looks good to me!
Ah yeah, for doctests you don't need to call any assertions about data; they are more about just showing usage. You can just remove those lines.
Looks great from my end!
Co-authored-by: Corey Lowman <coreylowman@users.noreply.github.com>
I'll let you merge in then ;) cheers!
Nice contribution!