
[numpy] Add torch.nan_to_num #44592

Closed

Conversation

@kshitij12345 (Collaborator) commented Sep 12, 2020

Reference #42515

TODO:

  • Add tests
  • Add docs
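
For context, a minimal usage sketch of the API being added (it mirrors np.nan_to_num; by default nan maps to 0.0, posinf to the dtype's largest finite value, and neginf to its most negative finite value):

```python
import torch

t = torch.tensor([float('nan'), float('inf'), float('-inf'), 3.14])

# Defaults follow NumPy: nan -> 0.0, +inf -> largest finite value of the
# dtype, -inf -> most negative finite value of the dtype.
print(torch.nan_to_num(t))

# Explicit replacement values, as in np.nan_to_num.
print(torch.nan_to_num(t, nan=0.0, posinf=1e6, neginf=-1e6))
```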

@kshitij12345 marked this pull request as draft September 12, 2020 11:53
@kshitij12345 changed the title from "[numpy] Add torch.nan_to_num" to "[WIP] [numpy] Add torch.nan_to_num" Sep 12, 2020
@dr-ci (bot) commented Sep 12, 2020

💊 CI failures summary and remediations

As of commit 6141370 (more details on the Dr. CI page):



🚧 1 fixed upstream failure:

These were probably caused by upstream breakages that were already fixed.

Please rebase on the viable/strict branch.

Since your merge base is older than viable/strict, run these commands:

git fetch https://github.com/pytorch/pytorch viable/strict
git rebase FETCH_HEAD

Check out the recency history of this "viable master" tracking branch.


ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI. Follow this link to opt out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 82 times.

@codecov (bot) commented Sep 13, 2020

Codecov Report

Merging #44592 into master will decrease coverage by 0.08%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #44592      +/-   ##
==========================================
- Coverage   68.15%   68.07%   -0.09%     
==========================================
  Files         396      393       -3     
  Lines       51133    50990     -143     
==========================================
- Hits        34851    34709     -142     
+ Misses      16282    16281       -1     
Impacted Files Coverage Δ
torch/overrides.py 97.08% <ø> (ø)
...ch/testing/_internal/common_methods_invocations.py 91.41% <ø> (ø)
torch/_tensor_docs.py 100.00% <100.00%> (ø)
torch/_torch_docs.py 100.00% <100.00%> (ø)
torch/distributed/rpc/options.py 33.33% <0.00%> (-50.01%) ⬇️
torch/distributed/rpc/backend_registry.py 32.35% <0.00%> (-16.04%) ⬇️
torch/distributed/optim/optimizer.py 29.78% <0.00%> (-7.57%) ⬇️
torch/nn/quantized/modules/conv.py 85.25% <0.00%> (-4.32%) ⬇️
torch/optim/adagrad.py 79.03% <0.00%> (-4.31%) ⬇️
torch/testing/_internal/dist_utils.py 33.33% <0.00%> (-2.11%) ⬇️
... and 25 more

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bb19a55...6141370. Read the comment docs.

@hameerabbasi mentioned this pull request Sep 15, 2020
@kshitij12345 marked this pull request as ready for review September 21, 2020 13:44
@kshitij12345 changed the title from "[WIP] [numpy] Add torch.nan_to_num" to "[numpy] Add torch.nan_to_num" Sep 21, 2020
@kshitij12345 (Collaborator, Author)

@mruberry Please take a look :)

I've made an independent test, as I don't think UnaryUfuncInfo can take care of the nan, posinf, and neginf arguments.

gradcheck works, but for gradgradcheck we get:

RuntimeError: isposinf(): functions with out=... arguments don't support automatic differentiation, but one of the arguments requires grad.

Should we just manually add zeros_like for the nan_to_num_backward?
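
For reference, a minimal sketch of how the failure can be reproduced (the tensor shape and the use of double precision here are illustrative assumptions):

```python
import torch
from torch.autograd import gradcheck, gradgradcheck

# Double precision is required for the numerical checks to be stable.
x = torch.randn(3, 3, dtype=torch.double, requires_grad=True)
fn = lambda t: torch.nan_to_num(t, nan=0.0)

gradcheck(fn, (x,))       # passes
gradgradcheck(fn, (x,))   # raised the isposinf() RuntimeError above before the fix
```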

@albanD (Collaborator) commented Sep 21, 2020

Hi,

For the formula issue, it is an oversight on our side, sorry.
Could you add a new line to this list stating that functions returning a boolean type are not differentiable, and add the strings "isnan", "isposinf", "isneginf", and "isinf" there? That will solve your issue.

Also, for the backward function, I think you can use isinf instead of making two calls for the positive and negative infinities.
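
A rough Python sketch of the masking this suggests (the function name and exact form are illustrative; the real backward lives in derivatives.yaml):

```python
import torch

# Hypothetical Python equivalent of the nan_to_num backward: the gradient
# passes through finite elements unchanged and is zeroed where the input
# was nan or +/-inf, since those positions were replaced by constants.
def nan_to_num_backward(grad, inp):
    # A single isinf call covers both positive and negative infinity.
    mask = ~(torch.isnan(inp) | torch.isinf(inp))
    return grad * mask
```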

* update derivative not required ops.
* add gradgradcheck to test.
* use isinf.
@kshitij12345 (Collaborator, Author)

@albanD

> Could you add a new line to this list stating that functions returning a boolean type are not differentiable, and add the strings "isnan", "isposinf", "isneginf", and "isinf" there? That will solve your issue.

Thanks, that worked!

> Also, for the backward function, I think you can use isinf instead of making two calls for the positive and negative infinities.

Ah. Right! Thanks!

@albanD (Collaborator) left a comment

The autograd part looks good to me. Just a small comment on the test.

I'll let @mruberry review the rest.

@mruberry requested review from mruberry and removed the review request for apaszke September 22, 2020 07:00
@kshitij12345 (Collaborator, Author)

On CPU, benchmark results for the two kernel implementations:

With

    cpu_kernel(iter, [=](scalar_t a) -> scalar_t {
      return (
          at::_isnan(a)
              ? nan_replacement
              : (a == std::numeric_limits<scalar_t>::infinity()
                     ? pos_inf_replacement
                     : (a == -std::numeric_limits<scalar_t>::infinity()
                            ? neg_inf_replacement
                            : a)));
    });
250 / 250
Average attempts per valid config: 24.1
 Best:                   Size    Shape             X order        Y order
--------------------------------------------------------------------------------
 4.6 ns / element       88236 |   57, 1548       | contiguous   | 
 4.6 ns / element       85680 |  272,  315       | [1 0]        | 
 4.6 ns / element      114070 | 3355,   34       | contiguous   | 
 4.6 ns / element      109872 |  763,  144       | contiguous   | 
 4.6 ns / element      129925 | 5197,   25       | contiguous   | 
 4.6 ns / element      114528 | 3579,   32       | contiguous   | 
 4.6 ns / element      103488 |   56,   88,   21 | contiguous   | 
 4.6 ns / element       97682 |  578,  169       | contiguous   | 
 4.6 ns / element      127932 |   21, 6092       | contiguous   | 
 4.6 ns / element       91840 |  328,  280       | contiguous   | 

Worst:                   Size    Shape             X order        Y order
--------------------------------------------------------------------------------
 4.6 ns / element       80325 |   25,  189,   17 | contiguous   | 
 4.6 ns / element       78645 |   49, 1605       | contiguous   | 
 4.6 ns / element       93024 | 1224,   76       | contiguous   | 
 4.6 ns / element       69608 |   88,  791       | contiguous   | 
 4.6 ns / element       83204 |   62, 1342       | [1 0]        | 
 4.7 ns / element       78336 |   72, 1088       | contiguous   | 
 4.7 ns / element      118020 | 5901,   20       | contiguous   | 
 4.7 ns / element      122960 |  106,   40,   29 | contiguous   | 
 4.7 ns / element       66848 | 4178,   16       | contiguous   | 
 4.9 ns / element       69108 | 1772,   39       | contiguous   | 

With

    cpu_kernel(iter, [=](scalar_t a) -> scalar_t {
      return at::_isfinite(a)
          ? a
          : (at::_isnan(a) ? nan_replacement
                           : (a == std::numeric_limits<scalar_t>::infinity()
                                  ? pos_inf_replacement
                                  : neg_inf_replacement));
    });
250 / 250
Average attempts per valid config: 24.1
 Best:                   Size    Shape             X order        Y order
--------------------------------------------------------------------------------
 6.3 ns / element       66066 |  462,  143       | contiguous   | 
 6.3 ns / element      103488 |   56,   88,   21 | contiguous   | 
 6.3 ns / element       69930 |  945,   74       | contiguous   | 
 6.3 ns / element       74316 |   44, 1689       | contiguous   | 
 6.3 ns / element      114528 | 3579,   32       | contiguous   | 
 6.3 ns / element       88236 |   57, 1548       | contiguous   | 
 6.3 ns / element      109872 |  763,  144       | contiguous   | 
 6.3 ns / element       66880 |  190,  352       | contiguous   | 
 6.3 ns / element      114730 |   35, 3278       | contiguous   | 
 6.3 ns / element       76342 |   41, 1862       | contiguous   | 

Worst:                   Size    Shape             X order        Y order
--------------------------------------------------------------------------------
 6.4 ns / element       78336 |   72, 1088       | contiguous   | 
 6.4 ns / element      115080 | 5754,   20       | contiguous   | 
 6.4 ns / element       80080 | 5005,   16       | contiguous   | 
 6.5 ns / element       79968 |  392,  204       | contiguous   | 
 6.5 ns / element       97482 |  422,  231       | contiguous   | 
 6.5 ns / element      127932 |   21, 6092       | contiguous   | 
 8.0 ns / element      104348 | 1373,   76       | [1 0]        | 
 8.1 ns / element       80166 |  186,  431       | [1 0]        | 
 8.3 ns / element       89870 |  430,  209       | contiguous   | 
 8.3 ns / element      102752 |  247,   16,   26 | contiguous   | 

benchmark.txt

@kshitij12345 (Collaborator, Author)

@mruberry PTAL :)

I have addressed the comments and updated the kernel.

The CI errors are not related to this PR. I will rebase after addressing any new comments.

Thanks!

for x in generate_numeric_tensors(dtype=dtype, device=device):
    if not contiguous:
        x = x.T
    self.compare_with_numpy(torch.nan_to_num, np.nan_to_num, x)
Review comment from a Collaborator:

Will this and the next comparison be covered by the existing unary ufunc tests?

That is, I'm wondering if only the tests where the extra args are set are needed in this test?

@kshitij12345 (Collaborator, Author) commented Sep 28, 2020
I am not sure about the method variant, but torch.nan_to_num is verified. Do you know where the method variants are tested?

Review comment from a Collaborator:
Variants are tested for consistency here:

def test_variant_consistency(self, device, dtype, op):

It is not the most thorough test, but we didn't want to run the tests again for method variants because it would greatly increase test time (and cost).

@kshitij12345 (Collaborator, Author)
Ah. Will remove those two then. Thanks!

@dtypes(*(torch.testing.get_all_int_dtypes() + torch.testing.get_all_fp_dtypes(include_bfloat16=False)))
def test_nan_to_num(self, device, dtype):
    for contiguous in [False, True]:
        for x in generate_numeric_tensors(dtype=dtype, device=device):
Review comment from a Collaborator:

I wonder if this test should use a different generator with more non-finite values, since the numeric tensors will be tested elsewhere?

@kshitij12345 (Collaborator, Author)

I think this should be fine, as the generator produces tensors with extremal values; otherwise we would be creating an almost identical function that does the same thing (takes dtype and device) and generates a tensor. Let me know what you think about that.

Review comment from a Collaborator:

I think you can just call make_tensor:

def make_tensor(size, device: torch.device, dtype: torch.dtype, *,

And add the extremal values if it's a floating-point dtype. You could create a smaller 64x64 tensor, for example; generate_numeric_tensors will create a very large tensor.
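
Something along these lines, perhaps (a sketch only; the helper name is hypothetical and the make_tensor import path is assumed from this era of the codebase):

```python
import torch
from torch.testing._internal.common_utils import make_tensor  # import path assumed

def small_tensor_with_extremals(device, dtype):
    # A small 64x64 tensor instead of the very large ones produced by
    # generate_numeric_tensors.
    t = make_tensor((64, 64), device, dtype, low=-9, high=9)
    if dtype.is_floating_point:
        # Plant the non-finite values this test actually exercises.
        t[0, 0] = float('nan')
        t[0, 1] = float('inf')
        t[0, 2] = float('-inf')
    return t
```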

@kshitij12345 (Collaborator, Author)

Oh. Sure. Thanks!

@mruberry (Collaborator)
Thanks for the benchmark. The PR continues to look good, of course. I just have a couple of questions about refining the test in test_unary_ufunc.py to ensure it's not redundant with the other OpInfo tests. Maybe just testing some smaller tensors with more non-finite values would be best?

@kshitij12345 (Collaborator, Author)

Gentle Ping :)

@mruberry self-requested a review September 29, 2020 16:40
@mruberry (Collaborator) left a comment

Awesome! Exciting to have torch.nan_to_num!

@kshitij12345 (Collaborator, Author)

Gentle Ping :)

@facebook-github-bot (Contributor) left a comment

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@mruberry (Collaborator) commented Oct 2, 2020

> Gentle Ping :)

Thanks for the ping, and sorry for the delay. It's been a very busy week with the branch cut. Things should get back to normal soon.

@mruberry (Collaborator) commented Oct 5, 2020

Congrats @kshitij12345! This is one of the first, if not the first, new functions to be added to PyTorch 1.8!

@kshitij12345 deleted the develop/numpy/nan_to_num branch October 5, 2020 09:09
@facebook-github-bot (Contributor)

@mruberry merged this pull request in f65ab89.

facebook-github-bot pushed a commit that referenced this pull request Oct 16, 2020
Summary:
Related to #44592

This PR is to fix the deprecated warnings for the nan_to_num function.

Below is the warning message when building the latest code.
```
../aten/src/ATen/native/UnaryOps.cpp: In function ‘at::Tensor& at::native::nan_to_num_out(at::Tensor&,
const at::Tensor&, c10::optional<double>, c10::optional<double>, c10::optional<double>)’:
../aten/src/ATen/native/UnaryOps.cpp:397:45: warning: ‘bool c10::isIntegralType(c10::ScalarType)’
is deprecated: isIntegralType is deprecated.
Please use the overload with 'includeBool' parameter instead. [-Wdeprecated-declarations]
   if (c10::isIntegralType(self.scalar_type())) {
```

The deprecated warning is defined in `ScalarType.h`.
https://github.com/pytorch/pytorch/blob/d790ec6de01a61fe81733c41a64b6092bacfb7bd/c10/core/ScalarType.h#L255-L260

Pull Request resolved: #46309

Reviewed By: mrshenli

Differential Revision: D24310248

Pulled By: heitorschueroff

fbshipit-source-id: 0f9f2ad304eb5a2da9d2b415343f2fc9029037af
Labels
Merged · module: numpy (Related to numpy support, and also numpy compatibility of our operators) · open source · triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)