ARROW-3515: [C++] Introduce NumericTensor class #2759

mrkn · 2018-10-15T07:08:32Z

This commit defines the new NumericTensor class as a subclass
of the Tensor class. NumericTensor extends Tensor by adding
a member function to access element values in a tensor.

I want to use this new feature to write tests for SparseTensor in #2546.
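
To illustrate the idea in the description, here is a minimal, self-contained sketch of a typed subclass that adds element access on top of an untyped tensor. It does not use Arrow: `SimpleTensor`, `SimpleNumericTensor`, `Value`, and `MakeExampleTensor` are hypothetical names chosen for this example, and the real classes in this PR may differ in both interface and implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical stand-in for an untyped tensor: raw bytes plus a shape and
// byte strides. This is NOT Arrow's Tensor class.
class SimpleTensor {
 public:
  SimpleTensor(std::vector<uint8_t> data, std::vector<int64_t> shape,
               std::vector<int64_t> strides)
      : data_(std::move(data)),
        shape_(std::move(shape)),
        strides_(std::move(strides)) {}

  const uint8_t* raw_data() const { return data_.data(); }
  const std::vector<int64_t>& shape() const { return shape_; }
  const std::vector<int64_t>& strides() const { return strides_; }

 private:
  std::vector<uint8_t> data_;
  std::vector<int64_t> shape_;
  std::vector<int64_t> strides_;  // in bytes, one entry per dimension
};

// Hypothetical typed subclass that adds element access.
template <typename T>
class SimpleNumericTensor : public SimpleTensor {
 public:
  using SimpleTensor::SimpleTensor;

  // Returns the element at the given multi-dimensional index.
  T Value(const std::vector<int64_t>& index) const {
    int64_t offset = 0;  // byte offset into the raw buffer
    for (size_t i = 0; i < index.size(); ++i) {
      offset += index[i] * strides()[i];
    }
    T value;
    // memcpy avoids assuming that the address is aligned for T.
    std::memcpy(&value, raw_data() + offset, sizeof(T));
    return value;
  }
};

// Builds a 2x3 row-major int32 tensor holding 1..6 (example data only).
inline SimpleNumericTensor<int32_t> MakeExampleTensor() {
  const int32_t values[] = {1, 2, 3, 4, 5, 6};
  std::vector<uint8_t> bytes(sizeof(values));
  std::memcpy(bytes.data(), values, sizeof(values));
  return SimpleNumericTensor<int32_t>(std::move(bytes), {2, 3}, {12, 4});
}
```

With this sketch, `MakeExampleTensor().Value({1, 2})` reads the last element of the second row.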

codecov-io · 2018-10-15T09:31:43Z

Codecov Report

Merging #2759 into master will increase coverage by 1%.
The diff coverage is 96.42%.

@@            Coverage Diff            @@
##           master    #2759     +/-   ##
=========================================
+ Coverage   87.55%   88.56%     +1%     
=========================================
  Files         403      342     -61     
  Lines       62113    58379   -3734     
=========================================
- Hits        54386    51705   -2681     
+ Misses       7657     6674    -983     
+ Partials       70        0     -70

Impacted Files	Coverage Δ
cpp/src/arrow/tensor-test.cc	`100% <100%> (ø)`	⬆️
cpp/src/arrow/tensor.h	`88.88% <100%> (+4.27%)`	⬆️
cpp/src/arrow/tensor.cc	`96.96% <86.66%> (-3.04%)`	⬇️
cpp/src/arrow/type_traits.h	`95.08% <0%> (-1.59%)`	⬇️
rust/src/record_batch.rs
go/arrow/datatype_nested.go
rust/src/util/bit_util.rs
go/arrow/math/uint64_amd64.go
go/arrow/internal/testing/tools/bool.go
go/arrow/internal/bitutil/bitutil.go
... and 56 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0689a58...de070c5.

pitrou

LGTM, just a couple nits.

pitrou · 2018-10-15T15:20:03Z

cpp/src/arrow/tensor.h

Can strides be negative here, or should you add "non-negative" as well?

I agree with you. I'll also add "non-negative" to the same comment in Tensor's definition, because that is where I copied it from.

pitrou · 2018-10-15T15:44:58Z

cpp/src/arrow/tensor.cc

I don't get why you're special-casing index.size() - 1 here. You could have:

    for (size_t i = 0; i < index.size(); ++i) {
      offset = index[i] + offset * shape_[i];
    }
    offset *= element_size;

Thank you for catching my mistake. I think it is a leftover from earlier implementation attempts.
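
As a worked illustration of the loop suggested above, the following standalone sketch flattens a multi-dimensional index into a byte offset for a contiguous row-major buffer. The function name `RowMajorByteOffset` and passing `shape` and `element_size` as parameters are assumptions made for this example; in the PR these are members of the tensor class.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Flattens `index` into a byte offset for a contiguous row-major buffer.
// The first iteration multiplies a zero offset by shape[0], which is
// harmless, so no dimension needs special-casing.
inline int64_t RowMajorByteOffset(const std::vector<int64_t>& index,
                                  const std::vector<int64_t>& shape,
                                  int64_t element_size) {
  int64_t offset = 0;
  for (size_t i = 0; i < index.size(); ++i) {
    offset = index[i] + offset * shape[i];
  }
  return offset * element_size;
}
```

For shape {2, 3, 4}, index {1, 2, 3}, and 8-byte elements, the loop computes ((1 * 3) + 2) * 4 + 3 = 23 elements, that is, 184 bytes.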

pitrou · 2018-10-15T15:46:54Z

cpp/src/arrow/tensor.h

On some CPU architectures, this assumes the data is naturally aligned. It probably doesn't matter for now.

One way to tackle this is to make CalculateValueOffset return an offset in units of TYPE instead of bytes. It'll also make it easier to work with non-fixed-bytes in the future (also possibly correcting the problem with BOOL).

I changed this function to return an offset in units of TYPE.

Please undo the change (sorry). The strides are in byte units, so the offset has to be calculated in bytes. When you divide the byte offset by the item size, there is no guarantee that the remainder is zero (though that will be the common case).

In any case, the change didn't fix the issue, as raw_data() could be misaligned.
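
The two points in this thread can be sketched as follows: the offset is accumulated in bytes from byte strides, and the value is read with `std::memcpy` so that the read does not depend on the address being aligned. This is only an illustration under those assumptions; `StridedByteOffset` and `LoadUnaligned` are hypothetical helper names, not functions from this PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Computes a byte offset from a multi-dimensional index and byte strides.
inline int64_t StridedByteOffset(const std::vector<int64_t>& index,
                                 const std::vector<int64_t>& strides) {
  int64_t offset = 0;
  for (size_t i = 0; i < index.size(); ++i) {
    offset += index[i] * strides[i];
  }
  return offset;
}

// Reads a T at an arbitrary byte offset without assuming natural alignment.
template <typename T>
T LoadUnaligned(const uint8_t* data, int64_t byte_offset) {
  T value;
  std::memcpy(&value, data + byte_offset, sizeof(T));
  return value;
}
```

For example, with 4-byte values laid out every 6 bytes (a stride of 6), the element at index 1 starts at byte 6, which is not a multiple of the item size, so an offset expressed in units of the element type could not represent it.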

pitrou · 2018-10-15T15:47:17Z

cpp/build-support/run_clang_format.py

Was this used for temporary debugging? You should remove it now :-)

Yes, I should. I removed it.

fsaintjacques · 2018-10-16T00:51:16Z

cpp/src/arrow/tensor.h

One way to tackle this is to make CalculateValueOffset return an offset in units of TYPE instead of bytes. It'll also make it easier to work with non-fixed-bytes in the future (also possibly correcting the problem with BOOL).

fsaintjacques · 2018-10-16T00:52:25Z

cpp/src/arrow/tensor.cc

This is going to fail when TYPE = BOOL.

Yes, but the Tensor class currently doesn't support TYPE = BOOL.
I think it would be great if Tensor supported BOOL elements, and I want to try implementing that after SparseTensor.
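
One likely reason a byte-based offset breaks for BOOL is that boolean values are bit-packed, so an element does not occupy a whole number of bytes. The sketch below assumes one bit per value in least-significant-bit order and a contiguous row-major layout; `RowMajorElementIndex` and `GetBit` are hypothetical helpers written for this example, not part of the PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Flattens `index` into an element position (not a byte offset) for a
// contiguous row-major layout.
inline int64_t RowMajorElementIndex(const std::vector<int64_t>& index,
                                    const std::vector<int64_t>& shape) {
  int64_t position = 0;
  for (size_t i = 0; i < index.size(); ++i) {
    position = index[i] + position * shape[i];
  }
  return position;
}

// Reads bit `i` from a bit-packed buffer (assumed LSB-first within a byte).
inline bool GetBit(const uint8_t* bits, int64_t i) {
  return ((bits[i / 8] >> (i % 8)) & 1) != 0;
}
```

Here the element position has to be split into a byte index (`i / 8`) and a bit index (`i % 8`), which a single byte offset multiplied by an element size cannot express.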

mrkn · 2018-10-17T09:01:02Z

cpp/src/arrow/tensor.cc

The coverage result tells me a Tensor always fills its strides_, so I can remove this else clause.
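
One way a tensor can guarantee that its strides are never empty is to derive default strides from the shape whenever none are supplied. The following is a minimal sketch of that idea with byte strides; `RowMajorStrides` and `ColumnMajorStrides` are hypothetical names for this example, and the column-major variant is included only because this PR also adds tests for column-major strides.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Byte strides for a contiguous row-major (C order) layout: the last
// dimension varies fastest.
inline std::vector<int64_t> RowMajorStrides(const std::vector<int64_t>& shape,
                                            int64_t item_size) {
  std::vector<int64_t> strides(shape.size());
  int64_t stride = item_size;
  for (size_t i = shape.size(); i-- > 0;) {
    strides[i] = stride;
    stride *= shape[i];
  }
  return strides;
}

// Byte strides for a contiguous column-major (Fortran order) layout: the
// first dimension varies fastest.
inline std::vector<int64_t> ColumnMajorStrides(const std::vector<int64_t>& shape,
                                               int64_t item_size) {
  std::vector<int64_t> strides(shape.size());
  int64_t stride = item_size;
  for (size_t i = 0; i < shape.size(); ++i) {
    strides[i] = stride;
    stride *= shape[i];
  }
  return strides;
}
```

For shape {2, 3, 4} and 8-byte items, the row-major strides are {96, 32, 8} and the column-major strides are {8, 16, 48}.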

mrkn · 2018-10-25T00:19:22Z

Travis CI failed, but the cause of the failure may be a GitHub authentication error:
https://travis-ci.org/apache/arrow/jobs/445886028#L2638

mrkn · 2018-10-25T00:31:03Z

@pitrou I think this pull request is complete.
Would you please review it again and merge it if there are no problems?

pitrou

LGTM, thank you!

mrkn changed the title ~~Introduce NumericTensor class~~ ARROW-3515: Introduce NumericTensor class Oct 15, 2018

mrkn changed the title ~~ARROW-3515: Introduce NumericTensor class~~ ARROW-3515: [C++] Introduce NumericTensor class Oct 15, 2018

mrkn force-pushed the tensor_element_access branch 6 times, most recently from 8b7241e to bcab9fd, October 15, 2018 08:46

pitrou reviewed Oct 15, 2018

mrkn force-pushed the tensor_element_access branch 2 times, most recently from 6c7731d to 3df4cb1, October 16, 2018 00:47

fsaintjacques requested changes Oct 16, 2018

mrkn force-pushed the tensor_element_access branch 3 times, most recently from de070c5 to 5f3f1cb, October 16, 2018 08:31

mrkn commented Oct 17, 2018

mrkn changed the title ~~ARROW-3515: [C++] Introduce NumericTensor class~~ WIP: ARROW-3515: [C++] Introduce NumericTensor class Oct 17, 2018

mrkn force-pushed the tensor_element_access branch from 5f3f1cb to 7afbca5, October 24, 2018 08:05

mrkn added 3 commits October 25, 2018 06:03

Introduce NumericTensor class

1646461

This commit defines the new NumericTensor<T> class as a subclass of Tensor class. NumericTensor<T> extends Tensor class by adding a member function to access element values in a tensor.

Remove needless cases

14fa527

Tensor's strides_ is always filled.

Add tests for column-major strides

37f0bb4

mrkn force-pushed the tensor_element_access branch from 16baa39 to 37f0bb4, October 24, 2018 21:06

mrkn changed the title ~~WIP: ARROW-3515: [C++] Introduce NumericTensor class~~ ARROW-3515: [C++] Introduce NumericTensor class Oct 25, 2018

pitrou approved these changes Oct 25, 2018

pitrou closed this in 0b9fad3 Oct 25, 2018

mrkn deleted the tensor_element_access branch October 25, 2018 07:37

asfimport mentioned this pull request Jun 3, 2019

Introduce NumericTensor class #19832

Closed
