Skip to content

Conversation

@mrkn
Copy link
Member

@mrkn mrkn commented Oct 15, 2018

This commit defines the new NumericTensor class as a subclass
of Tensor class. NumericTensor extends Tensor class by adding
a member function to access element values in a tensor.

I want to use this new feature for writing tests of SparseTensor in #2546.

@mrkn mrkn changed the title Introduce NumericTensor class ARROW-3515: Introduce NumericTensor class Oct 15, 2018
@mrkn mrkn changed the title ARROW-3515: Introduce NumericTensor class ARROW-3515: [C++] Introduce NumericTensor class Oct 15, 2018
@mrkn mrkn force-pushed the tensor_element_access branch 6 times, most recently from 8b7241e to bcab9fd Compare October 15, 2018 08:46
@codecov-io
Copy link

codecov-io commented Oct 15, 2018

Codecov Report

Merging #2759 into master will increase coverage by 1%.
The diff coverage is 96.42%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #2759     +/-   ##
=========================================
+ Coverage   87.55%   88.56%     +1%     
=========================================
  Files         403      342     -61     
  Lines       62113    58379   -3734     
=========================================
- Hits        54386    51705   -2681     
+ Misses       7657     6674    -983     
+ Partials       70        0     -70
Impacted Files Coverage Δ
cpp/src/arrow/tensor-test.cc 100% <100%> (ø) ⬆️
cpp/src/arrow/tensor.h 88.88% <100%> (+4.27%) ⬆️
cpp/src/arrow/tensor.cc 96.96% <86.66%> (-3.04%) ⬇️
cpp/src/arrow/type_traits.h 95.08% <0%> (-1.59%) ⬇️
rust/src/record_batch.rs
go/arrow/datatype_nested.go
rust/src/util/bit_util.rs
go/arrow/math/uint64_amd64.go
go/arrow/internal/testing/tools/bool.go
go/arrow/internal/bitutil/bitutil.go
... and 56 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0689a58...de070c5. Read the comment docs.

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a couple nits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strides can be negative here or should you add "non-negative" as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you. I'll also add "non-negative" also in the same comment in Tensor's definition because I copied it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get why you're special-casing index.size() - 1 here. You could have:

    for (size_t i = 0; i < index.size(); ++i) {
      offset = index[i] + offset * shape_[i];
    }
    offset *= element_size;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for catching my bad. I think it may be a trace of implementation trials.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On some CPU architectures, this assumes the data is naturally aligned. It probably doesn't matter for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way to tackle this is to make CalculateValueOffset return an offset in units of TYPE instead of bytes. It'll also make it easier to work with non-fixed-bytes in the future (also possibly correcting the problem with BOOL).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this function to return an offset in units of TYPE.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please undo the change (sorry). The strides are in bytes units, so the offset has to be calculated in bytes units. When you're dividing the bytes offset by itemsize, you have no guarantee that there is no remainder (though that will be the common case).

In any case, the change didn't fix the issue, as raw_data() could be misaligned.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was used for temporary debugging? You should remove it now :-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I should. I removed it.

@mrkn mrkn force-pushed the tensor_element_access branch 2 times, most recently from 6c7731d to 3df4cb1 Compare October 16, 2018 00:47
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way to tackle this is to make CalculateValueOffset return an offset in units of TYPE instead of bytes. It'll also make it easier to work with non-fixed-bytes in the future (also possibly correcting the problem with BOOL).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to fail when TYPE = BOOL.

Copy link
Member Author

@mrkn mrkn Oct 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. But Tensor class currently doesn't support TYPE = BOOL.
I think it's great if Tensor class supports BOOL element, and I want to try to implement it after SparseTensor.

@mrkn mrkn force-pushed the tensor_element_access branch 3 times, most recently from de070c5 to 5f3f1cb Compare October 16, 2018 08:31
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The coverage result tells me a Tensor always fills its strides_, so I can remove this else clause.

@mrkn mrkn changed the title ARROW-3515: [C++] Introduce NumericTensor class WIP: ARROW-3515: [C++] Introduce NumericTensor class Oct 17, 2018
@mrkn mrkn force-pushed the tensor_element_access branch from 5f3f1cb to 7afbca5 Compare October 24, 2018 08:05
mrkn added 3 commits October 25, 2018 06:03
This commit defines the new NumericTensor<T> class as a subclass
of Tensor class. NumericTensor<T> extends Tensor class by adding
a member function to access element values in a tensor.
Tensor's strides_ is always filled.
@mrkn mrkn force-pushed the tensor_element_access branch from 16baa39 to 37f0bb4 Compare October 24, 2018 21:06
@mrkn mrkn changed the title WIP: ARROW-3515: [C++] Introduce NumericTensor class ARROW-3515: [C++] Introduce NumericTensor class Oct 25, 2018
@mrkn
Copy link
Member Author

mrkn commented Oct 25, 2018

Travis CI was failed but the cause of the failure may be GitHub's authentication error.
https://travis-ci.org/apache/arrow/jobs/445886028#L2638

@mrkn
Copy link
Member Author

mrkn commented Oct 25, 2018

@pitrou I think this pull-request was completed.
Would you please review it again, and merge it if there are no problems.

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@pitrou pitrou closed this in 0b9fad3 Oct 25, 2018
@mrkn mrkn deleted the tensor_element_access branch October 25, 2018 07:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants