ARROW-17836: [C++] Allow specifying alignment of buffers #14225

save-buffer · 2022-09-24T02:32:00Z

No description provided.

github-actions · 2022-09-24T02:32:21Z

https://issues.apache.org/jira/browse/ARROW-17836

github-actions · 2022-09-24T02:32:22Z

⚠️ Ticket has no components in JIRA, make sure you assign one.

github-actions · 2022-09-24T02:32:23Z

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

pitrou

Assuming we want this, the API additions seem reasonable.
(but allocations should already be 64-byte aligned: do you actually need more? if so, why?)

cpp/src/arrow/testing/random.h

cpp/src/arrow/memory_pool.h

cpp/src/arrow/buffer_builder.h

pitrou · 2022-09-27T09:05:51Z

cpp/src/arrow/compute/exec/test_util.h

Same question as for testing/random.h: is this useful?

save-buffer · 2022-09-29T19:10:36Z

Seems like the only CI issue is

`actual$x`:   "2018-10-07T19:04:05" NA
[24011](https://github.com/apache/arrow/actions/runs/3147572706/jobs/5117455486#step:11:24012)
`expected$x`: "2018-10-07 19:04:05" NA
[24012](https://github.com/apache/arrow/actions/runs/3147572706/jobs/5117455486#step:11:24013)

I see this issue on some other PRs and also I don't touch this test at all in my code, so I think this is fine.

save-buffer · 2022-10-03T18:59:49Z

Rebasing, the R errors are gone but the scanner test seems to be failing:

terminate called after throwing an instance of 'std::system_error'
[263](https://github.com/apache/arrow/actions/runs/3176058214/jobs/5174896904#step:11:264)
  what():  Resource temporarily unavailable
[264](https://github.com/apache/arrow/actions/runs/3176058214/jobs/5174896904#step:11:265)
terminate called recursively
[265](https://github.com/apache/arrow/actions/runs/3176058214/jobs/5174896904#step:11:266)

Again, probably not me. I'd guess related to Weston's scanner refactor.

pitrou · 2022-10-03T19:08:42Z

@westonpace ^^

pitrou · 2022-10-04T10:30:35Z

Also opened https://issues.apache.org/jira/browse/ARROW-17927 for a similar issue on scanner tests.

westonpace

Sorry about the earlier test troubles. I think the API is fine. If we really wanted to avoid the extra arguments to buffer builder / etc. we could work around it with something like:

class ARROW_EXPORT EnforcedAlignmentMemoryPool : public MemoryPool {
 public:
  EnforcedAlignmentMemoryPool(int64_t default_alignment, MemoryPool* target)
      : default_alignment_(default_alignment), target_(target) {}
  Status Allocate(int64_t size, uint8_t** out) {
    return target_->Allocate(size, default_alignment_, out);
  }
  virtual Status Allocate(int64_t size, int64_t alignment, uint8_t** out) {
    if (alignment != default_alignment_) {
      return Status::Invalid("Enforced alignment ", default_alignment_,
                             " but was asked for ", alignment);
    }
    return target_->Allocate(size, alignment, out);
  }
// ...

That being said, I think I slightly prefer changing the buffer builders to take in an alignment argument as those are user facing and it makes things more clear to the user. I'm +0 on the changes to test utilities.

Could we add maybe one unit test to memory_test.cc just to provide a quick check in case someone wants to create a new memory pool implementation in the future.

pitrou

Thanks for the update!
Besides the minor comments I posted, I think we should add tests in two places at least:

for MemoryPool in arrow/memory_pool_test.cc
for BufferBuilder in arrow/buffer_test.cc

cpp/src/arrow/array/builder_base.h

cpp/src/arrow/array/builder_dict.h

cpp/src/arrow/buffer_builder.h

cpp/src/arrow/testing/random.h

r/src/memorypool.cpp

pitrou · 2022-11-24T16:41:59Z

@save-buffer I added some tests for the random array generator and it turns out that list arrays may not end up aligned due to ListArray::FromArrays. That can be for a later PR though. What do you think?

pitrou · 2022-11-24T16:45:41Z

@github-actions crossbow submit -g cpp

github-actions · 2022-11-24T16:49:42Z

Revision: 3948cb2

Submitted crossbow builds: ursacomputing/crossbow @ actions-0ee166c579

Task	Status
test-alpine-linux-cpp
test-build-cpp-fuzz
test-conda-cpp
test-conda-cpp-valgrind
test-cuda-cpp
test-debian-10-cpp-amd64
test-debian-10-cpp-i386
test-debian-11-cpp-amd64
test-debian-11-cpp-i386
test-fedora-35-cpp
test-ubuntu-18.04-cpp
test-ubuntu-18.04-cpp-release
test-ubuntu-18.04-cpp-static
test-ubuntu-20.04-cpp
test-ubuntu-20.04-cpp-20
test-ubuntu-20.04-cpp-bundled
test-ubuntu-20.04-cpp-thread-sanitizer
test-ubuntu-22.04-cpp

save-buffer · 2022-11-25T07:44:05Z

Thanks for adding those, yes I agree this PR is big enough for now. Adding support for Lists the issue is that the buffers passed in may not be aligned, right? This may be more convenient to do after we have a utility that will reallocate/copy to a specific alignment if needed (I think Sanjiban is working on something like that?)

sanjibansg · 2022-11-25T08:06:07Z

Thanks for adding those, yes I agree this PR is big enough for now. Adding support for Lists the issue is that the buffers passed in may not be aligned, right? This may be more convenient to do after we have a utility that will reallocate/copy to a specific alignment if needed (I think Sanjiban is working on something like that?)

Yes, I already have that utility ready. I am waiting for this PR to get merged and I will then make a PR for ensuring alignment.

pitrou · 2022-11-25T08:45:32Z

Adding support for Lists the issue is that the buffers passed in may not be aligned, right?

No, it's just that random generation for lists calls ListArray::FromArrays, which can allocate new buffers in some cases, and that method doesn't take the alignment parameter yet.

ursabot · 2022-11-26T18:12:20Z

Benchmark runs are scheduled for baseline = 63f013c and contender = 2078af7. 2078af7 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.84% ⬆️0.57%] test-mac-arm
[Finished ⬇️0.82% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️1.12% ⬆️0.98%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 2078af7c ec2-t3-xlarge-us-east-2
[Finished] 2078af7c test-mac-arm
[Finished] 2078af7c ursa-i9-9960x
[Finished] 2078af7c ursa-thinkcentre-m75q
[Finished] 63f013cd ec2-t3-xlarge-us-east-2
[Finished] 63f013cd test-mac-arm
[Finished] 63f013cd ursa-i9-9960x
[Finished] 63f013cd ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Lead-authored-by: Sasha Krassovsky <krassovskysasha@gmail.com> Co-authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Antoine Pitrou <antoine@python.org>

github-actions bot added the Component: C++ label Sep 24, 2022

save-buffer force-pushed the sasha_alignment branch 2 times, most recently from e8091e8 to cff0300 Compare September 26, 2022 21:24

pitrou reviewed Sep 27, 2022

View reviewed changes

save-buffer force-pushed the sasha_alignment branch from 8ba3697 to 4d45225 Compare September 27, 2022 19:12

github-actions bot added the Component: Java label Sep 28, 2022

save-buffer force-pushed the sasha_alignment branch from c398b3e to bf873b1 Compare September 28, 2022 22:20

github-actions bot added the Component: R label Sep 28, 2022

save-buffer force-pushed the sasha_alignment branch 2 times, most recently from 22d0d99 to 6a22194 Compare October 3, 2022 17:13

save-buffer force-pushed the sasha_alignment branch 2 times, most recently from 333882c to 6a52954 Compare October 10, 2022 18:06

save-buffer force-pushed the sasha_alignment branch from 6a52954 to cb2b28d Compare October 28, 2022 21:08

save-buffer force-pushed the sasha_alignment branch from cb2b28d to e9b3ac0 Compare November 5, 2022 01:08

save-buffer requested a review from pitrou November 8, 2022 18:48

westonpace reviewed Nov 8, 2022

View reviewed changes

pitrou requested changes Nov 10, 2022

View reviewed changes

save-buffer force-pushed the sasha_alignment branch 2 times, most recently from 7816209 to 134fca7 Compare November 17, 2022 01:13

save-buffer requested review from pitrou and westonpace and removed request for pitrou and westonpace November 17, 2022 01:36

save-buffer requested review from pitrou and removed request for pitrou November 17, 2022 01:36

save-buffer and others added 2 commits November 24, 2022 16:54

Allow specifying alignment of buffers

dbe7105

Add tests for random array alignment, fix some bugs in random generation

3948cb2

pitrou force-pushed the sasha_alignment branch from b1db9e3 to 3948cb2 Compare November 24, 2022 16:40

Also skip on nested list

e9a957a

pitrou approved these changes Nov 25, 2022

View reviewed changes

pitrou merged commit 2078af7 into apache:master Nov 25, 2022

asfimport mentioned this pull request Nov 26, 2022

[C++] Allow specifying of alignment in MemoryPool's allocations #33056

Closed

ARROW-17836: [C++] Allow specifying alignment of buffers #14225

ARROW-17836: [C++] Allow specifying alignment of buffers #14225

Uh oh!

Conversation

save-buffer commented Sep 24, 2022

Uh oh!

github-actions bot commented Sep 24, 2022

Uh oh!

github-actions bot commented Sep 24, 2022

Uh oh!

github-actions bot commented Sep 24, 2022

Uh oh!

pitrou left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

pitrou Sep 27, 2022

Choose a reason for hiding this comment

Uh oh!

save-buffer commented Sep 29, 2022

Uh oh!

save-buffer commented Oct 3, 2022

Uh oh!

pitrou commented Oct 3, 2022

Uh oh!

pitrou commented Oct 4, 2022

Uh oh!

westonpace left a comment

Choose a reason for hiding this comment

Uh oh!

pitrou left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

pitrou commented Nov 24, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pitrou commented Nov 24, 2022

Uh oh!

github-actions bot commented Nov 24, 2022

Uh oh!

save-buffer commented Nov 25, 2022

Uh oh!

sanjibansg commented Nov 25, 2022

Uh oh!

pitrou commented Nov 25, 2022

Uh oh!

ursabot commented Nov 26, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

pitrou commented Nov 24, 2022 •

edited

Loading