Conversation

@rluvaton (Member) commented Oct 15, 2025

Which issue does this PR close?

Rationale for this change

Implement efficient boolean operations by applying them a word (u64) at a time.

What changes are included in this PR?

Implementation notes

Are these changes tested?

Yes, although I did not run them on a big-endian machine

Are there any user-facing changes?

Yes, new functions which are documented


Note: I will later change the BooleanBufferBuilder#append_packed_range function to use mutable_bitwise_bin_op_helper, as I saw that the boolean_append_packed benchmark improved by more than 2x:

boolean_append_packed   time:   [2.0079 µs 2.0139 µs 2.0202 µs]
                        change: [−57.808% −57.653% −57.494%] (p = 0.00 < 0.05)
                        Performance has improved.

See benchmarks on


But I don't want to pass a slice of bytes, because then I don't know the source, and users must make sure that they hold the same promises as Buffer/MutableBuffer.
@github-actions bot added the arrow (Changes to the arrow crate) label Oct 15, 2025
@alamb (Contributor) commented Oct 16, 2025

I will try and review this one tomorrow

@alamb (Contributor) left a comment

Thank you @rluvaton -- I haven't made it through this PR yet but the idea of optimized bitwise operations even for offset data is very compelling. The code is also very well tested and documented in my opinion. Thank you.

My primary concern is with the complexity of this code (including the unsafe) though your tests and documentation make it much easier to contemplate. I did have a few comments so far. I think with some more study I could find

Can you please share the benchmarks you are using / any WIP? I want to confirm the performance improvements before studying this code in more detail

FYI @tustvold and @crepererum and @jhorstmann if you are interested

```rust
/// (e.g. `BooleanBufferBuilder`).
///
/// ## Why this trait is needed, can't we just use `MutableBuffer` directly?
/// Sometimes we don't want to expose the inner `MutableBuffer`
```
Contributor:
I don't understand this rationale. It seems to me that this code does expose the inner MutableBuffer for BooleanBufferBuilder (other code can modify the MutableBuffer); it just does so via a trait. I am not sure how that is different from just passing in the MutableBuffer directly.

I wonder why you can't just pass &mut [u8] (i.e. pass in the mutable slices directly), as none of the APIs seem to change the length of the underlying buffers 🤔

If it is absolutely required to use a MutableBuffer directly from BooleanBufferBuilder, perhaps we could make an unsafe API instead:

```rust
impl BooleanBufferBuilder {
    /// Returns a mutable reference to the buffer and its length. Callers must
    /// ensure that if they change the length of the buffer, they also update `len`.
    pub unsafe fn inner(&mut self) -> (&mut MutableBuffer, &mut usize) { ... }
}
```

🤔

@rluvaton (Member Author), Oct 17, 2025:

Where do you see it exposing the MutableBuffer? It only exposes the slice.

And I'm not passing bytes so it stays similar to the Buffer ops and so users understand the data needs to be bit-packed, but I don't have strong opinions about the latter.

Contributor:

> Where do you see it exposing the MutableBuffer? It only exposes the slice.

I was thinking of this code in particular, which seems to pass a MutableBuffer reference directly out of the BooleanBufferBuilder

```rust
impl MutableOpsBufferSupportedLhs for BooleanBufferBuilder {
    fn inner_mutable_buffer(&mut self) -> &mut MutableBuffer {
        &mut self.buffer
    }
}
```

@rluvaton (Member Author), Oct 18, 2025:

Yes, but this is pub(crate) on purpose (documented at the trait level) so it is not exposed beyond the current crate.

Contributor:

Here is a proposal which I think is simpler and easier to understand

Member Author:

I originally thought about adding an unsafe function but did not want to change it.

Anyway, I modified it to be like your function.

```rust
.map(|(l, r)| expected_op(*l, *r))
.collect();

super::mutable_bitwise_bin_op_helper(
```
Contributor:

This is a nice test.


```rust
let is_mutable_buffer_byte_aligned = left_bit_offset == 0;

if is_mutable_buffer_byte_aligned {
```
Contributor:

Is it worth special-casing the case where both left_offset and right_offset are zero? In that case a simple loop that compares u64 by u64 is probably fastest (maybe even u128 🤔).

Member Author:

In order to use u128 we would want to change the function to take u128, but I wanted to keep an API similar to the Buffer ops.

Do you think we should change it?

And the special case can be added later.

@rluvaton (Member Author) commented:

I will later change the BooleanBufferBuilder#append_packed_range function to use mutable_bitwise_bin_op_helper, as I saw that running the boolean_append_packed benchmark improved by 57%:

boolean_append_packed   time:   [2.0079 µs 2.0139 µs 2.0202 µs]
                        change: [−57.808% −57.653% −57.494%] (p = 0.00 < 0.05)
                        Performance has improved.

You can change the code that I described

@alamb (Contributor) commented Oct 17, 2025

I plan to spend more time studying this PR tomorrow morning with a fresh pair of eyes

@alamb (Contributor) commented Oct 21, 2025

I am hoping to spend more time tomorrow reviewing this one carefully (specifically I want to use the new API and see some performance improvements)

@alamb (Contributor) commented Oct 24, 2025

> I will later change the BooleanBufferBuilder#append_packed_range function to use mutable_bitwise_bin_op_helper, as I saw that running the boolean_append_packed benchmark improved by 57%:
>
> boolean_append_packed   time:   [2.0079 µs 2.0139 µs 2.0202 µs]
>                         change: [−57.808% −57.653% −57.494%] (p = 0.00 < 0.05)
>                         Performance has improved.
>
> You can change the code that I described

I tried this and for some reason it fails the tests

@alamb (Contributor) left a comment

Thank you so much @rluvaton -- this is a feature that has been sorely missing in arrow-rs for a long time (optimized mutable bitwise operations with shifts) ❤️ ❤️ ❤️

Specifically, I think it will also assist operations where a selection mask must be continually narrowed down (aka applying AND)

I went through the low-level implementations carefully and they look pretty good to me -- very nicely structured and commented -- so while it took me a while, it was a pleasant task.

One thing I would really like to update is the public-facing APIs -- namely, avoid the new traits and make the APIs safer to use:

  • Move mutable_bitwise_unary_op_helper to a method on MutableBuffer
  • Move mutable_bitwise_bin_op_helper to a method on MutableBuffer
  • Remove the pub mutable_... functions and move their implementations into BufferBuilder
  • Move the tests to be in terms of the bitop impls in MutableBuffer

I sketched how this would look in this PR:

I am happy to push commits to this PR to do so, but I wanted to check with you first.

I also think we need to show some significant performance improvements to justify adding this level of complexity / unsafe to the arrow crate. I tried to get one here

But it isn't passing tests yet 🤔

```rust
) where
    F: FnMut(u64, u64) -> u64,
{
    // Must not reach here if we are not byte aligned
```
Contributor:

As a follow-on PR, it may be worth adding another special case for when the right is also byte aligned, which I think will be a common case when applying operations in place on BooleanBuffers (I am looking ahead, not just at the current use case).

```rust
///
/// This is the only place that uses unsafe code to read and write unaligned
///
struct U64UnalignedSlice<'a> {
```
Contributor:

Another potential (future) optimization is to handle any unaligned leading bytes, then switch to aligned u64 operations, and finally clean up with the unaligned trailing bytes.

We would need some benchmarks to see if it was worthwhile.

```rust
// to preserve the bits outside the remainder
let rem = {
    // 1. Read the byte that we will override
    let current = start_remainder_mut_slice
```
Contributor:

This only reads 8 bits, but the comments (and logic) say that the remainder could be up to 63 bits. Shouldn't this be reading multiple bytes if remainder_len is greater than 8? Perhaps via `get_remainder_bits` 🤔

Member Author:

Added a comment: the slice always has length ceil(remainder_len / 8), so the last byte is always the boundary byte.

@alamb (Contributor) commented Nov 3, 2025

> Sure, tomorrow I will address the review

I am actively working on some changes to this PR (remove the bitwise ops, add some more docs). I will push commits in a few minutes

@alamb (Contributor) left a comment

Thank you again so much @rluvaton -- I went through this PR again very carefully and:

  1. Added more documentation and doc examples
  2. Added some tests for error/out of bound conditions
  3. Renamed the module to mutable/bitwise.rs

I think it is now ready to go

I will also file some follow-on tickets to track:

  1. Adding an apply_unary_op method to BooleanBufferBuilder
  2. Possibly adding something similar to BooleanArray

@alamb changed the title from "feat: add bitwise ops for BooleanBufferBuilder and for MutableBuffer" to "feat: add MutableBuffer::apply_unary_op and MutableBuffer::apply_binary_op" on Nov 3, 2025
@alamb (Contributor) commented Nov 3, 2025

On the walk home, I was thinking we could almost make these two free functions in bit_util.rs -- and then we could apply them to MutableBuffer using MutableBuffer::as_mut_slice() 🤔

@rluvaton (Member Author) commented Nov 3, 2025

> On the walk home, I was thinking we could almost make these two free functions in bit_util.rs -- and then we could apply them to MutableBuffer using MutableBuffer::as_mut_slice() 🤔

Fine by me, do you wanna push to this PR?

@alamb (Contributor) commented Nov 3, 2025

> On the walk home, I was thinking we could almost make these two free functions in bit_util.rs -- and then we could apply them to MutableBuffer using MutableBuffer::as_mut_slice() 🤔
>
> Fine by me, do you wanna push to this PR?

Yeah, I'll give it a try later tonight or tomorrow

@alamb (Contributor) commented Nov 5, 2025

starting to mess with this one now

@alamb (Contributor) left a comment

I moved all the code into bitwise_ops and I think we are heading in the right direction

The PR is smaller:

[screenshots: diff stats showing the reduced size of the PR]

And we can do all the same operations directly on MutableBuffer and BufferBuilder without having to expose MutableBuffer at all.

I also verified with my favorite test that this code is well covered:

cargo llvm-cov test --html --lib -p arrow-buffer

👍

I am going to do one final check that we can still use this API in this PR

But I am feeling really good about this PR now

@alamb changed the title from "feat: add MutableBuffer::apply_unary_op and MutableBuffer::apply_binary_op" to "feat: add apply_unary_op and apply_binary_op bitwise operations" on Nov 5, 2025
///
/// # Example: Modify buffer with offsets
/// ```
/// # use arrow_buffer::MutableBuffer;
Contributor:

BTW this example shows how these methods can still be used to modify a MutableBuffer in place.

But the functions are general and can work on any &mut [u8].

This also has the nice property that Rust prevents any mutation of the length or capacity, so we don't need to assert anymore.

@alamb (Contributor) commented Nov 5, 2025

The benchmarks on #8619 are (still) looking quite good

Labels

arrow (Changes to the arrow crate), performance

Development

Successfully merging this pull request may close these issues:

Add bitwise ops on BooleanBufferBuilder and MutableBuffer that mutate directly the buffer