Skip to content

Conversation

@ding-young
Copy link
Contributor

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Yes. Existing test already handles case for array of size 1

/// for each row in `arr`:
/// 1. convert to a `ScalarValue`
/// 2. Convert `ScalarValue` back to an `ArrayRef`
/// 3. Compare the original array (sliced) and new array for equality
fn round_trip_through_scalar(arr: ArrayRef) {
for i in 0..arr.len() {
// convert Scalar --> Array
let scalar = ScalarValue::try_from_array(&arr, i).unwrap();
let array = scalar.to_array_of_size(1).unwrap();
assert_eq!(array.len(), 1);
assert_eq!(array.data_type(), arr.data_type());
assert_eq!(array.as_ref(), arr.slice(i, 1).as_ref());
}
}

Are there any user-facing changes?

@github-actions github-actions bot added the common Related to common crate label Jul 7, 2025
Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ding-young looks good
can this PR also have a bench file or benchmark run results to understand the optimization in numbers?

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ding-young -- makes sense to me

@alamb
Copy link
Contributor

alamb commented Jul 7, 2025

FYI @findepi

@ding-young
Copy link
Contributor Author

Thanks @ding-young looks good can this PR also have a bench file or benchmark run results to understand the optimization in numbers?

Thank you @comphead

When I run cargo bench --bench to_array_size_of with criterion bench script (in main...ding-young:datafusion:scalar-struct-test), perf has improved compared to main branch.

I didn't add the bench script in this PR because I think we won't run this bench again after it gets merged.

image

Copy link
Member

@xudong963 xudong963 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good improvement, thank you

@xudong963 xudong963 merged commit 5db5fbc into apache:main Jul 8, 2025
27 checks passed
xudong963 pushed a commit to massive-com/arrow-datafusion that referenced this pull request Jul 17, 2025
@alamb alamb added the performance Make DataFusion faster label Jul 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

common Related to common crate performance Make DataFusion faster

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Optimize ColumnarValue::into_array / ScalarValue::to_array / ScalarValue::to_array_of_size

4 participants