Skip to content

Fix: Map functions crash on out of bounds cases #16203

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

krishvishal
Copy link

Which issue does this PR close?

Rationale for this change

This PR fixes a panic crash in array indexing operations when accessing elements from map_values() results that contain structs with Null-typed fields.

What changes are included in this PR?

  • datafusion/functions-nested/src/extract.rs: Added null field detection and safe handling

Are these changes tested?

Yes - unit tests are added covering:

  • Main crash scenario: Struct with Null-typed fields
  • Mixed null/valid fields, out-of-bounds access

Are there any user-facing changes?

No, only behavior change.

Detect if array elements contain structs with null-typed fields and
emit a proper null array instead of trying to extract malformed
data. This prevents Arrow's NullArray buffer validation from
panicking.
1. Reproduce the state in the issue and test the fix has worked.
2. Case where the array has mixed null fields.
3. Case with out-of-bounds array access.
@comphead comphead requested a review from Copilot May 28, 2025 15:14
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR resolves a crash occurring during array indexing when map functions encounter structs with Null-typed fields. Key changes include adding detection of structs containing Null fields in extract.rs and returning a properly typed null array, along with comprehensive tests for the crash scenario, mixed null/valid fields, and out-of-bounds access.

Comment on lines 232 to 236
let mut null_array_data = Vec::with_capacity(array.len());
for _ in 0..array.len() {
null_array_data.push(None::<i32>);
}

Copy link
Preview

Copilot AI May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable 'null_array_data' is initialized and populated but never used, which may lead to confusion. Consider removing this unused code to simplify the function.

Suggested change
let mut null_array_data = Vec::with_capacity(array.len());
for _ in 0..array.len() {
null_array_data.push(None::<i32>);
}

Copilot uses AI. Check for mistakes.

@@ -225,6 +225,23 @@ where
return Ok(Arc::new(NullArray::new(array.len())));
}

if let DataType::Struct(fields) = values.data_type() {
let has_null_fields = fields.iter().any(|field| field.data_type() == &Null);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please help me to understand why we check by structure field types?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, now I realize the problem happens only if the struct has with with null as the value

let has_null_fields = fields.iter().any(|field| field.data_type() == &Null);
if has_null_fields {
// Instead of trying to extract from malformed struct data and return appropriate nulls
let mut null_array_data = Vec::with_capacity(array.len());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to return I think we can return NullArray like here?

    if values.data_type().is_null() {
         return Ok(Arc::new(NullArray::new(array.len())));
     }

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @krishvishal
Please add SQL tests as well to array.slt file
what comes to my mind is

select [named_struct('a', 1, 'b', null)][0];
select [named_struct('a', 1, 'b', null)][1];
select [named_struct('a', 1, 'b', null)][-1];
select [named_struct('a', 1, 'b', null)][2]
select map_values(map([named_struct('a', 1, 'b', null)], [named_struct('a', 2, 'b', null)]))[0] 
select map_values(map([named_struct('a', 1, 'b', null)], [named_struct('a', 2, 'b', null)]))[1]
select map_values(map([named_struct('a', 1, 'b', null)], [named_struct('a', 2, 'b', null)]))[-1]
select map_values(map([named_struct('a', 1, 'b', null)], [named_struct('a', 2, 'b', null)]))[2]   
select map_keys(map([named_struct('a', 1, 'b', null)], [named_struct('a', 2, 'b', null)]))[0] 
select map_keys(map([named_struct('a', 1, 'b', null)], [named_struct('a', 2, 'b', null)]))[1]
select map_keys(map([named_struct('a', 1, 'b', null)], [named_struct('a', 2, 'b', null)]))[-1]
select map_keys(map([named_struct('a', 1, 'b', null)], [named_struct('a', 2, 'b', null)]))[2]   
;

@comphead
Copy link
Contributor

comphead commented Jun 2, 2025

@krishvishal are you still planning to wrap this PR up?

returning all nulls

Improves upon the previous fix that detected null-typed fields in
structs but incorrectly returned all null results. Now properly
extracts valid struct elements while handling null-typed fields
gracefully.

Changes:
- Replace blanket null return with handle_struct_with_null_fields()
- Process each struct field individually using MutableArrayData for valid fields
- Create proper NullArrays only for null-typed fields
- Maintain correct null buffer for the overall result

This allows queries like [named_struct('a', 1, 'b', null)][1] to correctly
return {a: 1, b: NULL} instead of NULL.

Fixes: apache#16187
acess of struct arrays with null fields
@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Jun 3, 2025
@krishvishal
Copy link
Author

krishvishal commented Jun 3, 2025

@comphead I've changed the implementation a bit to handle nulls properly. Previous fix just outputs NULL for queries like select [named_struct('a', 1, 'b', null)][1]; instead of {a: 1, b: NULL} .

I've also added tests to .array.slt file.

I think its ready.

Apologies for the delay.

@comphead
Copy link
Contributor

comphead commented Jun 3, 2025

Thanks @krishvishal the latest version becomes much more complicated compared to prev one. This can be a subject to check the performance.

What is the reason for adding the specific handler, some tests not passing?

@krishvishal
Copy link
Author

@comphead I've had to add the handler because the previous fix caused wrong behavior.

For example the following query currently returns:

> select [named_struct('a', 1, 'b', null)][1];

+-----------------------------------------------------------------------+
| make_array(named_struct(Utf8("a"),Int64(1),Utf8("b"),NULL))[Int64(1)] |
+-----------------------------------------------------------------------+
| {a: 1, b: NULL}                                                       |
+-----------------------------------------------------------------------+
1 row(s) fetched. 
Elapsed 0.007 seconds.

This is correct. But my previous fix (commit: d9a699f) returned NULL. So to fix the behavior while also handling out of bounds indexes, I've had to add this handler.

for (row_index, offset_window) in array.offsets().windows(2).enumerate() {
let start = offset_window[0];
let end = offset_window[1];
let len = end - start;

// array is null
if len == O::usize_as(0) {
if array.is_null(row_index) || len == O::usize_as(0) {
Copy link
Contributor

@comphead comphead Jun 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems these 2 if statements can be merged into 1?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding

is_null() checks if an element at an index is NULL.

2nd condition checks if the element is an empty list.

They are checking different things right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right I had a feeling we can merge them like

if array.is_null(row_index) || len == O::usize_as(0) || indexes.is_null(row_index) {
             mutable.extend_nulls(1);
             continue;
}

Copy link
Author

@krishvishal krishvishal Jun 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I understand what you mean. Yes they can be merged.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

i64: TryInto<O>,
{
let index: O = index.try_into().map_err(|_| {
DataFusionError::Execution(format!("array_element got invalid index: {index}"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DataFusionError::Execution(format!("array_element got invalid index: {index}"))
exec_err!("array_element got invalid index: {index}")

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

- merge conditionals
- use `exec_err!`
- Fix warning raised by clippy
@krishvishal
Copy link
Author

@comphead I've pushed additional fixes you've asked. I think the PR is ready. Let me know if it is okay.

@comphead
Copy link
Contributor

comphead commented Jun 4, 2025

Thanks @krishvishal let me quickly double check if we can do it with less data massaging. Since this will happen on execution layer it would be called for every batch of data possibly hitting the performance

@krishvishal
Copy link
Author

@comphead, can you please tell if there is something I can do to move this forward? Are there any relevant benches I could either run or adapt for this case?

@comphead
Copy link
Contributor

@comphead, can you please tell if there is something I can do to move this forward? Are there any relevant benches I could either run or adapt for this case?

Sorry I didn't have a chance to look into that, I'm planning to do it this week

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

List of structures crashes on out of bounds cases if struct has null value
2 participants