Calling make_array() on struct literal causes panic #8867

Closed
devinjdangelo opened this issue Jan 15, 2024 · 3 comments · Fixed by #8884
Labels
bug Something isn't working

Comments

@devinjdangelo
Contributor

Describe the bug

Attempting to create a literal value which is an array of structs causes a panic.

To Reproduce

DataFusion CLI v34.0.0
❯ set datafusion.execution.parquet.allow_single_file_parallelism=false;
0 rows in set. Query took 0.001 seconds.

❯ copy (values (make_array(struct('foo',1)))) to '/tmp/test.parquet';
thread 'main' panicked at /home/dev/.cargo/registry/src/index.crates.io-6f17d22bba15001f/datafusion-common-34.0.0/src/scalar.rs:524:74:
called `Result::unwrap()` on an `Err` value: Internal("Unsupported data type in hasher: Struct([Field { name: \"c0\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"c1\", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }])")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Expected behavior

This should not result in a panic. If there is a problem with how I attempted to construct the array of structs, a more descriptive error message would help.

Additional context

I came across this while writing tests related to #8551, but this issue does not appear related to parallel parquet writing specifically.

The line of code generating the error implies an internal constraint is violated if we reach it.

https://github.com/apache/arrow-datafusion/blob/1dcdcd431187178d736cdd3a6c004204aa2faa14/datafusion/common/src/hash_utils.rs#L369-L374
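To illustrate the failure mode in the panic message above, here is a minimal sketch of how a fallible hashing helper can turn into a panic when its `Result` is unwrapped. It assumes the `unwrap()` sits inside a `Hash` implementation, which cannot return an error. `ToyType`, `ToyScalar`, and `try_hash_value` are hypothetical stand-ins, not DataFusion's actual types or functions.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical stand-ins for illustration only; not DataFusion's real types.
#[derive(Debug)]
enum ToyType {
    Int64,
    Struct,
}

struct ToyScalar {
    data_type: ToyType,
    value: i64,
}

// Fallible helper: returns Err for a data type the hasher does not support.
fn try_hash_value(s: &ToyScalar) -> Result<u64, String> {
    match s.data_type {
        ToyType::Int64 => {
            let mut h = DefaultHasher::new();
            s.value.hash(&mut h);
            Ok(h.finish())
        }
        ToyType::Struct => Err(format!(
            "Unsupported data type in hasher: {:?}",
            s.data_type
        )),
    }
}

impl Hash for ToyScalar {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash::hash has no way to return a Result, so the error is
        // unwrapped here. An unsupported type therefore panics.
        try_hash_value(self).unwrap().hash(state);
    }
}

fn main() {
    // Supported type: hashing succeeds.
    let ok = ToyScalar { data_type: ToyType::Int64, value: 1 };
    let mut h = DefaultHasher::new();
    ok.hash(&mut h);
    println!("hashed Int64: {}", h.finish());

    // Unsupported type: the unwrap panics; catch it to keep the demo running.
    let bad = ToyScalar { data_type: ToyType::Struct, value: 0 };
    let result = std::panic::catch_unwind(|| {
        let mut h = DefaultHasher::new();
        bad.hash(&mut h);
    });
    println!("struct hash panicked: {}", result.is_err());
}
```

In this sketch the only ways to avoid the panic are to support the type in the helper or to make sure unsupported values never reach the hashing path.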

@jayzhan211
Contributor

jayzhan211 commented Jan 15, 2024

I ran this on the main branch and could not reproduce the panic.

statement ok
set datafusion.execution.parquet.allow_single_file_parallelism=false;

query ?
copy (values (make_array(struct('foo',1)))) to '/tmp/test.parquet';
----
1

@alamb
Contributor

alamb commented Jan 15, 2024

Perhaps it was fixed by #8552

@devinjdangelo
Contributor Author

Thanks @jayzhan211, you are right. After merging main into my dev branch, the issue is fixed.

I did also notice #8873 which is why I tried make_array, so thanks for filing that.
