-
Notifications
You must be signed in to change notification settings - Fork 1k
Remove primitive map key assertion on record reader #7769
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Remove primitive map key assertion on record reader #7769
Conversation
2d8b0b9
to
ea6d5f6
Compare
ea6d5f6
to
edd9b70
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for correcting this oversight @njaremko. FWIW the arrow API handles the test file as expected. Just a few nits with the test.
} | ||
|
||
#[test] | ||
fn test_file_reader_rows_nullable1() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A more descriptive name here would be nice...test_compound_map_key
perhaps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lol, whoops, yeah, definitely need to fix this
|
||
#[test] | ||
fn test_file_reader_rows_nullable1() { | ||
let rows = test_file_reader_rows("databricks.parquet", None).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise, something more descriptive than "databricks".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was named this because it's a file generated by databricks with every supported column type, I'll look into regenerating it
] | ||
), | ||
( | ||
"map_nested".to_string(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the meat of the PR; would it be possible to only have this column in the test file? This is a lot to slog through and obscures the point of the test.
I'll update this PR with the requested changes once this gets merged: apache/parquet-testing#87 |
…_assertion_on_record_reader
THanks again @njaremko Given that we don't really control the timeline of the upstream parquet-testing repo, if you want to unblock this PR you could potentially copy the test file into the arrow-rs repo with a TODO comment that says to change to use the copy upstream in parquet-testing when apache/parquet-testing#87 is merged |
Marking as draft as I think this PR is no longer waiting on feedback and I am trying to make it easier to find PRs in need of review. Please mark it as ready for review when it is ready for another look |
Note:
This PR has a test that requires a file that needs to be upstreamed
Which issue does this PR close?
Rationale for this change
There's no such requirement in the parquet logical type specification for map types. Spark used to have a similar assertion, that they've since removed
What changes are included in this PR?
Getting rid of the assertion, and adding a test
Are there any user-facing changes?
No
If there are any breaking changes to public APIs, please call them out.