-
Notifications
You must be signed in to change notification settings - Fork 489
chore: upgrade datafusion to 50.0.0 and arrow to 56.1 #4751
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
bbfcabc to
a29bc29
Compare
|
@ddupg Could you help fix the build issues? |
@LuQQiu Sure, now blocked in datafusion-python is not published to pypi, but it may take a few weeks apache/datafusion#16799 (comment) |
Hmm...we use the FFI for our boundary between datafusion-python and lance rust which in theory has a stable ABI. Do we know why we have a hard pin on the datafusion-python version? Do the tests pass if we loosen the pin? |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4751 +/- ##
==========================================
+ Coverage 80.85% 81.38% +0.52%
==========================================
Files 333 333
Lines 131944 130020 -1924
Branches 131944 130020 -1924
==========================================
- Hits 106687 105811 -876
+ Misses 21508 20653 -855
+ Partials 3749 3556 -193
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| let mut added_byte_size = stats | ||
| .num_rows | ||
| .map(|n| (n * 8).max(64)) | ||
| .map(|n| n * 8) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Closes: apache/arrow-rs#7383
Related to #4577, maybe Json type has a requirement for datafusion version.
The currently failing tests are not related to datafusion version, I'm trying to fix it. |
|
The failing test python/tests/test_scalar_index.py::test_null_handling is seems to be caused by a bug of DataFusion rust 50.0.0, and I have raised an issue apache/datafusion#17681 to the DataFusion |
This is not a bug with DataFusion, it seems to be because Lance using indexes and DataFusion have different logic on how to handle NULL. After apache/datafusion#17088, datafusion simplifies the handling of NULL expressions, which affects the results of the test python/tests/test_scalar_index.py::test_null_handling. In rust/lance-index/src/scalar/expression.rs:create_filter_plan https://github.com/lancedb/lance/blob/5c60975b2c032314304ca1d38865d6eefde4d790/rust/lance-index/src/scalar/expression.rs#L1560, if the filter executed is but after and then Can @westonpace please give me some help on how to fix this test? |
|
This appears to be a duplicate of my PR: #4778 but since you're actively working on the issue I was just looking at with the null unit test failure, I can close my PR in favor of this one. |
|
Thank you @timsaucer for sharing this. I also chatted with @ddupg offline a bit, sorry I forgot to share the analysis here. I think we did not handle the case of a In the past because the right hand side is a |
I do want to point out that "fixing" the issue by handling the literal null boolean will still give different results than an equivalent SQL command. From a SQL perspective not returning the row with the null IS the correct behavior, even though I think it is counter intuitive. |
|
Thank @jackye1995 @timsaucer both for sharing and thinking, which helped me a lot
Agreed. My personal opinion is that root cause stems from the different semantics of handling NULL when indexing is enabled or not. This is weird behavior, so I think we should make semantics consistent first. I proposed a draft #4812 to solve this problem. This will fix the currently failed test. Secondly, the DataFusion upgrade exposed a situation where the index is not working as expected. DF 50 optimizes Expr of NULL handling, for example, the Expr |
|
I agree that we should migrate to the SQL behavior and I agree with @ddupg's summary. Shall I make a PR to fix the second issue? |
|
Also, there is a third issue, reported here, which will unfortunately be quite a bit trickier to fix. I don't think that one needs to block the DF upgrade however. |
c6493af to
a28bf28
Compare
|
looks like there are some additional errors, any clue? |
This unusual local path |
|
hmm this is quite strange, I don't see any change that could affect the windows path handling... any clue? @westonpace @wjones127 @timsaucer |
I think I've seen this happen in CI in this repo before, but having trouble finding an exact example from my PRs. In the past rerunning CI has resolved it. |
1597a63 to
ccac35f
Compare
|
These new failures I'm also seeing on PRs that are completely unrelated. example: https://github.com/lancedb/lance/actions/runs/18080818538/job/51443808660?pr=4834 |
Agreed, the test_zonemap_zone_size test was fixed on main |
|
I'll see if I can make sense of the Windows failures or at least see what changed |
9ed1a11 to
0cd5af9
Compare
|
I rebased the main branch code and reran the tests multiple times. This error consistently occurs in Windows. |
|
Tracked down the root cause at least. Still working on fix. The upgrade to DF caused a transitive dependency ( Anyways, I'm sure there is just some magic incantation of path manipulation that has to happen. My trial and error continues 🫠 |
| version = "2.5.7" | ||
| source = "registry+https://github.com/rust-lang/crates.io-index" | ||
| checksum = "32f8b686cadd1473f4bd0117a5d28d36b1ade384ea9b5069a1c40aefed7fda60" | ||
| checksum = "08bc136a29a3d1758e07a9cca267be308aeebf5cfd5a10f3f67ab2097683ef5b" | ||
| dependencies = [ | ||
| "form_urlencoded", | ||
| "idna", | ||
| "percent-encoding", | ||
| "serde", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there are issues filed upstream in both object-store and url crates, can we simply revert these lines of the lock file? Or we can pin url in the cargo.toml to 2.5.4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Datafusion asks for 2.5.7 specifically so we can't revert sadly 😢
|
I've got a fix (more of a workaround) in progress that avoids these tricky paths by reworking how we do temporary directories: #4860 |
| enable_options_value_normalization: false, | ||
| collect_spans: false, | ||
| map_string_types_to_utf8view: false, | ||
| default_null_ordering: NullOrdering::NullsMax, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we know if this was the same as before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw the PR apache/datafusion#16963, nulls_max is the default behavior.
| version = "2.5.7" | ||
| source = "registry+https://github.com/rust-lang/crates.io-index" | ||
| checksum = "32f8b686cadd1473f4bd0117a5d28d36b1ade384ea9b5069a1c40aefed7fda60" | ||
| checksum = "08bc136a29a3d1758e07a9cca267be308aeebf5cfd5a10f3f67ab2097683ef5b" | ||
| dependencies = [ | ||
| "form_urlencoded", | ||
| "idna", | ||
| "percent-encoding", | ||
| "serde", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Datafusion asks for 2.5.7 specifically so we can't revert sadly 😢
|
The Windows issue is fixed in #4860 if anyone is up for a review (I'm shamelessly 🎣 for reviewers) |
|
Ok, merged. If you rebase then hopefully you are good to go. |
Change-Id: Iedc2a3133d2caf8f9630edde5723618a244cb958
Change-Id: Ic0a4e492eb128316df95d04e3780547100b93de1
Change-Id: Iddf384fafe1a59157b780eea0ffa8e037599d372
Change-Id: I833aa60aee64fd06907c1de10a973d3db7c93ec3
0cd5af9 to
77f4aa7
Compare
Thank @westonpace very much for the fix, all the tests have passed after I rebased |
|
Could @jackye1995 @westonpace @timsaucer please help review this? |
jackye1995
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for pushing this through!
westonpace
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉 🎉 🎉


This PR is to update the following dependencies:
I am working on supporting GEO type #4678 , which is blocked by the DataFusion upgrade. Recently, DataFusion released 50 apache/datafusion#16799 , so follow up on the upgrade
Closes #4753