Skip to content

Conversation

@a10y
Copy link
Contributor

@a10y a10y commented Oct 17, 2025

Turns out, arrow-rs/DataFusion lacks complete support for ListView. Until that is resolved, for now the default to_arrow_preferred should prefer returning List over ListView.

See https://github.com/vortex-data/vortex/actions/runs/18591096514/job/53006051293?pr=4977 for example failure

DataFusion lacks complete support for ListView. Until that is resolved,
for now the default to_arrow_preferred should prefer returning List
over ListView.

Signed-off-by: Andrew Duffy <andrew@a10y.dev>
@a10y a10y added the changelog/fix A bug fix label Oct 17, 2025
@connortsui20
Copy link
Contributor

connortsui20 commented Oct 17, 2025

I'm actually somewhat surprised that only 1 test failed since just changing the arrow dtype should also require that we need to make the conversions properly?

Edit: Oh right we match on the vortex and arrow type and use the correct conversion function

Signed-off-by: Andrew Duffy <andrew@a10y.dev>
@a10y
Copy link
Contributor Author

a10y commented Oct 17, 2025

This is just the preferred conversion. Anything that has an explicit conversion should still go to ListView

// (32-bit), Large List View (64-bit). We cannot both guarantee zero-copy and commit to an
// Arrow dtype because we do not how large our offsets are.
DType::List(elem_dtype, _) => DataType::ListView(FieldRef::new(Field::new_list_field(
DType::List(elem_dtype, _) => DataType::List(FieldRef::new(Field::new_list_field(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this deserves a comment noting that we are using List over ListView because datafusion doesn't really support it yet?

@codecov
Copy link

codecov bot commented Oct 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.27%. Comparing base (9b7f792) to head (4774c64).
⚠️ Report is 1 commits behind head on develop.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@robert3005
Copy link
Contributor

For posterity, it falls through here https://github.com/apache/arrow-rs/blob/main/arrow-select/src/filter.rs#L309 (where there could be an optimised impl for listviews) and then panics in https://github.com/apache/arrow-rs/blob/main/arrow-data/src/transform/mod.rs#L269

@a10y a10y merged commit 3ca875d into develop Oct 17, 2025
41 checks passed
@a10y a10y deleted the aduffy/canonicalize-list branch October 17, 2025 14:34
@a10y
Copy link
Contributor Author

a10y commented Oct 17, 2025

Yep, I'll push a fix to arrow soon

@a10y
Copy link
Contributor Author

a10y commented Oct 20, 2025

Fix to arrow: apache/arrow-rs#8645

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changelog/fix A bug fix

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants