Enable columnar shuffle by default #95

Closed
viirya opened this issue Feb 22, 2024 · 0 comments · Fixed by #250
Labels: enhancement (New feature or request)

viirya commented Feb 22, 2024

What is the problem the feature request solves?

COMET_COLUMNAR_SHUFFLE_ENABLED is the config Comet uses to decide whether columnar shuffle should be used instead of native shuffle when Comet shuffle is enabled.

Because columnar shuffle covers more use cases (hash, range, and single partitioning) than native shuffle, we should change the default of this config to prefer columnar shuffle.
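
To make the proposal concrete, here is a minimal sketch of the selection logic described above. It is not Comet's actual code: the two key strings and the `choose_shuffle` helper are hypothetical names made up for illustration, and the only thing it models is how flipping the default of the columnar flag changes which shuffle is picked.

```python
# Hypothetical config keys, for illustration only; the real Comet keys may differ.
SHUFFLE_ENABLED_KEY = "comet.shuffle.enabled"                    # assumption
COLUMNAR_SHUFFLE_ENABLED_KEY = "comet.columnar.shuffle.enabled"  # assumption


def choose_shuffle(conf, columnar_default=False):
    """Return which shuffle the two flags select: 'spark', 'native' or 'columnar'.

    `conf` is a plain dict standing in for the session config.
    `columnar_default` models the default of COMET_COLUMNAR_SHUFFLE_ENABLED.
    """
    # If Comet shuffle is off, neither Comet implementation is used.
    if not conf.get(SHUFFLE_ENABLED_KEY, False):
        return "spark"
    # Comet shuffle is on: the columnar flag picks columnar over native.
    if conf.get(COLUMNAR_SHUFFLE_ENABLED_KEY, columnar_default):
        return "columnar"
    return "native"


conf = {SHUFFLE_ENABLED_KEY: True}
print(choose_shuffle(conf))                         # native (default off)
print(choose_shuffle(conf, columnar_default=True))  # columnar (proposed default)
```

An explicit setting still wins over the default in this sketch, so users who want native shuffle could keep it by setting the flag to false.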

Note that this is blocked by the Java Arrow issue: apache/arrow#40038 (PR: apache/arrow#40043)

Currently, if columnar shuffle is enabled, running the TPC-DS queries with Comet fails with the error:

General execution error with reason org.apache.comet.CometNativeException: Fail to process Arrow array with reason C Data interface error: The external buffer at position 1 is null...

Describe the potential solution

No response

Additional context

No response
