-
Notifications
You must be signed in to change notification settings - Fork 318
Updated the default partitioner to the AutoBucketPartitioner #135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
It uses the $bucketAuto aggregation stage and works for compound keys as well as single keys. SPARK-444
...ionTest/java/com/mongodb/spark/sql/connector/read/partitioner/AutoBucketPartitionerTest.java
Show resolved
Hide resolved
@NathanQingyangXu no need to download the code / run locally, you can rely on evergreen. |
Reopened PR - as no partitioners will support duplicated compound key values, I have added a warning for users if they have an older MongoDB version. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add two initial comments for you to consider; will review more thoroughly afterwards
...ntegrationTest/java/com/mongodb/spark/sql/connector/mongodb/MongoSparkConnectorTestCase.java
Outdated
Show resolved
Hide resolved
src/main/java/com/mongodb/spark/sql/connector/read/partitioner/AutoBucketPartitioner.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks great! I left some minor comments for you to consider.
src/main/java/com/mongodb/spark/sql/connector/read/partitioner/AutoBucketPartitioner.java
Outdated
Show resolved
Hide resolved
...ionTest/java/com/mongodb/spark/sql/connector/read/partitioner/AutoBucketPartitionerTest.java
Show resolved
Hide resolved
Added new test cases. Although strangely nested compound keys work pre 7.0, its better to hard fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
It uses the $bucketAuto aggregation stage and works for compound keys as well as single keys.
SPARK-444