[SPARK-2823][GraphX]fix GraphX EdgeRDD zipPartitions #1763
Can one of the admins verify this patch?
ok to test
QA tests have started for PR 1763. This patch merges cleanly.
QA results for PR 1763:
Sorry for the delay on this. It would be great if the PR also added a unit test to reproduce the bug. I can add that if you don't have time.
If the users set “spark.default.parallelism” and the value is different with the EdgeRDD partition number, GraphX jobs will throw:
java.lang.IllegalArgumentException: Can't zip RDDs with unequal numbers of partitions
Author: luluorta <luluorta@gmail.com>
Closes #1763 from luluorta/fix-graph-zip and squashes the following commits:
8338961 [luluorta] fix GraphX EdgeRDD zipPartitions
(cherry picked from commit 9b225ac)
Signed-off-by: Ankur Dave <ankurdave@gmail.com>
Thanks! I added a test, verified that it failed before and succeeds now, and merged this into master, branch-1.1, and branch-1.0.
If the users set “spark.default.parallelism” and the value is different with the EdgeRDD partition number, GraphX jobs will throw:
java.lang.IllegalArgumentException: Can't zip RDDs with unequal numbers of partitions
Author: luluorta <luluorta@gmail.com>
Closes apache#1763 from luluorta/fix-graph-zip and squashes the following commits:
8338961 [luluorta] fix GraphX EdgeRDD zipPartitions
@ankurdave I think you missed this PR when you did [Extract interfaces for EdgeRDD and VertexRDD](https://github.com//pull/2530). SPARK-2823 was reopened due to this.
Thanks, @Earne. Actually we already had a method to customize the partition number of EdgeRDD by using I guess the better name for the param of
If users set “spark.default.parallelism” to a value that differs from the number of EdgeRDD partitions, GraphX jobs will throw:
java.lang.IllegalArgumentException: Can't zip RDDs with unequal numbers of partitions
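
For readers unfamiliar with this error, here is a minimal sketch of how it arises with plain RDDs. It does not go through the GraphX code path that this PR changes; it only zips two RDDs whose partition counts differ, which is the condition the exception reports. The object name `ZipMismatchSketch`, the helper `zipMismatchMessage`, and the partition counts 2 and 3 are illustrative assumptions, and the sketch assumes `spark-core` is on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.util.{Failure, Success, Try}

// Hypothetical helper object, not part of Spark or of this PR.
object ZipMismatchSketch {

  /** Zips a 2-partition RDD with a 3-partition RDD and returns the
    * IllegalArgumentException message, or None if no such error occurs. */
  def zipMismatchMessage(sc: SparkContext): Option[String] = {
    val left  = sc.parallelize(1 to 6, numSlices = 2)
    val right = sc.parallelize(1 to 6, numSlices = 3)

    // zipPartitions is lazy: building the zipped RDD does not fail.
    val zipped = left.zipPartitions(right) { (a, b) => a.zip(b) }

    // The partition counts are compared once the partitions are needed,
    // which happens here when collect() submits the job.
    Try(zipped.collect()) match {
      case Failure(e: IllegalArgumentException) => Option(e.getMessage)
      case Failure(other)                       => throw other
      case Success(_)                           => None
    }
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("zip-mismatch-sketch")
    val sc = new SparkContext(conf)
    try println(zipMismatchMessage(sc))
    finally sc.stop()
  }
}
```

Because the check is deferred until an action runs, a mismatch like this may only surface deep inside a job rather than where the RDDs are combined. The exact wording of the message can vary between Spark versions.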