-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-3665][GraphX] Java API for GraphX #3234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Adds JavaGraph, JavaVertexRDD, JavaEdgeRDD, and tests for these classes.
Test build #23285 has finished for PR 3234 at commit
|
Git fetch failure. Jenkins, retest this please |
Test build #23289 has finished for PR 3234 at commit
|
Test build #23288 has finished for PR 3234 at commit
|
Ok since this is so big, we should probably put this for 1.3 now :) |
Hi @ankurdave is this still targeted for 1.4 now? |
Would also like to know the status. This looks to be the last piece to graduate GraphX from alpha to stable re: SPARK-3664. |
@brennonyork graphx has already graduated from alpha. We will make sure this goes into 1.4. |
Mind tagging this with [GRAPHX] so it can get sorted properly? |
I had to add the Junit dependency in graphx/pom.xml to compile. Did you see this issue? We might have to update the pom file. -Kushal. |
@ankurdave what happened to this? should this PR be closed? |
Hey @ankurdave, it looks like this PR has gone stale, so do you mind closing it for now and re-opening it once you're ready to work on it again? |
Hi Josh, Ankur, We have a dependence on this story for PyGraphX. We are close to completing the PyGraphX work and would like this code to be reviewed and approved. Thanks, From: Josh Rosen [mailto:notifications@github.com] Hey @ankurdavehttps://github.com/ankurdave, it looks like this PR has gone stale, so do you mind closing it for now and re-opening it once you're ready to work on it again? — |
(For Jenkins:) Do you mind closing this PR then? |
Don't mind at all. We will submit the design doc of PyGraphX by early next week. Will close this PR after. |
@JoshRosen @srowen Sorry I didn't see your requests to close earlier. Closing now. |
…42.7.4 and `mssql` to 12.8.1.jre11 ### What changes were proposed in this pull request? This PR aims to upgrade `h2` to 2.3.232, `postgresql` to 42.7.4 and `mssql` to 12.8.1.jre11. ### Why are the changes needed? 1. For `h2`, there are some issues fixed in version 2.3.232(full release notes: https://www.h2database.com/html/changelog.html): - [Issue #3945](h2database/h2database#3945): Column not found in correlated subquery, when referencing outer column from LEFT JOIN .. ON clause - [Issue #4097](h2database/h2database#4097): StackOverflowException when using multiple SELECT statements in one query (2.3.230) - [Issue #3982](h2database/h2database#3982): Potential issue when using ROUND - [Issue #3894](h2database/h2database#3894): Race condition causing stale data in query last result cache - [Issue #4075](h2database/h2database#4075): infinite loop in compact - [Issue #4091](h2database/h2database#4091): Wrong case with linked table to postgresql - [Issue #4088](h2database/h2database#4088): BadGrammarException when the same alias is used within two different CTEs 2. For `postgresql`, there are some issues fixed and improvements in version 42.7.4(full release notes: https://jdbc.postgresql.org/changelogs/2024-08-22-42.7.4-release/): - fix: PgInterval ignores case for represented interval string [PR #3344](pgjdbc/pgjdbc#3344) - perf: Avoid extra copies when receiving int4 and int2 in PGStream [PR #3295](pgjdbc/pgjdbc#3295) - fix: Add support for Infinity::numeric values in ResultSet.getObject [PR #3304](pgjdbc/pgjdbc#3304) - fix: Ensure order of results for getDouble [PR #3301](pgjdbc/pgjdbc#3301) - perf: Replace BufferedOutputStream with unsynchronized PgBufferedOutputStream, allow configuring different Java and SO_SNDBUF buffer sizes [PR #3248](pgjdbc/pgjdbc#3248) - fix: Fix SSL tests [PR #3260](pgjdbc/pgjdbc#3260) - fix: Support bytea in preferQueryMode=simple [PR #3243](pgjdbc/pgjdbc#3243) - fix: Fix [Issue #3234](pgjdbc/pgjdbc#3234) - Return -1 as update count for stored procedure calls [PR #3235](pgjdbc/pgjdbc#3235) - fix: Fix [Issue #3224](pgjdbc/pgjdbc#3224) - conversion for TIME ‘24:00’ to LocalTime breaks in binary-mode [PR #3225](pgjdbc/pgjdbc#3225) 3. For `mssql`, there are some issues fixed in 12.8.1.jre11(full release notes: https://github.com/microsoft/mssql-jdbc/releases/tag/v12.8.1): - Adjusted DESTINATION_COL_METADATA_LOCK, in SQLServerBulkCopy, so that is properly released in all cases [PR #2492](microsoft/mssql-jdbc#2492) - Reverted "Execute Stored Procedures Directly" feature, as well as subsequent changes related to the feature [PR #2493](microsoft/mssql-jdbc#2493) - Changed driver behavior to allow prepared statement objects to be reused, preventing a "multiple queries are not allowed" error [PR #2494](microsoft/mssql-jdbc#2494) ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #47810 from wayneguow/ug_h2. Authored-by: Wei Guo <guow93@gmail.com> Signed-off-by: Kent Yao <yao@apache.org>
…42.7.4 and `mssql` to 12.8.1.jre11 ### What changes were proposed in this pull request? This PR aims to upgrade `h2` to 2.3.232, `postgresql` to 42.7.4 and `mssql` to 12.8.1.jre11. ### Why are the changes needed? 1. For `h2`, there are some issues fixed in version 2.3.232(full release notes: https://www.h2database.com/html/changelog.html): - [Issue apache#3945](h2database/h2database#3945): Column not found in correlated subquery, when referencing outer column from LEFT JOIN .. ON clause - [Issue apache#4097](h2database/h2database#4097): StackOverflowException when using multiple SELECT statements in one query (2.3.230) - [Issue apache#3982](h2database/h2database#3982): Potential issue when using ROUND - [Issue apache#3894](h2database/h2database#3894): Race condition causing stale data in query last result cache - [Issue apache#4075](h2database/h2database#4075): infinite loop in compact - [Issue apache#4091](h2database/h2database#4091): Wrong case with linked table to postgresql - [Issue apache#4088](h2database/h2database#4088): BadGrammarException when the same alias is used within two different CTEs 2. For `postgresql`, there are some issues fixed and improvements in version 42.7.4(full release notes: https://jdbc.postgresql.org/changelogs/2024-08-22-42.7.4-release/): - fix: PgInterval ignores case for represented interval string [PR apache#3344](pgjdbc/pgjdbc#3344) - perf: Avoid extra copies when receiving int4 and int2 in PGStream [PR apache#3295](pgjdbc/pgjdbc#3295) - fix: Add support for Infinity::numeric values in ResultSet.getObject [PR apache#3304](pgjdbc/pgjdbc#3304) - fix: Ensure order of results for getDouble [PR apache#3301](pgjdbc/pgjdbc#3301) - perf: Replace BufferedOutputStream with unsynchronized PgBufferedOutputStream, allow configuring different Java and SO_SNDBUF buffer sizes [PR apache#3248](pgjdbc/pgjdbc#3248) - fix: Fix SSL tests [PR apache#3260](pgjdbc/pgjdbc#3260) - fix: Support bytea in preferQueryMode=simple [PR apache#3243](pgjdbc/pgjdbc#3243) - fix: Fix [Issue apache#3234](pgjdbc/pgjdbc#3234) - Return -1 as update count for stored procedure calls [PR apache#3235](pgjdbc/pgjdbc#3235) - fix: Fix [Issue apache#3224](pgjdbc/pgjdbc#3224) - conversion for TIME ‘24:00’ to LocalTime breaks in binary-mode [PR apache#3225](pgjdbc/pgjdbc#3225) 3. For `mssql`, there are some issues fixed in 12.8.1.jre11(full release notes: https://github.com/microsoft/mssql-jdbc/releases/tag/v12.8.1): - Adjusted DESTINATION_COL_METADATA_LOCK, in SQLServerBulkCopy, so that is properly released in all cases [PR apache#2492](microsoft/mssql-jdbc#2492) - Reverted "Execute Stored Procedures Directly" feature, as well as subsequent changes related to the feature [PR apache#2493](microsoft/mssql-jdbc#2493) - Changed driver behavior to allow prepared statement objects to be reused, preventing a "multiple queries are not allowed" error [PR apache#2494](microsoft/mssql-jdbc#2494) ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47810 from wayneguow/ug_h2. Authored-by: Wei Guo <guow93@gmail.com> Signed-off-by: Kent Yao <yao@apache.org>
…42.7.4 and `mssql` to 12.8.1.jre11 ### What changes were proposed in this pull request? This PR aims to upgrade `h2` to 2.3.232, `postgresql` to 42.7.4 and `mssql` to 12.8.1.jre11. ### Why are the changes needed? 1. For `h2`, there are some issues fixed in version 2.3.232(full release notes: https://www.h2database.com/html/changelog.html): - [Issue apache#3945](h2database/h2database#3945): Column not found in correlated subquery, when referencing outer column from LEFT JOIN .. ON clause - [Issue apache#4097](h2database/h2database#4097): StackOverflowException when using multiple SELECT statements in one query (2.3.230) - [Issue apache#3982](h2database/h2database#3982): Potential issue when using ROUND - [Issue apache#3894](h2database/h2database#3894): Race condition causing stale data in query last result cache - [Issue apache#4075](h2database/h2database#4075): infinite loop in compact - [Issue apache#4091](h2database/h2database#4091): Wrong case with linked table to postgresql - [Issue apache#4088](h2database/h2database#4088): BadGrammarException when the same alias is used within two different CTEs 2. For `postgresql`, there are some issues fixed and improvements in version 42.7.4(full release notes: https://jdbc.postgresql.org/changelogs/2024-08-22-42.7.4-release/): - fix: PgInterval ignores case for represented interval string [PR apache#3344](pgjdbc/pgjdbc#3344) - perf: Avoid extra copies when receiving int4 and int2 in PGStream [PR apache#3295](pgjdbc/pgjdbc#3295) - fix: Add support for Infinity::numeric values in ResultSet.getObject [PR apache#3304](pgjdbc/pgjdbc#3304) - fix: Ensure order of results for getDouble [PR apache#3301](pgjdbc/pgjdbc#3301) - perf: Replace BufferedOutputStream with unsynchronized PgBufferedOutputStream, allow configuring different Java and SO_SNDBUF buffer sizes [PR apache#3248](pgjdbc/pgjdbc#3248) - fix: Fix SSL tests [PR apache#3260](pgjdbc/pgjdbc#3260) - fix: Support bytea in preferQueryMode=simple [PR apache#3243](pgjdbc/pgjdbc#3243) - fix: Fix [Issue apache#3234](pgjdbc/pgjdbc#3234) - Return -1 as update count for stored procedure calls [PR apache#3235](pgjdbc/pgjdbc#3235) - fix: Fix [Issue apache#3224](pgjdbc/pgjdbc#3224) - conversion for TIME ‘24:00’ to LocalTime breaks in binary-mode [PR apache#3225](pgjdbc/pgjdbc#3225) 3. For `mssql`, there are some issues fixed in 12.8.1.jre11(full release notes: https://github.com/microsoft/mssql-jdbc/releases/tag/v12.8.1): - Adjusted DESTINATION_COL_METADATA_LOCK, in SQLServerBulkCopy, so that is properly released in all cases [PR apache#2492](microsoft/mssql-jdbc#2492) - Reverted "Execute Stored Procedures Directly" feature, as well as subsequent changes related to the feature [PR apache#2493](microsoft/mssql-jdbc#2493) - Changed driver behavior to allow prepared statement objects to be reused, preventing a "multiple queries are not allowed" error [PR apache#2494](microsoft/mssql-jdbc#2494) ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47810 from wayneguow/ug_h2. Authored-by: Wei Guo <guow93@gmail.com> Signed-off-by: Kent Yao <yao@apache.org>
…42.7.4 and `mssql` to 12.8.1.jre11 ### What changes were proposed in this pull request? This PR aims to upgrade `h2` to 2.3.232, `postgresql` to 42.7.4 and `mssql` to 12.8.1.jre11. ### Why are the changes needed? 1. For `h2`, there are some issues fixed in version 2.3.232(full release notes: https://www.h2database.com/html/changelog.html): - [Issue apache#3945](h2database/h2database#3945): Column not found in correlated subquery, when referencing outer column from LEFT JOIN .. ON clause - [Issue apache#4097](h2database/h2database#4097): StackOverflowException when using multiple SELECT statements in one query (2.3.230) - [Issue apache#3982](h2database/h2database#3982): Potential issue when using ROUND - [Issue apache#3894](h2database/h2database#3894): Race condition causing stale data in query last result cache - [Issue apache#4075](h2database/h2database#4075): infinite loop in compact - [Issue apache#4091](h2database/h2database#4091): Wrong case with linked table to postgresql - [Issue apache#4088](h2database/h2database#4088): BadGrammarException when the same alias is used within two different CTEs 2. For `postgresql`, there are some issues fixed and improvements in version 42.7.4(full release notes: https://jdbc.postgresql.org/changelogs/2024-08-22-42.7.4-release/): - fix: PgInterval ignores case for represented interval string [PR apache#3344](pgjdbc/pgjdbc#3344) - perf: Avoid extra copies when receiving int4 and int2 in PGStream [PR apache#3295](pgjdbc/pgjdbc#3295) - fix: Add support for Infinity::numeric values in ResultSet.getObject [PR apache#3304](pgjdbc/pgjdbc#3304) - fix: Ensure order of results for getDouble [PR apache#3301](pgjdbc/pgjdbc#3301) - perf: Replace BufferedOutputStream with unsynchronized PgBufferedOutputStream, allow configuring different Java and SO_SNDBUF buffer sizes [PR apache#3248](pgjdbc/pgjdbc#3248) - fix: Fix SSL tests [PR apache#3260](pgjdbc/pgjdbc#3260) - fix: Support bytea in preferQueryMode=simple [PR apache#3243](pgjdbc/pgjdbc#3243) - fix: Fix [Issue apache#3234](pgjdbc/pgjdbc#3234) - Return -1 as update count for stored procedure calls [PR apache#3235](pgjdbc/pgjdbc#3235) - fix: Fix [Issue apache#3224](pgjdbc/pgjdbc#3224) - conversion for TIME ‘24:00’ to LocalTime breaks in binary-mode [PR apache#3225](pgjdbc/pgjdbc#3225) 3. For `mssql`, there are some issues fixed in 12.8.1.jre11(full release notes: https://github.com/microsoft/mssql-jdbc/releases/tag/v12.8.1): - Adjusted DESTINATION_COL_METADATA_LOCK, in SQLServerBulkCopy, so that is properly released in all cases [PR apache#2492](microsoft/mssql-jdbc#2492) - Reverted "Execute Stored Procedures Directly" feature, as well as subsequent changes related to the feature [PR apache#2493](microsoft/mssql-jdbc#2493) - Changed driver behavior to allow prepared statement objects to be reused, preventing a "multiple queries are not allowed" error [PR apache#2494](microsoft/mssql-jdbc#2494) ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47810 from wayneguow/ug_h2. Authored-by: Wei Guo <guow93@gmail.com> Signed-off-by: Kent Yao <yao@apache.org>
Adds JavaGraph (which also contains methods from GraphOps), JavaVertexRDD, JavaEdgeRDD, and tests for these classes.