SPARK-5019 [MLlib] - GaussianMixtureModel exposes instances of MultivariateGauss... #4088

Closed
wants to merge 2 commits into from

tgaloppo
Contributor

This PR modifies GaussianMixtureModel to expose instances of MultivariateGaussian rather than separate mean and covariance arrays.
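
To illustrate the shape of the change, here is a minimal, self-contained sketch. It does not use the Spark classes: `Component`, `ParallelArrays`, `Mixture`, and `migrate` are hypothetical stand-ins invented for this example, and `Vector` here is the Scala standard-library collection rather than the MLlib type. The field names `weight`, `mu`, and `sigma` follow the original constructor in the diff, and `weights` and `gaussians` follow the names adopted later in this PR.

```scala
object GmmShapeDemo {
  // Hypothetical stand-in for one mixture component (mean plus covariance).
  final case class Component(mu: Vector[Double], sigma: Vector[Vector[Double]])

  // Shape before the change: parallel arrays that callers index in lockstep.
  final case class ParallelArrays(
      weight: Array[Double],
      mu: Array[Vector[Double]],
      sigma: Array[Vector[Vector[Double]]])

  // Shape after the change: each component carries its own mean and covariance.
  final case class Mixture(weights: Array[Double], gaussians: Array[Component])

  // Converts the old shape to the new one by zipping the parallel arrays.
  def migrate(old: ParallelArrays): Mixture = {
    require(
      old.weight.length == old.mu.length && old.mu.length == old.sigma.length,
      "weight, mu and sigma must have one entry per component")
    val components = old.mu.zip(old.sigma).map { case (mean, cov) => Component(mean, cov) }
    Mixture(old.weight, components)
  }

  // Example data: two 2-dimensional components with diagonal covariances.
  val before: ParallelArrays = ParallelArrays(
    weight = Array(0.3, 0.7),
    mu = Array(Vector(0.0, 0.0), Vector(5.0, 5.0)),
    sigma = Array(
      Vector(Vector(1.0, 0.0), Vector(0.0, 1.0)),
      Vector(Vector(2.0, 0.0), Vector(0.0, 2.0))))

  def main(args: Array[String]): Unit = {
    val after = migrate(before)
    // Each component is now read as one value instead of two parallel lookups.
    after.weights.zip(after.gaussians).foreach { case (w, g) =>
      println(s"weight=$w mu=${g.mu} sigma=${g.sigma}")
    }
  }
}
```

With the grouped shape, a caller that needs the mean and covariance of component `i` reads a single element, so the two values cannot drift out of step the way separately indexed arrays can.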

@SparkQA
SparkQA commented Jan 17, 2015

Test build #25707 has started for PR 4088 at commit 091e8da.

  • This patch merges cleanly.

@SparkQA
SparkQA commented Jan 17, 2015

Test build #25707 has finished for PR 4088 at commit 091e8da.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25707/

@tgaloppo changed the title from "SPARK-5019 - GaussianMixtureModel exposes instances of MultivariateGauss..." to "SPARK-5019 [MLlib] - GaussianMixtureModel exposes instances of MultivariateGauss..." on Jan 19, 2015
@@ -37,8 +37,9 @@ import org.apache.spark.mllib.util.MLUtils
  */
 class GaussianMixtureModel(
   val weight: Array[Double],
-  val mu: Array[Vector],
-  val sigma: Array[Matrix]) extends Serializable {
+  val gaussian: Array[MultivariateGaussian]) extends Serializable {
Member

Should "weight" and "gaussian" be plural ("weights" and "gaussians")?

@jkbradley
Member

@tgaloppo Other than the 1 comment, this looks good. Thanks!

… gaussians. Other sources modified accordingly.
@SparkQA
SparkQA commented Jan 20, 2015

Test build #25786 has started for PR 4088 at commit 3ef6c7f.

  • This patch merges cleanly.

@tgaloppo
Contributor Author

@jkbradley I considered making those plural for the initial commit. I guess I should have. Update has been made.

@SparkQA
SparkQA commented Jan 20, 2015

Test build #25786 has finished for PR 4088 at commit 3ef6c7f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25786/

@jkbradley
Member

@tgaloppo Thank you! LGTM

CC: @mengxr

@asfgit asfgit closed this in 23e2554 Jan 20, 2015
@mengxr
Contributor

mengxr commented Jan 20, 2015

Merged into master. Thanks!

bomeng pushed a commit to Huawei-Spark/spark that referenced this pull request Jan 21, 2015
…ariateGauss...

This PR modifies GaussianMixtureModel to expose instances of MultivariateGaussian rather than separate mean and covariance arrays.

Author: Travis Galoppo <tjg2107@columbia.edu>

Closes apache#4088 from tgaloppo/spark-5019 and squashes the following commits:

3ef6c7f [Travis Galoppo] In GaussianMixtureModel: Changed name of weight, gaussian to weights, gaussians.  Other sources modified accordingly.
091e8da [Travis Galoppo] SPARK-5019 - GaussianMixtureModel exposes instances of MultivariateGaussian rather than mean/covariance matrices
yaooqinn pushed a commit that referenced this pull request Aug 26, 2024
…42.7.4 and `mssql` to 12.8.1.jre11

### What changes were proposed in this pull request?

This PR aims to upgrade `h2` to 2.3.232, `postgresql` to 42.7.4 and `mssql` to 12.8.1.jre11.

### Why are the changes needed?

1. For `h2`, there are some issues fixed in version 2.3.232 (full release notes: https://www.h2database.com/html/changelog.html):

    - [Issue #3945](h2database/h2database#3945): Column not found in correlated subquery, when referencing outer column from LEFT JOIN .. ON clause
    - [Issue #4097](h2database/h2database#4097): StackOverflowException when using multiple SELECT statements in one query (2.3.230)
    - [Issue #3982](h2database/h2database#3982): Potential issue when using ROUND
    - [Issue #3894](h2database/h2database#3894): Race condition causing stale data in query last result cache
    - [Issue #4075](h2database/h2database#4075): infinite loop in compact
    - [Issue #4091](h2database/h2database#4091): Wrong case with linked table to postgresql
    - [Issue #4088](h2database/h2database#4088): BadGrammarException when the same alias is used within two different CTEs

2. For `postgresql`, there are some issues fixed and improvements in version 42.7.4 (full release notes: https://jdbc.postgresql.org/changelogs/2024-08-22-42.7.4-release/):

    - fix: PgInterval ignores case for represented interval string [PR #3344](pgjdbc/pgjdbc#3344)
    - perf: Avoid extra copies when receiving int4 and int2 in PGStream [PR #3295](pgjdbc/pgjdbc#3295)
    - fix: Add support for Infinity::numeric values in ResultSet.getObject [PR #3304](pgjdbc/pgjdbc#3304)
    - fix: Ensure order of results for getDouble [PR #3301](pgjdbc/pgjdbc#3301)
    - perf: Replace BufferedOutputStream with unsynchronized PgBufferedOutputStream, allow configuring different Java and SO_SNDBUF buffer sizes [PR #3248](pgjdbc/pgjdbc#3248)
    - fix: Fix SSL tests [PR #3260](pgjdbc/pgjdbc#3260)
    - fix: Support bytea in preferQueryMode=simple [PR #3243](pgjdbc/pgjdbc#3243)
    - fix: Fix [Issue #3234](pgjdbc/pgjdbc#3234) - Return -1 as update count for stored procedure calls [PR #3235](pgjdbc/pgjdbc#3235)
    - fix: Fix [Issue #3224](pgjdbc/pgjdbc#3224) - conversion for TIME ‘24:00’ to LocalTime breaks in binary-mode [PR #3225](pgjdbc/pgjdbc#3225)

3. For `mssql`, there are some issues fixed in 12.8.1.jre11 (full release notes: https://github.com/microsoft/mssql-jdbc/releases/tag/v12.8.1):

    - Adjusted DESTINATION_COL_METADATA_LOCK, in SQLServerBulkCopy, so that it is properly released in all cases [PR #2492](microsoft/mssql-jdbc#2492)
    - Reverted "Execute Stored Procedures Directly" feature, as well as subsequent changes related to the feature [PR #2493](microsoft/mssql-jdbc#2493)
    - Changed driver behavior to allow prepared statement objects to be reused, preventing a "multiple queries are not allowed" error [PR #2494](microsoft/mssql-jdbc#2494)

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass GA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #47810 from wayneguow/ug_h2.

Authored-by: Wei Guo <guow93@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
IvanK-db pushed a commit to IvanK-db/spark that referenced this pull request Sep 20, 2024
…42.7.4 and `mssql` to 12.8.1.jre11

attilapiros pushed a commit to attilapiros/spark that referenced this pull request Oct 4, 2024
…42.7.4 and `mssql` to 12.8.1.jre11

himadripal pushed a commit to himadripal/spark that referenced this pull request Oct 19, 2024
…42.7.4 and `mssql` to 12.8.1.jre11
