[SPARK-2550][MLLIB][APACHE SPARK] Support regularization and intercept in pyspark's linear methods #1775

Closed
wants to merge 3 commits into from

miccagiann
Contributor

Related to Jira Issue: SPARK-2550

@miccagiann
Contributor Author

Jenkins, test this please.

@SparkQA

SparkQA commented Aug 5, 2014

QA tests have started for PR 1775. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17899/consoleFull

@SparkQA

SparkQA commented Aug 5, 2014

QA tests have started for PR 1775. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17900/consoleFull

@SparkQA

SparkQA commented Aug 5, 2014

QA results for PR 1775:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17899/consoleFull

@SparkQA

SparkQA commented Aug 5, 2014

QA results for PR 1775:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17900/consoleFull

@miccagiann
Contributor Author

Jenkins reports a Hive unit test that did not pass:

SET commands semantics for a HiveContext *** FAILED ***
[info] Expected Array("spark.sql.key.usedfortestonly=test.val.0", "spark.sql.key.usedfortestonlyspark.sql.key.usedfortestonly=test.val.0test.val.0"), but got Array("spark.sql.key.usedfortestonlyspark.sql.key.usedfortestonly=test.val.0test.val.0", "spark.sql.key.usedfortestonly=test.val.0") (HiveQuerySuite.scala:473)
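Reading the message closely, the expected and actual arrays appear to contain the same two strings in opposite order, which points at an order-sensitive assertion rather than at the MLlib changes. The following is a minimal Python sketch of that comparison, not the Scala test itself; the variable names are illustrative and the strings are copied from the failure message.

```python
# Strings copied from the HiveQuerySuite failure message.
expected = [
    "spark.sql.key.usedfortestonly=test.val.0",
    "spark.sql.key.usedfortestonlyspark.sql.key.usedfortestonly=test.val.0test.val.0",
]
actual = [
    "spark.sql.key.usedfortestonlyspark.sql.key.usedfortestonly=test.val.0test.val.0",
    "spark.sql.key.usedfortestonly=test.val.0",
]

# An order-sensitive comparison fails, as in the Jenkins report.
same_order = expected == actual

# An order-insensitive comparison shows the contents are identical.
same_elements = sorted(expected) == sorted(actual)

print(same_order, same_elements)
```

If the two checks disagree, the difference lies only in element order.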

@miccagiann
Contributor Author

All the MLlib-related tests passed the first time Jenkins ran the suite. Is the Hive unit test failure mentioned above related to this patch?

In addition, the second Jenkins run shows errors from streaming-flume/test and from mllib/test. Digging further into the output of these tests turns up the following exception:
Cause: java.net.BindException: Address already in use

@miccagiann
Contributor Author

Jenkins, retest this please.

@SparkQA

SparkQA commented Aug 5, 2014

QA tests have started for PR 1775. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17912/consoleFull

@SparkQA

SparkQA commented Aug 5, 2014

QA results for PR 1775:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17912/consoleFull

@miccagiann
Contributor Author

Found the error. It was a typo. Let's see what Jenkins is going to say...

@SparkQA

SparkQA commented Aug 5, 2014

QA tests have started for PR 1775. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17918/consoleFull

@SparkQA

SparkQA commented Aug 5, 2014

QA results for PR 1775:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17918/consoleFull

SVMAlg.optimizer
  .setNumIterations(numIterations)
  .setRegParam(regParam)
  .setStepSize(stepSize)
Contributor

You forgot to set miniBatchFraction here

Contributor Author

Thanks! I am fixing it right now!
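To illustrate why the omission matters: in a fluent setter chain, any setter that is left out silently keeps its default, so a value supplied by the caller never reaches the optimizer. The sketch below uses a hypothetical Python class, `SketchOptimizer`, whose setter names only mirror the snippet under review. It is not Spark's optimizer, and the defaults shown are assumptions chosen for illustration.

```python
class SketchOptimizer:
    """Hypothetical fluent optimizer config; not Spark's GradientDescent."""

    def __init__(self):
        # Assumed defaults, for illustration only.
        self.num_iterations = 100
        self.reg_param = 0.0
        self.step_size = 1.0
        self.mini_batch_fraction = 1.0

    def set_num_iterations(self, value):
        self.num_iterations = value
        return self  # returning self is what makes chaining possible

    def set_reg_param(self, value):
        self.reg_param = value
        return self

    def set_step_size(self, value):
        self.step_size = value
        return self

    def set_mini_batch_fraction(self, value):
        self.mini_batch_fraction = value
        return self


requested_fraction = 0.2

# Chain as in the reviewed snippet: the fraction is never forwarded.
incomplete = (SketchOptimizer()
              .set_num_iterations(50)
              .set_reg_param(0.1)
              .set_step_size(0.5))

# Chain with the missing setter added.
complete = (SketchOptimizer()
            .set_num_iterations(50)
            .set_reg_param(0.1)
            .set_step_size(0.5)
            .set_mini_batch_fraction(requested_fraction))

print(incomplete.mini_batch_fraction, complete.mini_batch_fraction)
```

Nothing fails in the incomplete chain, which is why this kind of omission is easy to miss without a review or a test that checks the forwarded value.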

@SparkQA

SparkQA commented Aug 5, 2014

QA tests have started for PR 1775. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17924/consoleFull

@SparkQA

SparkQA commented Aug 5, 2014

QA results for PR 1775:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17924/consoleFull

@mengxr
Contributor

mengxr commented Aug 5, 2014

LGTM. Merged into both master and branch-1.1. Thanks!!

asfgit pushed a commit that referenced this pull request Aug 6, 2014
…t in pyspark's linear methods

Related to Jira Issue: [SPARK-2550](https://issues.apache.org/jira/browse/SPARK-2550?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20AND%20priority%20%3D%20Major%20ORDER%20BY%20key%20DESC)

Author: Michael Giannakopoulos <miccagiann@gmail.com>

Closes #1775 from miccagiann/linearMethodsReg and squashes the following commits:

cb774c3 [Michael Giannakopoulos] MiniBatchFraction added in related PythonMLLibAPI java stubs.
81fcbc6 [Michael Giannakopoulos] Fixing a typo-error.
8ad263e [Michael Giannakopoulos] Adding regularizer type and intercept parameters to LogisticRegressionWithSGD and SVMWithSGD.

(cherry picked from commit 1aad911)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
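
As background on what a regularizer type and an intercept flag typically control in SGD-based linear models, here is a simplified pure-Python sketch of a single update step on logistic loss. It is not the code from this patch or Spark's updater implementation; the function name `sgd_step`, the `reg_type` values `"l1"`, `"l2"` and `None`, and the plain subgradient treatment of L1 are all assumptions made for illustration.

```python
import math


def sgd_step(weights, bias, x, y, step, reg_param, reg_type=None, intercept=False):
    """One simplified SGD step on logistic loss for one example, y in {0, 1}."""
    # The bias only contributes to the prediction when an intercept is fitted.
    margin = sum(w * xi for w, xi in zip(weights, x))
    if intercept:
        margin += bias
    # Derivative of the logistic loss with respect to the margin.
    error = 1.0 / (1.0 + math.exp(-margin)) - y

    new_weights = []
    for w, xi in zip(weights, x):
        grad = error * xi
        if reg_type == "l2":
            grad += reg_param * w  # gradient of (reg_param / 2) * w**2
        elif reg_type == "l1":
            grad += reg_param * ((w > 0) - (w < 0))  # subgradient of reg_param * |w|
        new_weights.append(w - step * grad)

    # The intercept term is left unregularized in this sketch.
    new_bias = bias - step * error if intercept else bias
    return new_weights, new_bias


w, b = sgd_step([1.0, -1.0], 0.0, x=[0.0, 0.0], y=1,
                step=1.0, reg_param=0.1, reg_type="l2", intercept=True)
print(w, b)
```

With `reg_type=None` the weights move only along the loss gradient, and with `intercept=False` the bias is neither used nor updated.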
@asfgit asfgit closed this in 1aad911 Aug 6, 2014
@miccagiann miccagiann deleted the linearMethodsReg branch August 25, 2014 16:27
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
…t in pyspark's linear methods

Related to Jira Issue: [SPARK-2550](https://issues.apache.org/jira/browse/SPARK-2550?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20AND%20priority%20%3D%20Major%20ORDER%20BY%20key%20DESC)

Author: Michael Giannakopoulos <miccagiann@gmail.com>

Closes apache#1775 from miccagiann/linearMethodsReg and squashes the following commits:

cb774c3 [Michael Giannakopoulos] MiniBatchFraction added in related PythonMLLibAPI java stubs.
81fcbc6 [Michael Giannakopoulos] Fixing a typo-error.
8ad263e [Michael Giannakopoulos] Adding regularizer type and intercept parameters to LogisticRegressionWithSGD and SVMWithSGD.