Skip to content

[Spark-13784][ML][WIP] Model export/import for spark.ml: RandomForests #12023

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

GayathriMurali
Copy link
Contributor

Please help review the code. I have the WIP included to make sure the changes look correct.

What changes were proposed in this pull request?

Model export/import for spark.ml RandomForests

@GayathriMurali
Copy link
Contributor Author

@yanboliang @jkbradley Please help review the code.

@GayathriMurali GayathriMurali changed the title [Spark 13784][ML][WIP] Model export/import for spark.ml: RandomForests [Spark-13784][ML][WIP] Model export/import for spark.ml: RandomForests Mar 29, 2016
@jkbradley
Copy link
Member

ok to test

@@ -136,7 +144,8 @@ object RandomForestClassifier {
* [[http://en.wikipedia.org/wiki/Random_forest Random Forest]] model for classification.
* It supports both binary and multiclass labels, as well as both continuous and categorical
* features.
* @param _trees Decision trees in the ensemble.
*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update your IntelliJ settings: Editor -> Code Style -> Scala -> ScalaDoc tab -> uncheck "Use scaladoc indent for leading asterisk."

You will probably need to manually correct these indentation changes for this PR, though.

@jkbradley
Copy link
Member

I made an initial pass with high-level comments. It will be important to reuse more code. I'll stay tuned for updates. Thanks!

@jkbradley
Copy link
Member

add to whitelist

@SparkQA
Copy link

SparkQA commented Mar 29, 2016

Test build #2708 has finished for PR 12023 at commit 68b9358.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Copy link
Member

@GayathriMurali Will you be able to update this soon? If not, I'd be happy to help out.

@GayathriMurali
Copy link
Contributor Author

@jkbradley I should be able to update this by tonight. Would that work?

@jkbradley
Copy link
Member

Yes, that sounds fine. Thanks!

@GayathriMurali
Copy link
Contributor Author

@jkbradley I am sorry, I am afraid I will not be able to complete tonight. Can you please help me with reusing SplitData/build code from DecisionTrees in RandomForests? I am not sure how to reuse case class defined within an object in another object.

@jkbradley
Copy link
Member

Sure, I'll send a PR to update this PR.

@jkbradley
Copy link
Member

@GayathriMurali This turns out to be messier than I would have expected because of the fact that the RF Estimators and Models do not share the same Params. I'd like to take over this PR if that's OK. I'll use your commits too, so you'll be an author on it.

I can send a PR soon. Would you mind helping to review the PR once I send it? Thanks a lot for your understanding.

@GayathriMurali
Copy link
Contributor Author

@jkbradley I was just about to ping you regarding this. I would definitely love to help out. I was out at Strata all week and couldn't get to this. Please let me know if you need anything else from me.

@jkbradley
Copy link
Member

@GayathriMurali Thanks a lot. I just opened [https://github.com//pull/12118]
Would you mind closing this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants