Skip to content

[SPARK-6518][MLlib][Example][DOC] Add example code and user guide for bisecting k-means #9952

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 9 commits into from

Conversation

yu-iskw
Copy link
Contributor

@yu-iskw yu-iskw commented Nov 25, 2015

This PR includes only an example code in order to finish it quickly.
I'll send another PR for the docs soon.

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Nov 25, 2015

Jenkins, retest this please

@SparkQA
Copy link

SparkQA commented Nov 25, 2015

Test build #46657 has finished for PR 9952 at commit 7c8005d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class Params(\n

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Nov 25, 2015

@jkbradley could you review this PR?

@yu-iskw yu-iskw changed the title [SPARK-6518][MLlib] Add example code and user guide for bisecting k-means [SPARK-6518][MLlib][Example] Add example code and user guide for bisecting k-means Nov 25, 2015
@jkbradley
Copy link
Member

When you modify this and [https://github.com//pull/9968] to use the include_example functionality, it may make sense to change this example code to use fixed parameters, rather than making it a command-line tool.

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Nov 27, 2015

All right. I'll fix it.

@SparkQA
Copy link

SparkQA commented Nov 27, 2015

Test build #46790 has finished for PR 9952 at commit 86f6085.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Nov 27, 2015

@jkbradley could you review it?

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Dec 2, 2015

I moved the documentation to this PR from #9968. Because the PR depends on this PR.

I modified the docs for BisectingKMeans with include_example.

@SparkQA
Copy link

SparkQA commented Dec 3, 2015

Test build #47095 has finished for PR 9952 at commit 2f4c2a5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Dec 4, 2015

@jkbradley ping

@jkbradley
Copy link
Member

Thanks! I'll review today

@jkbradley
Copy link
Member

I'll go ahead and review this, but if you have time to add a Java example, that would be awesome--thanks!


Bisecting k-means is a kind of [hierarchical clustering](https://en.wikipedia.org/wiki/Hierarchical_clustering).
Hierarchical clustering is one of the most commonly used method of cluster analysis which seeks to build a hierarchy of clusters.
Strategies for hierarchical clustering generally fall into two types:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before this line (which is getting into details), it would be good to give a high-level description of the use of bisecting k-means. E.g., "Bisecting K-means can often be much faster than regular K-means, but it will generally produce a different clustering."

@jkbradley
Copy link
Member

That should be it.

@jkbradley
Copy link
Member

Also, can you please add the "[DOC]" tag to the PR title? Thanks!

@yu-iskw yu-iskw changed the title [SPARK-6518][MLlib][Example] Add example code and user guide for bisecting k-means [SPARK-6518][MLlib][Example][DOC] Add example code and user guide for bisecting k-means Dec 7, 2015
@yu-iskw
Copy link
Contributor Author

yu-iskw commented Dec 7, 2015

@jkbradley sure, done.

@jkbradley
Copy link
Member

Thanks, I'll check back for updates for the few remaining comments.

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Dec 10, 2015

ping @jkbradley

@jkbradley
Copy link
Member

@yu-iskw There are several comments of mine (above) which have not yet been addressed. Could you please update this PR based on those comments? I'll then take another look.

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Dec 14, 2015

Oh, I'm terribly sorry about that. Pushing the update was failed....

  • Modified what you pointed out
  • Add a Java example and its doc

@SparkQA
Copy link

SparkQA commented Dec 14, 2015

Test build #47639 has finished for PR 9952 at commit 7c36c1d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * public class JavaBisectingKMeansExample\n


import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would exclude JavaSparkContext and SparkConf.

@jkbradley
Copy link
Member

No problem. I just added some comments and am generating the docs now.

@jkbradley
Copy link
Member

The generated docs look good, so just those few comments.

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Dec 16, 2015

@jkbradley thanks for the review. I modified the wrong indentations and excluded the import statements for SparkConf and JavaSparkContext.

@SparkQA
Copy link

SparkQA commented Dec 16, 2015

Test build #47780 has finished for PR 9952 at commit 3066696.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * public class JavaBisectingKMeansExample\n

@jkbradley
Copy link
Member

LGTM, merging with master and branch-1.6
Thanks!

asfgit pushed a commit that referenced this pull request Dec 16, 2015
… bisecting k-means

This PR includes only an example code in order to finish it quickly.
I'll send another PR for the docs soon.

Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>

Closes #9952 from yu-iskw/SPARK-6518.

(cherry picked from commit 7b6dc29)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
@asfgit asfgit closed this in 7b6dc29 Dec 16, 2015
@yu-iskw
Copy link
Contributor Author

yu-iskw commented Dec 16, 2015

Thank you for merging it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants