[SPARK-2514] [mllib] Random RDD generator #1520
Looking for feedback on design decisions. Very rough draft and untested.
QA tests have started for PR 1520. This patch merges cleanly.
QA results for PR 1520:
Change `@return` to `Returns`. Otherwise the summary will be empty in the generated docs.
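To illustrate the comment above, here is a small sketch of the two doc-comment styles. The object and method names are hypothetical and are not taken from this PR. The claim about the empty summary is the reviewer's; the sketch only shows what the suggested change looks like.

```scala
import scala.util.Random

// Hypothetical example object; names are for illustration only.
object ScaladocStyleSketch {

  /**
   * @return `n` pseudo-random doubles in [0, 1) drawn with the given seed.
   */
  // Style the reviewer asks to avoid: the comment consists only of an @return tag.
  def withReturnTag(n: Int, seed: Long): Seq[Double] = {
    val rng = new Random(seed)
    Seq.fill(n)(rng.nextDouble())
  }

  /**
   * Returns `n` pseudo-random doubles in [0, 1) drawn with the given seed.
   */
  // Suggested style: the first sentence is ordinary prose starting with "Returns".
  def withSummarySentence(n: Int, seed: Long): Seq[Double] = {
    val rng = new Random(seed)
    Seq.fill(n)(rng.nextDouble())
  }
}
```

Both methods behave identically; only the doc comment differs.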
`i.i.d` -> `i.i.d.`, here and in other places
@dorx Besides comments, could you mark distribution generators and methods that require distribution generators
Jenkins, retest this please.
LGTM. Merged into master. Thanks for adding random RDD generators!!
Utilities for generating random RDDs.

`RandomRDD` and `RandomVectorRDD` are created instead of using `sc.parallelize(range:Range)` because `Range` objects in Scala can only have `size <= Int.MaxValue`.

The object `RandomRDDGenerators` can be transformed into a generator class to reduce the number of auxiliary methods for optional arguments.

Author: Doris Xin <doris.s.xin@gmail.com>

Closes apache#1520 from dorx/randomRDD and squashes the following commits:

01121ac [Doris Xin] reviewer comments
6bf27d8 [Doris Xin] Merge branch 'master' into randomRDD
a8ea92d [Doris Xin] Reviewer comments
063ea0b [Doris Xin] Merge branch 'master' into randomRDD
aec68eb [Doris Xin] newline
bc90234 [Doris Xin] units passed.
d56cacb [Doris Xin] impl with RandomRDD
92d6f1c [Doris Xin] solution for Cloneable
df5bcff [Doris Xin] Merge branch 'generator' into randomRDD
f46d928 [Doris Xin] WIP
49ed20d [Doris Xin] alternative poisson distribution generator
7cb0e40 [Doris Xin] fix for data inconsistency
8881444 [Doris Xin] RandomRDDGenerator: initial design
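To make the `Int.MaxValue` motivation concrete, here is a minimal sketch in plain Scala (no Spark dependency) of how a dataset whose total size is a `Long` could be split into per-partition sizes and generated lazily, so that no single `Range` has to index every element. The names `partitionSizes` and `partitionIterator`, and the idea of passing one seed per partition, are illustrative assumptions rather than the code in this PR.

```scala
import scala.util.Random

// Hypothetical sketch; not the RandomRDD implementation from this PR.
object PartitionedRandomSketch {

  // Split `size` elements across `numPartitions` so that partition sizes
  // differ by at most one. All arithmetic is done on Long values.
  def partitionSizes(size: Long, numPartitions: Int): Seq[Long] = {
    require(size >= 0 && numPartitions > 0, "size must be >= 0 and numPartitions > 0")
    val base = size / numPartitions
    val remainder = size % numPartitions
    (0 until numPartitions).map(i => if (i < remainder) base + 1 else base)
  }

  // Lazily produce `partitionSize` pseudo-random doubles from a seeded generator.
  // Nothing is materialized up front, so the count may exceed Int.MaxValue.
  def partitionIterator(partitionSize: Long, seed: Long): Iterator[Double] =
    new Iterator[Double] {
      private val rng = new Random(seed)
      private var produced = 0L
      def hasNext: Boolean = produced < partitionSize
      def next(): Double = {
        produced += 1
        rng.nextDouble()
      }
    }
}
```

For example, `PartitionedRandomSketch.partitionSizes(3000000000L, 4)` describes a total that is larger than `Int.MaxValue`, while each partition only carries a `Long` count and a seed.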
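The remark about turning `RandomRDDGenerators` into a generator class can be pictured with a small contrast. This is only a sketch under assumptions: the names `OverloadStyle` and `GeneratorStyle`, the default seed, and the method set are hypothetical and do not come from the PR. It shows how optional arguments held once on a class instance could replace one auxiliary overload per method.

```scala
import scala.util.Random

// Object style: each optional argument needs an extra auxiliary overload per method.
object OverloadStyle {
  val DefaultSeed: Long = 42L // hypothetical default, chosen for illustration

  def uniform(size: Int): Seq[Double] = uniform(size, DefaultSeed)
  def uniform(size: Int, seed: Long): Seq[Double] = {
    val rng = new Random(seed)
    Seq.fill(size)(rng.nextDouble())
  }

  def normal(size: Int): Seq[Double] = normal(size, DefaultSeed)
  def normal(size: Int, seed: Long): Seq[Double] = {
    val rng = new Random(seed)
    Seq.fill(size)(rng.nextGaussian())
  }
}

// Class style: the optional argument is supplied once at construction,
// so each distribution needs only a single method.
class GeneratorStyle(seed: Long = OverloadStyle.DefaultSeed) {
  def uniform(size: Int): Seq[Double] = {
    val rng = new Random(seed)
    Seq.fill(size)(rng.nextDouble())
  }

  def normal(size: Int): Seq[Double] = {
    val rng = new Random(seed)
    Seq.fill(size)(rng.nextGaussian())
  }
}
```

Usage would look like `new GeneratorStyle(seed = 7L).uniform(3)` in place of `OverloadStyle.uniform(3, 7L)`; both draw from an identically seeded `scala.util.Random`.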