-
Notifications
You must be signed in to change notification settings - Fork 345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mutable MomentsState, more efficient + for double added to Moments #1050
Conversation
* Returns a [[Fold]] instance that uses `+` to accumulate deltas into this | ||
* [[Moments]] instance. | ||
*/ | ||
def fold: Fold[Double, Moments] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the style we used in ExpHist
, where the fold is an instance method that lets you fold into THIS Moments
, implemented by
_ => Moments.MomentsState.fromMoments(this)
Let me know if you want to just have an empty fold on Moments
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh... looking at it now, it seems a bit strange to me, but I can see the benefit of being able to start with a non empty value to resume some updates.
val delta_n = delta / countCombined | ||
val delta_n2 = delta_n * delta_n | ||
val delta_n3 = delta_n2 * delta_n | ||
val count_sq = count * count |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also killed the math.pow
calls everywhere.
@@ -111,6 +245,8 @@ object Moments { | |||
|
|||
val aggregator: MomentsAggregator.type = MomentsAggregator | |||
|
|||
val fold: Fold[Double, Moments] = momentsMonoid.zero.fold |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here we go, the zero fold.
val iter = items.toIterator | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this implementation lives in MomentsState
now.
@@ -100,7 +132,7 @@ class MomentsTest extends AnyWordSpec with Matchers { | |||
* Given a list of doubles, create a Moments object to hold the list's central moments. | |||
*/ | |||
def getMoments(xs: List[Double]): Moments = | |||
MomentsAggregator(xs) | |||
Moments.aggregator(xs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switched this so we get coverage on the aggregator
line :)
@@ -69,7 +69,7 @@ val sharedSettings = Seq( | |||
Nil | |||
} | |||
}, | |||
javacOptions ++= Seq("-target", "1.6", "-source", "1.6"), | |||
javacOptions ++= Seq("-target", "1.8", "-source", "1.8"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here is the build change.
Codecov Report
@@ Coverage Diff @@
## develop #1050 +/- ##
========================================
Coverage 89.71% 89.71%
========================================
Files 124 124
Lines 9914 9957 +43
Branches 497 501 +4
========================================
+ Hits 8894 8933 +39
- Misses 1020 1024 +4
Continue to review full report at Codecov.
|
algebird-core/src/main/scala/com/twitter/algebird/MomentsGroup.scala
Outdated
Show resolved
Hide resolved
* Returns a [[Fold]] instance that uses `+` to accumulate deltas into this | ||
* [[Moments]] instance. | ||
*/ | ||
def fold: Fold[Double, Moments] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh... looking at it now, it seems a bit strange to me, but I can see the benefit of being able to start with a non empty value to resume some updates.
* but we do it to avoid allocating a new Moments on every item in the | ||
* loop. the Monoid laws test that sum matches looping on plus | ||
*/ | ||
val countCombined = count + b.m0D |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we could use a variant of Kahan summation on countCombined, right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, and if we had the fold version, or KahanState
, it would fit nicely here.
algebird-core/src/main/scala/com/twitter/algebird/MomentsGroup.scala
Outdated
Show resolved
Hide resolved
algebird-test/src/test/scala/com/twitter/algebird/MomentsLaws.scala
Outdated
Show resolved
Hide resolved
Okay, fixed up @johnynek ! |
Benchmark results... I'm a little surprised at the winner here.
Wow, and here are the benchmarks that could run BEFORE this PR and #1049. Optimizing
|
This PR:
build.sbt
preventing my local jdk17 from compiling+
methods toMoments
allowing addition with otherMoments
or doublesMomentsState
for efficient mutable accumulation of aMoments
instance from iterables of either doubles or other moments instancesMomentsState
pulls the former mutable implementation fromsumOption
and adds a second one for mutably adding in new doubles.