Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove casts from Eval, fix stack overflow in Eval #3519

Merged
merged 5 commits into from
Jul 16, 2020

Conversation

johnynek
Copy link
Contributor

similar to #3518 we use the 2.12 ability to do tailrec when the generic parameter changes to remove the casts.

Instead of using List[Any => Eval[Any]] we have to introduce a new type-aligned stack of functions, but I make this private.

This simplification leads to one optimization: if we have a finished Memoize in FlatMap we don't need to loop(Now... we can directly apply the flatMap function.

Also, I noticed perhaps the source of stack overflows I've seen in the past: #2587

Copy link
Contributor Author

@johnynek johnynek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@non since you wrote this code long ago, I'd appreciate your input.

}
case xx =>
// note: xx could be a Defer so calling .value
// could blow the stack?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks unsafe to me. I don't see why we don't need to check for Defer here. I think this could be the cause of #2587

If this is safe, I think we need a comment as to why xx.value is okay.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@non do you know why we wouldn't do loop(xx, Many(c.run, fs)) when we have a Defer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so defer(e).flatMap(f) is safe because .flatMap checks for Defer and uses the thunk as start, BUT defer(defer(e)).flatMap(f) isn't safe because flatMap only unwraps one layer of Defer and then we wind up calling .value on a defer which blows the stack.

I reproed this failure mode in a test.

Should I combine with this code for the fix or make a follow up PR?

case f :: fs => loop(f(x.value), fs)
case Nil => x.value
case Many(f, fs) => loop(f(x.value), fs)
case Ident(ev) => ev(x.value)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the calls to x.value here are safe because Now, Later and Always don't do any recursion on Eval.

}
}

loop(e.asInstanceOf[L], Nil).asInstanceOf[A]
loop(e, Ident(implicitly[A <:< A]))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here we have 1 allocation more than we did before: Ident where as before we did Nil. But since this is one allocation per call to .value, I think this is well worth the cost considering how many allocations we already make (1 per depth at least to build up the user level stack).

case mm @ Memoize(eval) =>
mm.result match {
case Some(a) =>
loop(Now(a), c.run.asInstanceOf[C] :: fs)
loop(nextFs.first(a), nextFs.rest)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here is an optimization: since we know we already have a Many, we don't need to recurse.

@codecov-commenter
Copy link

codecov-commenter commented Jul 12, 2020

Codecov Report

Merging #3519 into master will decrease coverage by 0.04%.
The diff coverage is 94.44%.

@@            Coverage Diff             @@
##           master    #3519      +/-   ##
==========================================
- Coverage   91.54%   91.49%   -0.05%     
==========================================
  Files         386      386              
  Lines        8476     8915     +439     
  Branches      245      285      +40     
==========================================
+ Hits         7759     8157     +398     
- Misses        717      758      +41     

@johnynek
Copy link
Contributor Author

Here are the benchmarks before and after:

benchmarks 2.12:                                                                  
                                                
master branch:                                                          
[info] Benchmark                    Mode  Cnt      Score      Error  Units    
[info] TrampolineBench.eval        thrpt    5  15266.439 ± 1106.948  ops/s    
                
this PR:                            
[info] Benchmark              Mode  Cnt      Score      Error  Units    
[info] TrampolineBench.eval  thrpt    5  14891.105 ± 4289.588  ops/s  

@johnynek
Copy link
Contributor Author

I went ahead and added the fix for stack safety here.

BTW: this, IMO, shows the win of removing the casts. Without the casts, it was clearer to me to follow the logic and it was easier to see what was going on.

@johnynek johnynek changed the title remove casts from Eval remove casts from Eval, fix stack overflow in Eval Jul 12, 2020
@johnynek
Copy link
Contributor Author

fixes #2587

@@ -263,26 +263,13 @@ object Eval extends EvalInstances {
* of FlatMap nodes) or "value" (in the case of Now, Later, or
* Always nodes).
*/
@tailrec private def advance[A](fa: Eval[A]): Eval[A] =
fa match {
@tailrec private def advance[A](fa: Defer[A]): Eval[A] =
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a private method, and we only need to call it on Defer[A] to return a non-Defer result. This simplifies the code since I don't see any reason we need to deal with FlatMap as we did.

new Eval.FlatMap[A] {
type Start = compute.Start
val start: () => Eval[Start] = () => compute.start()
val run: Start => Eval[A] = s => advance1(compute.run(s))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@non can you see why we did this? Can you write a test that shows the need? Removing it is a bit more efficient since we don't add a level of stack to all FlatMaps we call, BUT note, we should only really need to call advance when we know it is a Defer (note I change the argument type).

case cc: FlatMap[c.Start] =>
val nextFs = Many(c.run, fs)
loop(cc.start(), Many(cc.run, nextFs))
case call: Defer[c.Start] =>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this branch needs to be added. Note I added a test to the PR the exercises this. Without this change we blow the stack.

@johnynek
Copy link
Contributor Author

@non can you remind us here why we have Defer at all? Why not just:

def defer[A](ea: => Eval[A]): Eval[A] =
  Unit.flatMap(_ => ea)

all the tests seem to pass and it would, I assume, improve performance of code using Defer since we have fewer branches to test.

Also, if we did this, in FlatMap we could have val start: Eval[Start] rather than val start: () => Eval[Start] so we avoid having to invoke another Function0 there.

We probably can't remove Defer as a class, but we can avoid using it and maybe the branch predictor would pick up on that fact. But our current benchmarks don't exercise defer so we'd have to spend effort on benchmarking to make this case.

* so calling .value does not trigger
* any flatMaps or defers
*/
sealed abstract class Leaf[A] extends Eval[A]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with this, we can use x: Leaf[A] below to make it clear calling .value is safe. This helps the reader review the def evaluate method and see that it is safe (we only call .value on Leafs).

it doesn't impact the performance (hopefully scala removes the type check when it can statically prove that it would always pass, and since this was in the last position of the match that is the case currently).

@johnynek
Copy link
Contributor Author

@travisbrown would you have time to look at this one? I know you have a lot on your plate.

@travisbrown
Copy link
Contributor

@johnynek Sure, I'll take a look today.

c.start() match {
case cc: FlatMap[_] =>
loop(cc.start().asInstanceOf[L], cc.run.asInstanceOf[C] :: c.run.asInstanceOf[C] :: fs)
case cc: FlatMap[c.Start] =>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure why we don't do:

case c: FlatMap[A1] =>
  loop(c.start(), Many(c.run, fs))

and let the rest of the code run.

That would be correct, but I assume @non inlined this because for a FlatMap we know that we have at least one item on the stack and if we do my suggestion above, we allocate a Many just to pattern match on it in many cases and discard it. So, by nesting the pattern match on the c.start() we often can often sometimes avoid the allocation and pattern match.

If @non has time to chime in, it would be appreciated, I could update a comment to make it clearer.

Copy link
Contributor

@travisbrown travisbrown left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I like the Leaf thing. I'm not convinced that it's a good idea to add a type to the public API just to make the safety of some internal code a little clearer (especially since that code is still pretty complicated). It also means that some matches that used to have a reasonable catch-all case now don't (although I guess it's unlikely that someone could add something to the Eval hierarchy without immediately running into MatchErrors if they mess this up).

I'm also not sure I see all that much value in replacing the old List[Any => Eval[Any]] approach with a proper type-aligned list, but I guess that since it's turned up bugs / optimizations I'm probably wrong about that.

👍 from me on this. We should publish a 2.2.0-RC2 once it's merged.

@johnynek
Copy link
Contributor Author

If I were designing Eval today I would make all subclasses private. It's a bit strange that the comments say that the details of Eval are opaque yet all the subclasses are public.

That said Leaf is somewhat useful to users: it is a case where calling .value is safe at any location.

For instance if you were doing a loop where you will call .value at the end you could conceivably match on Leaf and call .value at the beginning, e.g. some map2Eval implementations could possibly make use of it.

Note, they could have already matched on Now, Later or Always which are public but this unifies those three into a single subtype.

For me, I saw that there were some internal .value calls but I wasn't sure why that was safe: now you can see very clearly. I think that's a win. I even thought of giving Leaf a safeValue method to make it more clear that we only call that inside the evaluate loop. But I ultimately didn't.

Anyway, thanks for being flexible and accepting a design you didn't think was your first choice.

@johnynek johnynek merged commit 4eab3d6 into master Jul 16, 2020
@travisbrown travisbrown added this to the 2.2.0-RC2 milestone Jul 17, 2020
@larsrh larsrh deleted the oscar/remove_eval_casts branch September 19, 2020 07:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants