Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stack safe traverse #3517

Closed
johnynek opened this issue Jul 12, 2020 · 4 comments · Fixed by #3521
Closed

stack safe traverse #3517

johnynek opened this issue Jul 12, 2020 · 4 comments · Fixed by #3521

Comments

@johnynek
Copy link
Contributor

I have thought a lot about stack safety. For Monad/FlatMap we have tailRecM, which seems to solve the problem and seems to be implementable by virtually all monads.

For Traverse, the story is harder: we are interacting two typeclasses: Traverse and Applicative. We can't specialize the operation on both (not easily at least). Generally, we have to write traverse we no assumption about the Applicative.

I noticed a while back in #2680 that TreeList with a tree structure does not blow the stack and wondered if we should add it here.

Recently I sent this PR to fs2:
typelevel/fs2#1952

which makes traverse on Chunk stack safe even for really hostile Applicative[F] such as Function0.

This approach could be copied here for Vector, List, Map, etc.. The idea is you linearize the collection, apply map/map2 on the tree and combine with Chain (which has O(1) concat), then finally you convert the Chain back to the collection in question in a final map.

There was no perf hit in the example I gave, but there is a trade-off. Some implementations of traverse can short circuit more effectively (especially for error-like F[_]s, things with empty values/zeros: Option/List/Either/Try/IO etc...).

Now, the final result is equivalent (due to the laws on Applicative), but the number of calls to the function passed in may not be the same.

@johnynek
Copy link
Contributor Author

It occurs to me, in the above PR I made a binary tree, but the approach works just as well with any number of children > 1. So, we could have say 16 or 32 children for each node and it will probably be faster:

we make blocks of 32, we do linear map2 on each of the items in that list, and then we recombine at the next layer up. This is still O(N log N) but the base is much larger so that in practice it may be faster to execute. This seems like it will likely be the best of both worlds.

Since we can actually handle pretty deep stacks, one can imagine using chains of say 128 or even 256.

@johnynek
Copy link
Contributor Author

I implemented the improved idea in fs2 with a fan out of 128 and it gave a 2x lift in performance for smaller traversals over the standard foldRight + vector based approach, and it also scales up to giant sizes without throwing (unlike some cases with foldRight).

I think this should be considered as the new standard.

In the cases where you can't access items in the F[_] in O(1) (unlike chunk), you should do a pass building up a Buffer[G[B]] (or similar approach to enable the recursion).

@johnynek
Copy link
Contributor Author

okay, I opened #3521 which implements this for all the cases I saw in cats.

@tnielens
Copy link
Contributor

Hello @johnynek,

I'm thinking about another approach to this issue.
As you describe it, tailRecM takes care of stack safety for monadic recursion. What about Applicative looping/iterating.
Can't we leverage map2Eval and the stack-safety of Eval here?

Here is an example, we can "stage" the function composition evaluation of Eval[Function0].map2Eval if we overwrite it like this:

      override def map2Eval[A, B, Z](fa: () => A, efb: Eval[() => B])(f: (A, B) => Z): Eval[() => Z] =
        Apply[Eval].map2(Eval.later(fa()), efb.flatMap(fb => Eval.later(fb())))((a, b) => () => f(a, b))

With this overwrite, a simple definition of traverse for List passes your new stacksafety test

      def traverse[G[_], A, B](fa: List[A])(f: A => G[B])(implicit G: Applicative[G]): G[List[B]] = {
        @tailrec
        def loop(la: List[A], gAcc: G[List[B]]): G[List[B]] = {
          la match {
            case head :: next => loop(next, G.map2Eval(gAcc, Eval.later(f(head)))((acc, a) => a :: acc).value)
            case Nil => gAcc
          }
        }
        G.map(loop(fa, G.pure(Nil)))(_.reverse)
      }

This rule could be encoded as a test/law on the Applicative typeclass. map2Eval must be stacksafe and not overflow on an iteration over a collection of X elements.

I hope this makes sense. Let me know what you think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants