Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added collectFirst to Chain and NonEmptyChain #2796

Merged
merged 3 commits into from
May 1, 2019

Conversation

LMnet
Copy link
Contributor

@LMnet LMnet commented Apr 17, 2019

Fixed #2795

@kailuowang
Copy link
Contributor

@LMnet thanks so much for contributing to Cats.
The build failed on scalafmt check? I guess you ran sbt prePR as suggested by the PR template but didn't commit the change or sbt prePR didn't catch it?
collectFirst for these two classes are provided in Foldable, although your implementation could be more slightly more efficient. Would you add an override method to the foldable instance here

override def filter[A](fa: Chain[A])(f: A => Boolean): Chain[A] = fa.filter(f)
override def collect[A, B](fa: Chain[A])(f: PartialFunction[A, B]): Chain[B] = fa.collect(f)
override def mapFilter[A, B](fa: Chain[A])(f: A => Option[B]): Chain[B] = fa.collect(Function.unlift(f))
override def flattenOption[A](fa: Chain[Option[A]]): Chain[A] = fa.collect { case Some(a) => a }
def traverseFilter[G[_], A, B](fa: Chain[A])(f: A => G[Option[B]])(implicit G: Applicative[G]): G[Chain[B]] =
?
Another thing that might be too much to ask:
If we override collectFirst in Foldable, shall we also override collectFirstSome?

…irst and collectFirstSome in the Foldable instances of Chain and NonEmptyChain.
@LMnet
Copy link
Contributor Author

LMnet commented Apr 18, 2019

@kailuowang Thanks for the fast review. I added collectFirstSome to both Chain and NonEmptyChain, and overrode new methods in the Foldable instances.

@codecov-io
Copy link

codecov-io commented Apr 18, 2019

Codecov Report

Merging #2796 into master will decrease coverage by 0.1%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2796      +/-   ##
==========================================
- Coverage   94.28%   94.17%   -0.11%     
==========================================
  Files         367      368       +1     
  Lines        6926     6933       +7     
  Branches      297      303       +6     
==========================================
- Hits         6530     6529       -1     
- Misses        396      404       +8
Impacted Files Coverage Δ
core/src/main/scala/cats/data/Chain.scala 99.62% <100%> (+0.02%) ⬆️
core/src/main/scala/cats/data/NonEmptyChain.scala 98.21% <100%> (+0.06%) ⬆️
...in/scala/cats/kernel/instances/StaticMethods.scala 73.8% <0%> (-19.05%) ⬇️
laws/src/main/scala/cats/laws/discipline/Eq.scala 33.33% <0%> (ø) ⬆️
testkit/src/main/scala/cats/tests/CatsSuite.scala 33.33% <0%> (ø) ⬆️
...ws/src/main/scala/cats/kernel/laws/OrderLaws.scala
...laws/src/main/scala/cats/kernel/laws/package.scala
.../main/scala/cats/kernel/laws/SemilatticeLaws.scala
...ernel/laws/discipline/CommutativeMonoidTests.scala
...in/scala/cats/kernel/laws/discipline/EqTests.scala
... and 57 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 65bee09...5f5f63e. Read the comment docs.

* instead of `PartialFunction`s.
*/
final def collectFirstSome[B](f: A => Option[B]): Option[B] =
collectFirst(Function.unlift(f))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we reimplement using foreachUntil? The whole purpose of overriding the Foldable one is the performance gain, so I'd rather be sure that it reduces overhead to the minimum.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure that this will give some noticeable performance improvement. I will try to benchmark this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks I am curious to see the benchmark foreachUntil implementation of collectFrist v.s. the vanilla implementation from Foldable using foldRight

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I finally find time for the benchmark.

I made a separate branch for benchmarking. Here is the commit with the code of the benchmark.

I removed overrides from the Foldable instance to compare the performance of the default implementation of collectFirst/collectFirstSome from the Foldable. Also, I added collectFirstSome2 with the specialized version of collectFirstSome, based on foreachUntil and without Function.unlift.

I benchmarked small (size 5), medium (size 100) and large (size 1000000) chains.

I ran benchmarks with the following command: jmh:run -i 10 -wi 10 -f1 -t1 .*ChainCollectFirst.*

Benchmarks

Benchmark                                               Mode  Cnt         Score       Error  Units
ChainCollectFirstBench.collectFirstSmallFoldable       thrpt   10   7299195.442 ± 55930.435  ops/s
ChainCollectFirstBench.collectFirstSmallNew            thrpt   10  19402233.046 ± 77203.919  ops/s

ChainCollectFirstBench.collectFirstMediumFoldable      thrpt   10    349714.002 ±  1802.971  ops/s
ChainCollectFirstBench.collectFirstMediumNew           thrpt   10   1065836.106 ± 11337.181  ops/s

ChainCollectFirstBench.collectFirstLargeFoldable       thrpt   10    377742.008 ±  1213.396  ops/s
ChainCollectFirstBench.collectFirstLargeNew            thrpt   10   1041808.061 ±  8960.744  ops/s

ChainCollectFirstBench.collectFirstSomeSmallFoldable   thrpt   10   7260583.162 ± 66323.933  ops/s
ChainCollectFirstBench.collectFirstSomeSmallNew        thrpt   10  15223500.005 ± 70156.918  ops/s
ChainCollectFirstBench.collectFirstSome2SmallNew       thrpt   10  23795542.657 ± 53002.514  ops/s

ChainCollectFirstBench.collectFirstSomeMediumFoldable  thrpt   10    376093.351 ±  2306.020  ops/s
ChainCollectFirstBench.collectFirstSomeMediumNew       thrpt   10   1250584.212 ±  8875.084  ops/s
ChainCollectFirstBench.collectFirstSome2MediumNew      thrpt   10   1412017.858 ± 13673.273  ops/s

ChainCollectFirstBench.collectFirstSomeLargeFoldable   thrpt   10    392443.540 ±  2350.104  ops/s
ChainCollectFirstBench.collectFirstSomeLargeNew        thrpt   10   1243549.960 ± 13598.336  ops/s
ChainCollectFirstBench.collectFirstSome2LargeNew       thrpt   10   1399553.286 ±  7312.677  ops/s
  • *Foldable is the default implementation from Foldable.
  • *New is new implementation from my PR
  • collectFirstSome2*New is collectFirstSome based on foreachUntil and without Function.unlift

Results

  1. Implementations of collectFirst/collectFirstSome from my PR is few times faster than defaults for any Chain size.
  2. collectFirstSome based on collectFirst and Function.unlift is ~2 times slower than collectFirstSome based on foreachUntil for the small chains.
  3. collectFirstSome based on collectFirst and Function.unlift has almost same performance as collectFirstSome based on foreachUntil for the medium and large chains.

Conclusion

I changed collectFirstSome implementation to foreachUntil. I added a new commit to this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome! thanks so much for this contribution.

Copy link
Member

@LukaJCB LukaJCB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great PR, including benchmarks, really valuable! Thanks so much :)

@kailuowang kailuowang added this to the 2.0.0-RC1 milestone May 1, 2019
@kailuowang kailuowang merged commit 422c989 into typelevel:master May 1, 2019
@LMnet LMnet deleted the collectFirst branch May 1, 2019 20:12
@kailuowang kailuowang modified the milestones: 2.0.0-RC1, 2.0.0-M2 Jun 4, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missing collectFirst on Chain and NonEmptyChain
4 participants