Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added collectFirst to Chain and NonEmptyChain #2796

Merged
merged 3 commits into from
May 1, 2019
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Added collectFirstSome to Chain and NonEmptyChain. Overrided collectF…
…irst and collectFirstSome in the Foldable instances of Chain and NonEmptyChain.
  • Loading branch information
LMnet committed Apr 18, 2019
commit e78385df3454a13b424332b503150953bb442cdc
15 changes: 12 additions & 3 deletions core/src/main/scala/cats/data/Chain.scala
Original file line number Diff line number Diff line change
Expand Up @@ -146,9 +146,9 @@ sealed abstract class Chain[+A] {
}

/**
* Finds the first element of this `Chain` for which the given partial
* function is defined, and applies the partial function to it.
*/
* Finds the first element of this `Chain` for which the given partial
* function is defined, and applies the partial function to it.
*/
final def collectFirst[B](pf: PartialFunction[A, B]): Option[B] = {
var result: Option[B] = None
foreachUntil { a =>
Expand All @@ -162,6 +162,13 @@ sealed abstract class Chain[+A] {
result
}

/**
* Like `collectFirst` from `scala.collection.Traversable` but takes `A => Option[B]`
* instead of `PartialFunction`s.
*/
final def collectFirstSome[B](f: A => Option[B]): Option[B] =
collectFirst(Function.unlift(f))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we reimplement using foreachUntil? The whole purpose of overriding the Foldable one is the performance gain, so I'd rather be sure that it reduces overhead to the minimum.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure that this will give some noticeable performance improvement. I will try to benchmark this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks I am curious to see the benchmark foreachUntil implementation of collectFrist v.s. the vanilla implementation from Foldable using foldRight

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I finally find time for the benchmark.

I made a separate branch for benchmarking. Here is the commit with the code of the benchmark.

I removed overrides from the Foldable instance to compare the performance of the default implementation of collectFirst/collectFirstSome from the Foldable. Also, I added collectFirstSome2 with the specialized version of collectFirstSome, based on foreachUntil and without Function.unlift.

I benchmarked small (size 5), medium (size 100) and large (size 1000000) chains.

I ran benchmarks with the following command: jmh:run -i 10 -wi 10 -f1 -t1 .*ChainCollectFirst.*

Benchmarks

Benchmark                                               Mode  Cnt         Score       Error  Units
ChainCollectFirstBench.collectFirstSmallFoldable       thrpt   10   7299195.442 ± 55930.435  ops/s
ChainCollectFirstBench.collectFirstSmallNew            thrpt   10  19402233.046 ± 77203.919  ops/s

ChainCollectFirstBench.collectFirstMediumFoldable      thrpt   10    349714.002 ±  1802.971  ops/s
ChainCollectFirstBench.collectFirstMediumNew           thrpt   10   1065836.106 ± 11337.181  ops/s

ChainCollectFirstBench.collectFirstLargeFoldable       thrpt   10    377742.008 ±  1213.396  ops/s
ChainCollectFirstBench.collectFirstLargeNew            thrpt   10   1041808.061 ±  8960.744  ops/s

ChainCollectFirstBench.collectFirstSomeSmallFoldable   thrpt   10   7260583.162 ± 66323.933  ops/s
ChainCollectFirstBench.collectFirstSomeSmallNew        thrpt   10  15223500.005 ± 70156.918  ops/s
ChainCollectFirstBench.collectFirstSome2SmallNew       thrpt   10  23795542.657 ± 53002.514  ops/s

ChainCollectFirstBench.collectFirstSomeMediumFoldable  thrpt   10    376093.351 ±  2306.020  ops/s
ChainCollectFirstBench.collectFirstSomeMediumNew       thrpt   10   1250584.212 ±  8875.084  ops/s
ChainCollectFirstBench.collectFirstSome2MediumNew      thrpt   10   1412017.858 ± 13673.273  ops/s

ChainCollectFirstBench.collectFirstSomeLargeFoldable   thrpt   10    392443.540 ±  2350.104  ops/s
ChainCollectFirstBench.collectFirstSomeLargeNew        thrpt   10   1243549.960 ± 13598.336  ops/s
ChainCollectFirstBench.collectFirstSome2LargeNew       thrpt   10   1399553.286 ±  7312.677  ops/s
  • *Foldable is the default implementation from Foldable.
  • *New is new implementation from my PR
  • collectFirstSome2*New is collectFirstSome based on foreachUntil and without Function.unlift

Results

  1. Implementations of collectFirst/collectFirstSome from my PR is few times faster than defaults for any Chain size.
  2. collectFirstSome based on collectFirst and Function.unlift is ~2 times slower than collectFirstSome based on foreachUntil for the small chains.
  3. collectFirstSome based on collectFirst and Function.unlift has almost same performance as collectFirstSome based on foreachUntil for the medium and large chains.

Conclusion

I changed collectFirstSome implementation to foreachUntil. I added a new commit to this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome! thanks so much for this contribution.


/**
* Remove elements not matching the predicate
*/
Expand Down Expand Up @@ -577,6 +584,8 @@ sealed abstract private[data] class ChainInstances extends ChainInstances1 {
override def forall[A](fa: Chain[A])(p: A => Boolean): Boolean = fa.forall(p)
override def find[A](fa: Chain[A])(f: A => Boolean): Option[A] = fa.find(f)
override def size[A](fa: Chain[A]): Long = fa.length
override def collectFirst[A, B](fa: Chain[A])(pf: PartialFunction[A, B]): Option[B] = fa.collectFirst(pf)
override def collectFirstSome[A, B](fa: Chain[A])(f: A => Option[B]): Option[B] = fa.collectFirstSome(f)

def coflatMap[A, B](fa: Chain[A])(f: Chain[A] => B): Chain[B] = {
@tailrec def go(as: Chain[A], res: ListBuffer[B]): Chain[B] =
Expand Down
18 changes: 15 additions & 3 deletions core/src/main/scala/cats/data/NonEmptyChain.scala
Original file line number Diff line number Diff line change
Expand Up @@ -240,11 +240,17 @@ class NonEmptyChainOps[A](private val value: NonEmptyChain[A]) extends AnyVal {
final def collect[B](pf: PartialFunction[A, B]): Chain[B] = toChain.collect(pf)

/**
* Finds the first element of this `NonEmptyChain` for which the given partial
* function is defined, and applies the partial function to it.
*/
* Finds the first element of this `NonEmptyChain` for which the given partial
* function is defined, and applies the partial function to it.
*/
final def collectFirst[B](pf: PartialFunction[A, B]): Option[B] = toChain.collectFirst(pf)

/**
* Like `collectFirst` from `scala.collection.Traversable` but takes `A => Option[B]`
* instead of `PartialFunction`s.
*/
final def collectFirstSome[B](f: A => Option[B]): Option[B] = toChain.collectFirstSome(f)

/**
* Filters all elements of this chain that do not satisfy the given predicate.
*/
Expand Down Expand Up @@ -479,6 +485,12 @@ sealed abstract private[data] class NonEmptyChainInstances extends NonEmptyChain

override def toNonEmptyList[A](fa: NonEmptyChain[A]): NonEmptyList[A] =
fa.toNonEmptyList

override def collectFirst[A, B](fa: NonEmptyChain[A])(pf: PartialFunction[A, B]): Option[B] =
fa.collectFirst(pf)

override def collectFirstSome[A, B](fa: NonEmptyChain[A])(f: A => Option[B]): Option[B] =
fa.collectFirstSome(f)
}

implicit def catsDataOrderForNonEmptyChain[A: Order]: Order[NonEmptyChain[A]] =
Expand Down