Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Chain#distinct and NonEmptyChain#distinct to match List#distinct and NonEmptyList#distinct #2579

Merged
merged 3 commits into from
Nov 6, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 19 additions & 0 deletions core/src/main/scala/cats/data/Chain.scala
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ import Chain._

import scala.annotation.tailrec
import scala.collection.immutable.SortedMap
import scala.collection.immutable.TreeSet
import scala.collection.mutable.ListBuffer

/**
Expand Down Expand Up @@ -358,6 +359,24 @@ sealed abstract class Chain[+A] {
iterX.hasNext == iterY.hasNext
}

/**
* Remove duplicates. Duplicates are checked using `Order[_]` instance.
*/
def distinct[AA >: A](implicit O: Order[AA]): Chain[AA] = {
implicit val ord = O.toOrdering

var alreadyIn = TreeSet.empty[AA]

foldLeft(Chain.empty[AA]) { (elementsSoFar, b) =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question of curiosity. Why are you using TreeSet instead of pure scala's Set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done for symmetry with NonEmptyList#distinct but it's a good question. Set uses a HashSet with sets bigger than 4 elements. HashSet#contains has O(1) and TreeSet#contains has O(log(n)).

Changing Set would also remove the artificial dependency on Order. Was there a reason for this dependency when NonEmptyList#distinct was introduced?

Copy link
Contributor

@nasadorian nasadorian Nov 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote up a functional set implementation that should speed up distinct in the future and let it depend on Eq instead of Order. Closed issue here, was suggested to put up a PR on cats-collection instead. #2593

EDIT: Seems like this structure exists already in cats-collections, called ISet.

if (alreadyIn.contains(b)) {
elementsSoFar
} else {
alreadyIn += b
elementsSoFar :+ b
}
}
}

def show[AA >: A](implicit AA: Show[AA]): String = {
val builder = new StringBuilder("Chain(")
var first = true
Expand Down
18 changes: 18 additions & 0 deletions core/src/main/scala/cats/data/NonEmptyChain.scala
Original file line number Diff line number Diff line change
Expand Up @@ -372,6 +372,24 @@ class NonEmptyChainOps[A](val value: NonEmptyChain[A]) extends AnyVal {
/** Reverses this `NonEmptyChain` */
final def reverse: NonEmptyChain[A] =
create(toChain.reverse)

/**
* Remove duplicates. Duplicates are checked using `Order[_]` instance.
*/
final def distinct[AA >: A](implicit O: Order[AA]): NonEmptyChain[AA] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder. Dont you think that it would be better to to call toChain and then distinct on chain and then rebuild NonEmptyChain? It would remove duplication but I'm not sure if it will still be ok with performance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other way around would be cheaper, something along the lines of NonEmptyChain.fromChain(this).fold(empty[A])(_.distinct.toChain)

implicit val ord = O.toOrdering

var alreadyIn = TreeSet(head: AA)

foldLeft(NonEmptyChain(head: AA)) { (elementsSoFar, b) =>
if (alreadyIn.contains(b)) {
elementsSoFar
} else {
alreadyIn += b
elementsSoFar :+ b
}
}
}
}

sealed abstract private[data] class NonEmptyChainInstances extends NonEmptyChainInstances1 {
Expand Down
6 changes: 6 additions & 0 deletions tests/src/test/scala/cats/tests/ChainSuite.scala
Original file line number Diff line number Diff line change
Expand Up @@ -174,4 +174,10 @@ class ChainSuite extends CatsSuite {
}
}
}

test("Chain#distinct is consistent with List#distinct") {
forAll { a: Chain[Int] =>
a.distinct.toList should ===(a.toList.distinct)
}
}
}
6 changes: 6 additions & 0 deletions tests/src/test/scala/cats/tests/NonEmptyChainSuite.scala
Original file line number Diff line number Diff line change
Expand Up @@ -133,4 +133,10 @@ class NonEmptyChainSuite extends CatsSuite {
ci.reverse.toChain should ===(ci.toChain.reverse)
}
}

test("NonEmptyChain#distinct is consistent with List#distinct") {
forAll { ci: NonEmptyChain[Int] =>
ci.distinct.toList should ===(ci.toList.distinct)
}
}
}