Skip to content

Add difference for Map #5

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 4, 2018
Merged

Conversation

mschristiansen
Copy link
Contributor

  • Data.Map.difference
  • QuickCheck tests

log "difference"
quickCheck $ \(TestMap m1) (TestMap m2) ->
let d = M.difference m1 m2 :: M.Map SmallKey Int
in and (map (\k -> M.member k d) (A.fromFoldable $ M.keys d)) &&

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you meant to write m1 instead of the last d on this line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Fixed it. Want me to squash and force push or fine like this?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry, that should indeed have been "first d".

I don't have the commit bit on this repo, but I'd assume someone would just hit the squash button.

I just added difference to -unordered-collections, fwiw.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a look at the implementation in Haskell (below). There they obviously walk the tree, which I guess is far more efficient. Implementation in this PR follow the style of the other functions. Any thoughts on performance?

difference :: Ord k => Map k a -> Map k b -> Map k a
difference Tip _   = Tip
difference t1 Tip  = t1
difference t1 (Bin _ k _ l2 r2) = case split k t1 of
  (l1, r1)
    | size l1l2 + size r1r2 == size t1 -> t1
    | otherwise -> link2 l1l2 r1r2
    where
      !l1l2 = difference l1 l2
      !r1r2 = difference r1 r2

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would certainly be a lot faster. Repeatedly calling delete does a lot of duplicate work. Traversing the trees in parallel allows you to avoid retraversing parts of the tree while searching for elements to remove, avoid allocating upper nodes that will be replaced by a later deletion, and avoid looking for elements that can't possibly be there (like {1..100} \ {200..300}).

On the other hand, O(#right * log(#left)) is not terrible.

You probably know all of this.

So sure, it would be nice if it was faster, but IMHO that should not block merging.
...not that I have anything to say on the matter...

@@ -612,6 +613,11 @@ union = unionWith const
unions :: forall k v f. Ord k => Foldable f => f (Map k v) -> Map k v
unions = foldl union empty

-- | Difference of two maps. Return elements of the first map where
-- | the keys do not exist in the second map.
difference :: forall k v. Ord k => Map k v -> Map k v -> Map k v
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the values of the second map aren't used then this could be typed as:
forall k v w . Ord k => Map k v -> Map k w -> Map k v

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But should it be? I think the type as it is makes sense for a binary minus operator.
Then you can't by accident subtract a Map String Int from a Map String String, holes can help you find a matching pair, etc.

Someone previously made a similar choice for unionWith :: forall k v. Ord k => (v -> v -> v) -> Map k v -> Map k v -> Map k v which could well be unionWith :: forall k a b c. Ord k => (a -> b -> c) -> Map k a -> Map k b -> Map k c.

(And I think the case for generalizing unionWith is much stronger.)

@mschristiansen
Copy link
Contributor Author

I can't think of a case where I have used Map.difference with two maps with different value types, but can see that there could be such cases.

Also checked Haskell's containers and unordered-containers. They both use the more general type definition. E.g. Data.Map.Strict.difference

@mschristiansen
Copy link
Contributor Author

@LiamGoodacre Good on this one?

Copy link
Member

@LiamGoodacre LiamGoodacre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@mschristiansen
Copy link
Contributor Author

@garyb please merge if things are looking ok to you.

@mschristiansen
Copy link
Contributor Author

@fehrenbach merge?

@fehrenbach
Copy link

I don't have the power!

(If you don't need your maps to be ordered, difference for HashMap is in v1.3.0 of https://github.com/fehrenbach/purescript-unordered-collections)

@mschristiansen
Copy link
Contributor Author

This PR has been open for a 1.5 months! Very frustrating for me. What must I do?

@garyb garyb merged commit 5b21750 into purescript:master Oct 4, 2018
@garyb
Copy link
Member

garyb commented Oct 4, 2018

Thanks for working on this, and sorry it took so long to get merged, but please consider that none of us have unlimited time or are paid to do this or anything. Phrasing things the way you did in the last message is hardly the best way to make us inclined to do something, just a polite bump would have sufficed.

@mschristiansen mschristiansen deleted the map-difference branch October 4, 2018 17:55
@mschristiansen
Copy link
Contributor Author

@garyb Thanks for merging. You'll see above that I have tried several times to get this merged. Notified you directly 13 days ago and I have another PR on purescript-datetime that's also been sitting there long.

I know you and a few others are doing a lot of good work and there isn't quite enough of you to cover the number of libraries. Maybe give @fehrenbach the commit bit on this repo then the PR would have been merged the day after I submitted, and I would be happy to help as a reviewer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants