Skip to content

Conversation

@m-renaud
Copy link
Collaborator

@m-renaud m-renaud commented Jan 23, 2018

This function is present in the container's Map interface, Map.insertWithKey, this brings more consistency to the HashMap interface so that it can more easily be used as a drop in replacement for Data.Map if desired.

This partially addresses #172.

This function is present in the container's Map interface, this brings more
consistency to the HashMap interface so that it can more easily be used as a
drop in replacement for Data.Map if desired.

This partially addresses haskell-unordered-containers#172.
@m-renaud
Copy link
Collaborator Author

Currently working on benchmarks, there aren't any existing benchmarks for insertWith so I'll be adding those as well.

pInsertWithKey :: Key -> [(Key, Int)] -> Bool
pInsertWithKey k = M.insertWithKey f k 1 `eq_` HM.insertWithKey f k 1
where f (K k') a b = if k' >= 0 then a else b

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither this nor the pre-existing pInsertWith is particularly nice. They both need a lot of thinking to decide if they test enough, and pInsertWith definitely does not. I think we want something like

pInsertWithKey :: Fun (Key, A, A) A -> Key -> A -> [(Key, A)] -> Bool

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated both of them to use Fun.

| otherwise = isBottom $ HM.insertWith f k bottom m

pInsertWithKeyKeyStrict :: (Key -> Int -> Int -> Int) -> Int -> HashMap Key Int -> Bool
pInsertWithKeyKeyStrict f v m = isBottom $ HM.insertWithKey f bottom v m
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Fun (Key, Int, Int) Int.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, although it complained about instances for Key so I used Int as the first param and used unK to pull out the int. I can change to Key and add necessary instances if you'd like.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, thought these comments were for HashMapProperties, changing here as well.


pInsertWithKeyValueStrict :: (Key -> Int -> Int -> Int) -> Key -> Int -> HashMap Key Int
-> Bool
pInsertWithKeyValueStrict f k v m
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same deal here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

We use apply instead of apply2 and apply3 because the latter two are only
available in recent QuickCheck versions. We also use (Int, Int, Int) -> Int
for the function for insertWithKey so we don't need to add additional instances
to Key to work with `Fun`.
Copy link
Collaborator

@treeowl treeowl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should probably use A for the values instead of Int, for clarity.

@treeowl
Copy link
Collaborator

treeowl commented Jan 23, 2018

Missing what instances? Are those instances somewhere else?

@m-renaud
Copy link
Collaborator Author

/home/mrenaud/projects/unordered-containers/tests/HashMapProperties.hs:321:9: error:
    • No instance for (Test.QuickCheck.Function.Function Key)
        arising from a use of ‘testProperty’
    • In the expression: testProperty "insertWithKey" pInsertWithKey
      In the second argument of ‘testGroup’, namely
        ‘[testProperty "size" pSize, testProperty "member" pMember,
          testProperty "lookup" pLookup, testProperty "insert" pInsert,
          ....]’
      In the expression:
        testGroup
          "basic interface"
          [testProperty "size" pSize, testProperty "member" pMember,
           testProperty "lookup" pLookup, testProperty "insert" pInsert, ....]

I may be using Fun wrong, but that's the error when changing the first tuple element from Int to Key. I'll keep playing around with it.

@treeowl
Copy link
Collaborator

treeowl commented Jan 23, 2018 via email

@treeowl
Copy link
Collaborator

treeowl commented Jan 23, 2018 via email

@m-renaud
Copy link
Collaborator Author

Instance issue fixed, the insertWithKey value strictness check is failing though, I'll have to look more closely.

@treeowl
Copy link
Collaborator

treeowl commented Jan 23, 2018 via email

@treeowl
Copy link
Collaborator

treeowl commented Jan 23, 2018 via email

TODO: Fix pInsertWithKeyStrict.

[ci skip]
@m-renaud
Copy link
Collaborator Author

You can't double curry like that--this is Haskell, not Idris!

:(

I'm having issues with dependency solving when changing the lower bound. I'll try again tomorrow.

@treeowl
Copy link
Collaborator

treeowl commented Jan 23, 2018 via email

This function is only available in new versions of QuickCheck which we may not
have for old GHC versions. This inlines the function here for portability.

Skipping CI since there's still a known failure (pInsertWithKeyValueStrict).

[ci skip]
insertWith is implemented using insertWithKey so other than noise they should
have similar results.

benchmarking HashMap/insertWith/String
time                 1.932 ms   (1.771 ms .. 2.124 ms)
                     0.954 R²   (0.923 R² .. 0.983 R²)
mean                 1.750 ms   (1.664 ms .. 1.872 ms)
std dev              301.9 μs   (227.4 μs .. 410.5 μs)
variance introduced by outliers: 87% (severely inflated)

benchmarking HashMap/insertWith/ByteString
time                 2.272 ms   (2.194 ms .. 2.347 ms)
                     0.978 R²   (0.951 R² .. 0.994 R²)
mean                 1.840 ms   (1.744 ms .. 1.962 ms)
std dev              331.1 μs   (266.5 μs .. 486.0 μs)
variance introduced by outliers: 88% (severely inflated)

benchmarking HashMap/insertWith/Int
time                 1.321 ms   (1.241 ms .. 1.454 ms)
                     0.955 R²   (0.932 R² .. 0.976 R²)
mean                 1.365 ms   (1.316 ms .. 1.420 ms)
std dev              178.3 μs   (150.3 μs .. 230.6 μs)
variance introduced by outliers: 81% (severely inflated)

benchmarking HashMap/insertWithKey/String
time                 1.783 ms   (1.714 ms .. 1.900 ms)
                     0.959 R²   (0.934 R² .. 0.983 R²)
mean                 1.614 ms   (1.526 ms .. 1.747 ms)
std dev              289.4 μs   (217.9 μs .. 372.9 μs)
variance introduced by outliers: 89% (severely inflated)

benchmarking HashMap/insertWithKey/ByteString
time                 1.647 ms   (1.605 ms .. 1.688 ms)
                     0.992 R²   (0.983 R² .. 0.997 R²)
mean                 1.563 ms   (1.524 ms .. 1.644 ms)
std dev              164.4 μs   (85.16 μs .. 308.3 μs)
variance introduced by outliers: 71% (severely inflated)

benchmarking HashMap/insertWithKey/Int
time                 1.313 ms   (1.289 ms .. 1.335 ms)
                     0.994 R²   (0.990 R² .. 0.997 R²)
mean                 1.289 ms   (1.270 ms .. 1.338 ms)
std dev              83.74 μs   (50.75 μs .. 157.2 μs)
variance introduced by outliers: 49% (moderately inflated)
Skipping CI, known test failure being investigated.

[ci skip]
@m-renaud
Copy link
Collaborator Author

Still getting this strictness property failure:

  insertWithKey is value-strict: [Failed]
*** Failed! Falsifiable (after 1 test and 3 shrinks): 
{_->0}
K {unK = 0}
0
fromList []

It's weird because insertWith uses the same implementation but those tests pass :/

@m-renaud
Copy link
Collaborator Author

m-renaud commented Jan 23, 2018

Oh duh, the Strict module has its own implementations of insertWithKey, it doesn't re-export the Base implementation.

@treeowl
Copy link
Collaborator

treeowl commented Jan 23, 2018

You still need to benchmark old vs. new implementations of insertWith (particularly the strict versions) if you want to change that. Or you can just tolerate code duplication.

@treeowl
Copy link
Collaborator

treeowl commented Jan 23, 2018

But benchmarking is definitely preferred.

@m-renaud
Copy link
Collaborator Author

Old implementation (code duplication):

    benchmarking HashMap/insertWith/String
    time                 2.124 ms   (2.065 ms .. 2.187 ms)
                         0.990 R²   (0.984 R² .. 0.994 R²)
    mean                 1.852 ms   (1.778 ms .. 1.924 ms)
    std dev              222.5 μs   (191.1 μs .. 266.3 μs)
    variance introduced by outliers: 75% (severely inflated)
    
    benchmarking HashMap/insertWith/ByteString
    time                 2.396 ms   (2.237 ms .. 2.554 ms)
                         0.975 R²   (0.966 R² .. 0.985 R²)
    mean                 2.031 ms   (1.922 ms .. 2.140 ms)
    std dev              331.9 μs   (264.6 μs .. 469.7 μs)
    variance introduced by outliers: 84% (severely inflated)
    
    benchmarking HashMap/insertWith/Int
    time                 1.252 ms   (1.162 ms .. 1.347 ms)
                         0.976 R²   (0.963 R² .. 0.993 R²)
    mean                 1.137 ms   (1.108 ms .. 1.178 ms)
    std dev              103.3 μs   (73.73 μs .. 165.2 μs)
    variance introduced by outliers: 67% (severely inflated)

New implementation (in terms of insertWithKey):

    benchmarking HashMap/insertWith/String
    time                 1.932 ms   (1.771 ms .. 2.124 ms)
                         0.954 R²   (0.923 R² .. 0.983 R²)
    mean                 1.750 ms   (1.664 ms .. 1.872 ms)
    std dev              301.9 μs   (227.4 μs .. 410.5 μs)
    variance introduced by outliers: 87% (severely inflated)
    
    benchmarking HashMap/insertWith/ByteString
    time                 2.272 ms   (2.194 ms .. 2.347 ms)
                         0.978 R²   (0.951 R² .. 0.994 R²)
    mean                 1.840 ms   (1.744 ms .. 1.962 ms)
    std dev              331.1 μs   (266.5 μs .. 486.0 μs)
    variance introduced by outliers: 88% (severely inflated)
    
    benchmarking HashMap/insertWith/Int
    time                 1.321 ms   (1.241 ms .. 1.454 ms)
                         0.955 R²   (0.932 R² .. 0.976 R²)
    mean                 1.365 ms   (1.316 ms .. 1.420 ms)
    std dev              178.3 μs   (150.3 μs .. 230.6 μs)
    variance introduced by outliers: 81% (severely inflated)

The differences appear to be due to noise.

@treeowl
Copy link
Collaborator

treeowl commented Jan 23, 2018 via email

@m-renaud
Copy link
Collaborator Author

Can you try applying less noise?

Could you elaborate? I'm not sure exactly how I would do that, I've only used criterion a few times in the past. Is there an option to increase the number of runs? I went through the tutorial and some docs but didn't find anything.

@treeowl
Copy link
Collaborator

treeowl commented Jan 23, 2018 via email

@treeowl
Copy link
Collaborator

treeowl commented Jan 23, 2018 via email

@m-renaud
Copy link
Collaborator Author

I mostly meant using a quieter system

Haha, gotcha :P

@m-renaud
Copy link
Collaborator Author

m-renaud commented Jan 23, 2018

Updated benchmark results. TL;DR; no difference.

Old (code duplication):

benchmarking HashMap/insertWith/String
time                 1.635 ms   (1.604 ms .. 1.665 ms)
                     0.996 R²   (0.994 R² .. 0.998 R²)
mean                 1.663 ms   (1.636 ms .. 1.695 ms)
std dev              102.2 μs   (81.08 μs .. 130.7 μs)
variance introduced by outliers: 46% (moderately inflated)
             
benchmarking HashMap/insertWith/ByteString
time                 1.634 ms   (1.610 ms .. 1.662 ms)
                     0.997 R²   (0.995 R² .. 0.998 R²)
mean                 1.643 ms   (1.614 ms .. 1.666 ms)
std dev              90.85 μs   (73.72 μs .. 110.7 μs)
variance introduced by outliers: 41% (moderately inflated)
             
benchmarking HashMap/insertWith/Int
time                 1.060 ms   (1.052 ms .. 1.069 ms)
                     0.999 R²   (0.998 R² .. 0.999 R²)
mean                 1.063 ms   (1.054 ms .. 1.076 ms)
std dev              38.48 μs   (29.12 μs .. 54.26 μs)
variance introduced by outliers: 25% (moderately inflated)

New (implemented via insertWithKey):

benchmarking HashMap/insertWith/String
time                 1.687 ms   (1.641 ms .. 1.729 ms)
                     0.994 R²   (0.990 R² .. 0.997 R²)
mean                 1.716 ms   (1.681 ms .. 1.748 ms)
std dev              112.6 μs   (91.79 μs .. 141.6 μs)
variance introduced by outliers: 49% (moderately inflated)
             
benchmarking HashMap/insertWith/ByteString
time                 1.649 ms   (1.621 ms .. 1.679 ms)
                     0.997 R²   (0.996 R² .. 0.998 R²)
mean                 1.650 ms   (1.628 ms .. 1.672 ms)
std dev              69.74 μs   (57.69 μs .. 87.52 μs)
variance introduced by outliers: 28% (moderately inflated)
             
benchmarking HashMap/insertWith/Int
time                 1.081 ms   (1.072 ms .. 1.090 ms)
                     0.999 R²   (0.998 R² .. 0.999 R²)
mean                 1.084 ms   (1.073 ms .. 1.095 ms)
std dev              37.65 μs   (30.82 μs .. 47.29 μs)
variance introduced by outliers: 23% (moderately inflated)

@treeowl
Copy link
Collaborator

treeowl commented Jan 25, 2018 via email

@m-renaud
Copy link
Collaborator Author

m-renaud commented Jan 25, 2018

sorry to do this to you after all that work

Haha, no worries :)

doesn't deserve to exist in containers either

I'm assuming because you can just write your function and partially apply the key?

Map.insertWith (f k) k v m
  where f k a b = ...

I was kinda wondering why this was its own function since it didn't seem to do any internal optimization.

@treeowl
Copy link
Collaborator

treeowl commented Jan 25, 2018 via email

@m-renaud
Copy link
Collaborator Author

I completely agree. Can we deprecate and remove the function from the containers API?!

@treeowl
Copy link
Collaborator

treeowl commented Jan 25, 2018 via email

@m-renaud
Copy link
Collaborator Author

m-renaud commented Jan 4, 2019

Closing this since we instead plan to deprecate this function in containers.

@m-renaud m-renaud closed this Jan 4, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants