Optimize Mutable.nextPermutation and add {next/prev}permutation(By) #498

gksato · 2024-07-21T13:46:37Z

This implements some optimization of nextPermutation from Data.Vector.Generic.Mutable, and supercedes #497. The main content of this re-implementation is the following two points:

Wrapping the whole implementation in stToPrim. This allows the compiler to optimize the code better.
When finding the rightmost increasing pair v[k]<v[k+1], we now search from the right, instead of from the left. This allows us to abort the search as soon as we find such a pair, giving average-case constant performance, instead of best-case linear in the previous implementation.

Also, this adds some clarification on doc comments; nextPermutation does not update the source vector when the given vector is the last permutation.

Update

This also adds the following API:

Data.Vector.*.Mutable.nextPermutationBy :: Constraint => (e -> e -> Ordering) -> v (PrimState m) a -> m Bool
Data.Vector.*.Mutable.prevPermutation :: (Constraint, Ord e) => v (PrimState m) a -> m Bool
Data.Vector.*.Mutable.prevPermutationBy :: Constraint => (e -> e -> Ordering) -> v (PrimState m) a -> m Bool

Shimuuar · 2024-07-21T15:17:10Z

Thanks!

I'll look into this tomorrow but at a glance nextPermutationByLt miss INLINE pragma which means it won't be specialized. Is this intentional?

gksato · 2024-07-21T15:19:32Z

I just aligned it with the original implementation; I assumed it was intentional. Is it better to add it before your review?

Shimuuar · 2024-07-21T15:23:35Z

I just aligned it with the original implementation

I see. I missed that somehow.

Is it better to add it before your review?

No need. I'll benchmark it and going to tweak code a bit anyway

Shimuuar · 2024-07-26T20:03:33Z

Yes. This is an optimization. I used very simple benchmark

permute :: Int -> Int -> IO (Vector Int)
permute sz n_it = do
  vec <- MV.generate sz id
  replicateM_ n_it $ MV.nextPermutation vec
  V.unsafeFreeze vec

With vector size 20 and n_it varying in range 1000-10000. See Shimuuar/vector branch bench-next-permutation for very quick and dirty code.

Benchmarks

I've run five benchmarks:

[00.Base] Baseline from current master
[01.strict] Added strictness. First change from make nextPermutation faster #497
[02.stToPrim] Added stToPrim
[03.INLINE] Added INLINE pragma
[04.gksato] Version from this PR

So for unboxed vector of ints stToPrim gives ~4× runtime speedup and your optimization another 3×. Of course inlined and specialized version beats them all.

Do you plan to include functions proposed in #499 in this PR?

gksato · 2024-07-26T23:15:24Z

Do you plan to include functions proposed in #499 in this PR?

I was not sure if it was appropriate to put optimization and API addition in one PR, but I'm happy to do it if you propose so! Should CHANGELOG be added now?

Clarified that `Data.Vector.*.Mutable.nextPermutation` does not update the vector when the original state is the last permutation.

This implements some optimization of `nextPermutation` from `Data.Vector.Generic.Mutable`. The main content of this re-implementation is the following two points: 1. Wrapping the whole implementation in `stToPrim`. This allows the compiler to optimize the code better. 2. When finding the rightmost increasing pair v[k]<v[k+1], we now search from the right, instead of from the left. This allows us to abort the search as soon as we find such a pair, giving average-case constant performance, instead of best-case linear in the previous implementation.

This adds the following three companions to the already existing `Data.Vector.*.Mutable.nextPermutation`: - `Data.Vector.*.Mutable.nextPermutationBy` - `Data.Vector.*.Mutable.prevPermutation` - `Data.Vector.*.Mutable.prevPermutationBy` They are all implemented in terms of the already existing `Data.Vector.Generic.Mutable.nextPermutationLt`.

gksato · 2024-07-27T06:40:51Z

Since I forgot to tweak the comments in Data.Vector.*.Mutablein 4ac750f, I'm going to force-push an updated history.

The function `Data.Vector.Generic.Mutable.nextPermutationByLt` is the unified implementation for the family of functions `Data.Vector.*.Mutable.{next,prev}Permutation{,By}`. By adding INLINE pragma to it, we may expect the some performance gain from specialization.

gksato · 2024-07-27T07:30:30Z

I added INLINE to the internal routine nextPermutationByLt, because cabal v2-bench with a line of bench "permute" $ whnfIO 20 useSize proved it 10x faster with INLINE (I haven't tweaked vector-bench-papi, because I'm on MacOS).

gksato · 2024-07-27T11:12:15Z

Oh, and I added:

Data.Vector.*.Mutable.nextPermutationBy :: Constraint => (e -> e -> Ordering) -> v (PrimState m) a -> m Bool
Data.Vector.*.Mutable.prevPermutation :: (Constraint, Ord e) => v (PrimState m) a -> m Bool
Data.Vector.*.Mutable.prevPermutationBy :: Constraint => (e -> e -> Ordering) -> v (PrimState m) a -> m Bool

The TODOs we have now are... tests, benchmarks, and changelog? Are they all?

This adds the following three tests for the pair of functions `Data.Vector.*.Mutable.{next/prev}Permutation` in `vector/tests/Tests/Move.hs`: 1. `testRevPermutations`: For n=1,..,7, repeatedly applying `prevPermutation` to a vector `[n,n-1..1]` produces all n! permutations of the vector in reverse order, and applying the function to the lexicographically smallest permutation doesn't change the vector. 2. `testNPPermutationsIsId`: Applying `nextPermutation` followed by `prevPermutation` to a vector produces the original vector. Note that this function uses modified versions of `nextPermutation` and `prevPermutation` that reverse the vector if the original function returns `False`, rendering those two functions bijective. 3. `testPNPermutationsIsId`: Applying `prevPermutation` followed by `nextPermutation` to a vector produces the original vector. The same caveat as above applies here.

gksato · 2024-08-03T06:29:24Z

I've written some tests on a separate branch (gksato:optimize-nextperm-tidyup), since I don't know if it is wanted. I'd fast-forward the branch (gksato:optimize-nextperm) of this pull request to include that commit whenever any of you request it!

Shimuuar · 2024-08-17T15:09:11Z

I finally found time to review PR. Please add tests and this PR is good to go. It would be nice to add changelog entry as well and whether you want to add benchmarks or not is up to you.

gksato · 2024-08-17T15:21:10Z

@Shimuuar OK, I'll do that soon. I've been writing some benchmark, so I might add it if implementation doesn't take much time. For the purpose of the changelog and @since pragma, what should be the version? I assume 0.13.2.0, but I'd like to be sure that it'll not be 0.13.1.1?

Shimuuar · 2024-08-17T15:50:12Z

Yes 0.13.2.0. Adding new definitions to module requires at least minor bump according to PVP. Patch releases are for bugfixes and documentation

This adds a changelog entry and `@since` annotations for: - Optimization of `Data.Vector.Generic.Mutable.nextPermutation` - Addition of `Data.Vector.Generic.Mutable.prevPermutation(By)` - Addition of `Data.Vector.Generic.Mutable.nextPermutationBy` This also tweaks the haddock comments of the functions for `Data.Vector.*.Mutable.(next|prev)Permutation(By)?` for a better readability.

Implement benchmarks to test performance of nextPermutation and prevPermutation on mutable vectors. Tests include: - Looping through all permutations on small vectors - Applying bijective versions n times on: - Ascending permutations of size n - Descending permutations of size n - Random permutations of size n - For a baseline, copying a vector of size n once. Benchmarks for bijective permutations begins with such a copy, and you might want to remove the impact of copying from the results. Benchmarks cover both forward (next) and reverse (prev) operations.

gksato · 2024-08-18T01:56:07Z

Added tests, a changelog entry, @since annotations, and benchmarks. Force-pushed some commit message rewording because @since without quotes refers to someone's name on a GitHub!

Note that vector-bench-papi hasn't been updated, because they say it doesn't go nicely on MacOS (I haven't tried it myself). However, I assume that applying the same diff as vector/benchmarks/Main.hs to vector-bench-papi/benchmarks/Main.hs should work.

Shimuuar · 2024-08-19T18:53:42Z

That's quite formidable battery of tests all right! It's more thorough than rest of benchmark suite :)

I'm unhappy with inlining of worker function. It's too large. But we're running into GHC's limitations since we don't way to ask GHC to generate specializations for it and its callers automatically.

Excellent pull request. Thank you!

Note that vector-bench-papi hasn't been updated

Yes ..-papi package should modified same way. And that's the point when you start to want to use backpack in order to be able to swap bencmarking engine. I implemented poor man's version in math-functions. Sadly backpack never caught on

P.S. @since. That poor guy. He must hate haskell.

gksato · 2024-08-19T22:24:06Z

I appreciate your work on review and merge, thank you!

Oh, yes. I haven't noticed this would of course be a nice use case for backpack. I imagined what if we try to comply DRY without backpack... urgh, it's a mess!

Sadly backpack never caught on

I truly agree with you.

toyboot4e mentioned this pull request Jul 21, 2024

make nextPermutation faster #497

Closed

gksato mentioned this pull request Jul 21, 2024

Feat: add prevPermutation, nextPermutationBy and a version of nextPermutation that goes back to the first perm? #499

Open

gksato added 3 commits July 27, 2024 14:51

doc refactor: Mutable.nextPermutation

310cb7c

Clarified that `Data.Vector.*.Mutable.nextPermutation` does not update the vector when the original state is the last permutation.

gksato force-pushed the optimize-nextperm branch from 272fa34 to 063c8ef Compare July 27, 2024 07:24

gksato changed the title ~~Optimize Data.Vector.Generic.Mutable.nextPermutation~~ Optimize Mutable.nextPermutation and add {next/prev}permutation(By) Jul 27, 2024

gksato added 2 commits August 18, 2024 10:48

gksato force-pushed the optimize-nextperm branch from 8232899 to 7415257 Compare August 18, 2024 01:49

Shimuuar merged commit eb60526 into haskell:master Aug 19, 2024
22 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize Mutable.nextPermutation and add {next/prev}permutation(By) #498

Optimize Mutable.nextPermutation and add {next/prev}permutation(By) #498

gksato commented Jul 21, 2024 •

edited

Loading

Shimuuar commented Jul 21, 2024

gksato commented Jul 21, 2024

Shimuuar commented Jul 21, 2024

Shimuuar commented Jul 26, 2024

gksato commented Jul 26, 2024

gksato commented Jul 27, 2024 •

edited

Loading

gksato commented Jul 27, 2024 •

edited

Loading

gksato commented Jul 27, 2024

gksato commented Aug 3, 2024 •

edited

Loading

Shimuuar commented Aug 17, 2024

gksato commented Aug 17, 2024 •

edited

Loading

Shimuuar commented Aug 17, 2024

gksato commented Aug 18, 2024

Shimuuar commented Aug 19, 2024

gksato commented Aug 19, 2024

Optimize Mutable.nextPermutation and add {next/prev}permutation(By) #498

Optimize Mutable.nextPermutation and add {next/prev}permutation(By) #498

Conversation

gksato commented Jul 21, 2024 • edited Loading

Update

Shimuuar commented Jul 21, 2024

gksato commented Jul 21, 2024

Shimuuar commented Jul 21, 2024

Shimuuar commented Jul 26, 2024

Benchmarks

gksato commented Jul 26, 2024

gksato commented Jul 27, 2024 • edited Loading

gksato commented Jul 27, 2024 • edited Loading

gksato commented Jul 27, 2024

gksato commented Aug 3, 2024 • edited Loading

Shimuuar commented Aug 17, 2024

gksato commented Aug 17, 2024 • edited Loading

Shimuuar commented Aug 17, 2024

gksato commented Aug 18, 2024

Shimuuar commented Aug 19, 2024

gksato commented Aug 19, 2024

gksato commented Jul 21, 2024 •

edited

Loading

gksato commented Jul 27, 2024 •

edited

Loading

gksato commented Jul 27, 2024 •

edited

Loading

gksato commented Aug 3, 2024 •

edited

Loading

gksato commented Aug 17, 2024 •

edited

Loading