Tests and benchmarks for Data.Graph #883

2 commits merged on Dec 19, 2022

meooow25
Contributor

Tests and benchmarks for the most useful functions in Data.Graph.

This will help evaluate changes like #882.

Benchmarks look like:

  buildG:            OK (0.72s)
    1.32 ms ±  71 μs, 3.0 MB allocated, 984 KB copied,  28 MB peak memory
  graphFromEdges:    OK (0.24s)
    33.0 ms ± 2.6 ms,  74 MB allocated, 7.1 MB copied,  28 MB peak memory
  dfs:               OK (0.32s)
    9.88 ms ± 390 μs,  17 MB allocated,  10 MB copied,  28 MB peak memory
  dff:               OK (0.61s)
    9.41 ms ± 276 μs,  20 MB allocated,  11 MB copied,  28 MB peak memory
  topSort:           OK (0.32s)
    9.65 ms ± 406 μs,  20 MB allocated,  12 MB copied,  28 MB peak memory
  scc:               OK (0.16s)
    9.47 ms ± 730 μs,  20 MB allocated,  12 MB copied,  28 MB peak memory
  bcc_small:         OK (0.52s)
    15.7 ms ± 529 μs,  29 MB allocated,  19 MB copied,  46 MB peak memory
  stronglyConnCompR: OK (0.52s)
    72.8 ms ± 1.6 ms, 133 MB allocated,  42 MB copied,  46 MB peak memory
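
For readers unfamiliar with the functions being benchmarked, here is a minimal sketch of what a few of them compute on a tiny hand-written graph. The graph and the names `example`, `reachableFrom0`, `components`, and `keyed` are made up for illustration and are not taken from the PR.

```haskell
import qualified Data.Graph as G
import Data.Tree (flatten)

-- Hypothetical example graph on vertices 0..4:
-- 0 -> 1 -> 2 -> 0 form a cycle, 2 -> 3, and 4 is isolated.
example :: G.Graph
example = G.buildG (0, 4) [(0, 1), (1, 2), (2, 0), (2, 3)]

-- Vertices reachable from vertex 0, collected from the depth-first forest.
reachableFrom0 :: [G.Vertex]
reachableFrom0 = concatMap flatten (G.dfs example [0])

-- Strongly connected components, each flattened to a list of vertices.
components :: [[G.Vertex]]
components = map flatten (G.scc example)

-- stronglyConnCompR works on (node, key, [neighbour keys]) triples
-- instead of a prebuilt Graph.
keyed :: [G.SCC (String, Int, [Int])]
keyed = G.stronglyConnCompR [("a", 1, [2]), ("b", 2, [1]), ("c", 3, [])]
```

`dff` and `topSort` take the same `G.Graph` argument, so they can be tried on `example` in the same way.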

@treeowl
Contributor

treeowl commented Dec 19, 2022

W00t! Ping me when this passes CI so I can merge.

-- so we can keep things simple and run them on random graphs in benchmarks.

-- A graph with vertices [1..n] and m random edges.
buildRandomGraph :: Int -> Int -> G.Graph
Contributor

Should we have several random and several less-random graphs, and run all benchmarks on each? I definitely won't hold up the PR for that (some benchmarks are better than none), but it might be good to think about for the future.

Contributor Author

I could add a couple more sets of differently sized random graphs. I'm thinking n=100,m=1000, and perhaps n=100,m=10000 for a higher m/n ratio.
When it comes to non-random graphs, there are so many choices that I have no idea what we should include: complete graphs, maybe some preferred notion of dense/sparse graphs, trees (of which there are several subtypes), etc.
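
As a rough illustration of the random-graph idea, the sketch below builds a graph with vertices [1..n] and m pseudo-random edges using only `containers`. It is not the PR's implementation: the generator `lcgStream` is a hypothetical stand-in (a simple linear congruential generator), and only the signature of `buildRandomGraph` comes from the diff.

```haskell
import qualified Data.Graph as G

-- Hypothetical helper: an infinite stream of pseudo-random Ints in
-- [0, 2^31) from a linear congruential generator. An assumption for this
-- sketch, not the generator used in the PR. Assumes a 64-bit Int and a
-- non-negative seed.
lcgStream :: Int -> [Int]
lcgStream seed = iterate step (step seed)
  where
    step x = (x * 1103515245 + 12345) `mod` 2147483648

-- A graph with vertices [1..n] and m pseudo-random edges. Self-loops and
-- duplicate edges are allowed. Requires n >= 1.
buildRandomGraph :: Int -> Int -> G.Graph
buildRandomGraph n m = G.buildG (1, n) (take m (pairs vs))
  where
    -- Drop the low 16 bits, which have short periods in an LCG.
    vs = map (\x -> (x `div` 65536) `mod` n + 1) (lcgStream (n + m))
    pairs (u : v : rest) = (u, v) : pairs rest
    pairs _ = []

-- The (n, m) sizes discussed in this thread.
sizes :: [(Int, Int)]
sizes = [(100, 1000), (100, 10000), (10000, 100000)]
```

Deriving the seed from n and m keeps each graph deterministic across runs, which keeps benchmark results comparable between runs.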

Contributor

I think those are all good ideas. They'll give various extremes likely to be sensitive to various different perf changes.
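
A few of the non-random shapes mentioned above could be generated along these lines. The function names are hypothetical and none of this is part of the PR; it is only a sketch of how such inputs might be built with `buildG`.

```haskell
import qualified Data.Graph as G

-- Complete directed graph on [1..n]: one edge for every ordered pair of
-- distinct vertices.
completeGraph :: Int -> G.Graph
completeGraph n =
  G.buildG (1, n) [(u, v) | u <- [1 .. n], v <- [1 .. n], u /= v]

-- Path 1 -> 2 -> ... -> n, an extreme case for depth-first search depth.
pathGraph :: Int -> G.Graph
pathGraph n = G.buildG (1, n) [(u, u + 1) | u <- [1 .. n - 1]]

-- Heap-shaped binary tree on [1..n]: vertex v points to its children
-- 2v and 2v+1 whenever they exist.
binaryTree :: Int -> G.Graph
binaryTree n =
  G.buildG (1, n) [(v, c) | v <- [1 .. n], c <- [2 * v, 2 * v + 1], c <= n]
```

The complete graph stresses edge handling (n * (n - 1) edges), while the path and tree have only n - 1 edges but very different search depths.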

@jwaldmann
Contributor

Would this affect compilation times in GHC? I found one usage of dfs, at https://gitlab.haskell.org/ghc/ghc/-/blob/master/compiler/GHC/Data/Graph/Directed.hs#L456 (I assume it is indeed dfs from this library here)

@treeowl
Contributor

treeowl commented Dec 19, 2022

I can't remember if Cabal uses some of this too. Benchmarks based on real applications would be extra cool.

@meooow25
Contributor Author

Added more random graphs.

> Would this affect compilation times in GHC?

I'm assuming you mean the linked issue about optimization.
It would be affected, yes. That file is also using other dfs-based functions which should improve. But I don't know what fraction of time GHC spends on these graph operations, so I can't guess by how much compilation times would be affected.

> Benchmarks based on real applications would be extra cool.

That would be good. Do we have such sample graphs? If you could point me to one I can try to add it. If not, I say we add the random graphs for now.

New benchmarks
  buildG
    n=100,m=1000:     OK (0.18s)
      7.98 μs ± 694 ns,  30 KB allocated, 115 B  copied,  27 MB peak memory
    n=100,m=10000:    OK (0.38s)
      76.7 μs ± 6.0 μs, 240 KB allocated, 6.7 KB copied,  27 MB peak memory
    n=10000,m=100000: OK (0.45s)
      1.64 ms ±  56 μs, 3.0 MB allocated, 958 KB copied,  29 MB peak memory
  graphFromEdges
    n=100,m=1000:     OK (0.67s)
      146  μs ± 8.5 μs, 415 KB allocated, 1.5 KB copied,  29 MB peak memory
    n=100,m=10000:    OK (0.21s)
      1.42 ms ± 139 μs, 3.8 MB allocated, 108 KB copied,  29 MB peak memory
    n=10000,m=100000: OK (0.11s)
      33.7 ms ± 2.9 ms,  73 MB allocated, 4.3 MB copied,  30 MB peak memory
  dfs
    n=100,m=1000:     OK (0.21s)
      16.8 μs ± 1.6 μs, 173 KB allocated, 255 B  copied,  35 MB peak memory
    n=100,m=10000:    OK (0.36s)
      143  μs ± 7.1 μs, 1.6 MB allocated, 2.6 KB copied,  35 MB peak memory
    n=10000,m=100000: OK (0.68s)
      10.6 ms ± 536 μs,  17 MB allocated,  11 MB copied,  40 MB peak memory
  dff
    n=100,m=1000:     OK (0.23s)
      18.7 μs ± 1.7 μs, 195 KB allocated, 284 B  copied,  40 MB peak memory
    n=100,m=10000:    OK (0.19s)
      145  μs ±  12 μs, 1.6 MB allocated, 2.6 KB copied,  40 MB peak memory
    n=10000,m=100000: OK (0.19s)
      11.2 ms ± 783 μs,  20 MB allocated,  12 MB copied,  40 MB peak memory
  topSort
    n=100,m=1000:     OK (0.39s)
      19.0 μs ± 1.5 μs, 204 KB allocated, 296 B  copied,  40 MB peak memory
    n=100,m=10000:    OK (0.21s)
      152  μs ±  11 μs, 1.6 MB allocated, 2.6 KB copied,  40 MB peak memory
    n=10000,m=100000: OK (3.04s)
      11.5 ms ± 198 μs,  20 MB allocated,  14 MB copied,  41 MB peak memory
  scc
    n=100,m=1000:     OK (0.30s)
      56.5 μs ± 5.1 μs, 572 KB allocated, 2.4 KB copied,  41 MB peak memory
    n=100,m=10000:    OK (0.32s)
      543  μs ±  21 μs, 4.9 MB allocated, 154 KB copied,  41 MB peak memory
    n=10000,m=100000: OK (0.44s)
      29.0 ms ± 1.8 ms,  57 MB allocated,  32 MB copied,  46 MB peak memory
  bcc
    n=100,m=1000:     OK (0.21s)
      78.8 μs ± 5.3 μs, 690 KB allocated, 6.0 KB copied,  46 MB peak memory
    n=100,m=10000:    OK (0.45s)
      384  μs ±  24 μs, 3.4 MB allocated,  14 KB copied,  46 MB peak memory
  stronglyConnCompR
    n=100,m=1000:     OK (0.29s)
      228  μs ±  22 μs, 991 KB allocated, 9.1 KB copied,  46 MB peak memory
    n=100,m=10000:    OK (0.35s)
      2.46 ms ± 215 μs, 8.7 MB allocated, 1.0 MB copied,  46 MB peak memory
    n=10000,m=100000: OK (0.24s)
      75.7 ms ± 2.9 ms, 133 MB allocated,  41 MB copied,  52 MB peak memory

@treeowl
Contributor

treeowl commented Dec 19, 2022

I don't, no, and I agree we should add what we have now.

@treeowl
Contributor

treeowl commented Dec 19, 2022

Would you like both commits preserved, or would you like to squash them, or would you like me to squash them in merge?

@meooow25
Contributor Author

No point in preserving them, I think. If you could squash them, that would be great.

@treeowl merged commit 468aa9d into haskell:master Dec 19, 2022
@treeowl
Contributor

treeowl commented Dec 19, 2022

Many thanks!
