Description
dfs
is the core function in Data.Graph
that all other algorithms (topSort
, scc
, bcc
, etc) are based on. It takes a Graph
and generates a [Tree Vertex]
.
dfs
makes use of generate
, prune
and chop
to generate a [Tree Vertex]
and chop it into the returned [Tree Vertex]
.
Other functions further transform this [Tree Vertex]
, for example topSort
flattens the [Tree Vertex]
into a [Vertex]
.
These intermediate trees are unnecessary.
- The trees generated by
generate
are immediately chopped into other trees. - For
topSort
, the trees returned bydfs
are immediately turned into a list.
There is no reason we need to create these trees in memory.
I propose combining the existing generate
, prune
, and chop
function into a new core function that eliminates the intermediate trees from generate
and also allows us to choose what to build from the DFS.
dfsGraph :: (Vertex -> a -> b) -> (b -> a -> a) -> a -> Graph -> [Vertex] -> a
Then we can define, for example
dfs = dfsGraph Node (:) []
topSort = dfsGraph <build the list directly>
Benchmarks on a random graph with 10,000 vertices and 100,000 edges:
dfs: OK (0.90s)
3.51 ms ± 79 μs, 57% less than baseline
topSort: OK (0.47s)
3.60 ms ± 107 μs, 59% less than baseline
Other function such as scc
and bcc
should also benefit, I've not implemented them yet.
Thoughts? I can clean up my code into a PR if this sounds good.