-
Notifications
You must be signed in to change notification settings - Fork 185
Time and memory efficient Tarjan's algorithm for strongly connected components in a directed graph #1182
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1182 +/- ##
==========================================
- Coverage 99.62% 99.62% -0.01%
==========================================
Files 93 93
Lines 4218 4217 -1
==========================================
- Hits 4202 4201 -1
Misses 16 16 |
@sinhatushar is this implementation now faster than #1173 ? |
Yes, this implementation is slightly faster than #1173. Tarjan's algorithm in theory is faster than Kosaraju's algorithm. However the current implementation(Tarjan's algorithm) wasn't working as fast as expected which has been fixed in this PR. |
Slightly faster |
src/connectivity.jl
Outdated
index = zeros(T, nvg) # first time in which vertex is discovered | ||
stack = Vector{T}() # stores vertices which have been discovered and not yet assigned to any component | ||
stack = T[] # stores vertices which have been discovered and not yet assigned to any component |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have this convention in LightGraphs, that for creating empty arrays of a certain type we use
Vector{T}()
The outcome is the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
src/connectivity.jl
Outdated
|
||
for s in vertices(g) | ||
|
||
dfs_stack = T[] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still allocate a vector twice for each component. Once when creating component
and then another time, when we use reverse
.
I was willing to remove |
src/connectivity.jl
Outdated
julia> g=SimpleDiGraph(11) | ||
{11, 0} directed simple Int64 graph | ||
|
||
julia> edge_list=[(1,2),(2,3),(3,4),(4,1),(3,5),(5,6),(6,7),(7,5),(5,8),(8,9),(9,8),(10,11),(11,10)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you write julia> edge_list=[(1,2),(2,3),(3,4),(4,1),(3,5),(5,6),(6,7),(7,5),(5,8),(8,9),(9,8),(10,11),(11,10)];
(with a semicolon instead), then you won't get any output, so maybe we can shorten the unnecessary long output in this example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
This looks good to me, of course there would still be some ways to get some minor performance improvements, but the mayor bottleneck has been removed. |
@sbromberger , waiting for you to merge this if you find this PR suitable. @simonschoelly has approved these changes. |
src/connectivity.jl
Outdated
@@ -178,6 +178,9 @@ true | |||
""" | |||
is_weakly_connected(g) = is_connected(g) | |||
|
|||
|
|||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need all this whitespace here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
src/connectivity.jl
Outdated
index = zeros(T, nvg) # first time in which vertex is discovered | ||
stack = Vector{T}() # stores vertices which have been discovered and not yet assigned to any component | ||
stack = Vector{T}() # stores vertices which have been discovered and not yet assigned to any component |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indent not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
src/connectivity.jl
Outdated
@@ -263,7 +285,7 @@ function strongly_connected_components end | |||
break | |||
end | |||
end | |||
push!(components, reverse(component)) | |||
push!(components, reverse!(component)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit of a hack. It relies on the fact that reverse!
returns the modified vector, but strictly speaking, it doesn't HAVE to since it's a mutating function. Better to call reverse!
component on the line before and use component
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!!
…omponents in a directed graph (#1182) * Make changes to make the algorithm more efficient. * Add one more example. * Remove unnecessary comments. * Use Vector{T}() instead of T[] . * Use reverse!() instead of reverse(). * Fix typo and remove redundant empty line * Shorten the example. * Make minor changes. * Final changes.
* first examples * Improve code coverage (#1183) * ✅ Improve test coverage of euclidean_graph * ✅ Improve test coverage for cycle_basis * ✅ Improve test coverage of graphmatrices.jl * 🐛 Fix some bugs in tests * 📝 Add some examples for random graph generators (#1180) #1142 * Time and memory efficient Tarjan's algorithm for strongly connected components in a directed graph (#1182) * Make changes to make the algorithm more efficient. * Add one more example. * Remove unnecessary comments. * Use Vector{T}() instead of T[] . * Use reverse!() instead of reverse(). * Fix typo and remove redundant empty line * Shorten the example. * Make minor changes. * Final changes. * randgraphs * smallgraphs * simpleedge
Making some small changes makes the current implementation of strongly_connected_components() 4-5 times faster and requires 500 times lesser memory.
Acknowledgement : @simonschoelly helped me in figuring out what is wrong with the current implementation.
Some benchmarks :