Multi-threaded OpenMP-based Louvain algorithm for community detection.
The Louvain method is a greedy modularity optimization based agglomerative
algorithm by Blondel et al. that efficiently identifies high-quality
communities in a graph. With O(n \log n)
average time complexity (n
being
the number of vertices in the graph), it consists of two phases: local-moving
and aggregation. In the local-moving phase, each vertex greedily selects the
neighboring community with the highest modularity increase. The aggregation
phase merges vertices within a community into a super-vertex. This iterative
process produces a dendrogram hierarchy of community memberships, with the
top-level representing the final output.
It appears schedule(dynamic, 2048)
is the best choice.
It appears allowing a maximum of 20
iterations is ok.
Instead of using a fixed tolerance across all passes of the Louvain algorithm,
we can start with a high tolerance and then gradually reduce it. This is
called threshold scaling, and it helps minimize runtime as the first pass is
usually the most expensive. It appears a drop rate of 10
is ok.
As mentioned above, we can start with an initial high tolerance. It appears a
tolerance of 0.01
is suitable.
This controls when to consider communities to have converged based on the number
of community merges. That is if too few communties merged this pass, we should
stop here. Or, |Vsuper-vertices|/|V| >= aggegation tolerance
=> Converged.
It appears an aggregation tolerance of 0.8
might be the best choice.
When a vertex changes its community, its marks its neighbors to be processed. Once a vertex has been processed, it is marked as not to be processed. It helps minimize unnecessary computation.
It appears with vertex pruning we get better timings.
I originally used a simple vector2d for storing vertices belonging to each community. But this requires memory allocation during the algorithm, which is expensive. Using a prefix sum along with a preallocated CSR helps to to avoid repeated mallocs and frees.
It appears using prefix sum is quite faster than using vector2d.
I also originally used a vector2d based dynamic graph data structure for storing the super-vertex graph. Again, this requires memory allocation during the algorithm. Using two preallocated CSRs along with parallel prefix sum can help here.
It appears using preallocated CSRs is faster.
One can use std::map
s (C++ inbuilt map) as the hastable for Louvain. But
this has poor performance. So i use a key-list and a full-size values
vector (collision-free) we can dramatically improve performance. However if
the memory addresses of the hastables are nearby, even if each thread uses
its own hash table performance is not so high possibly due to false
cache-sharing (Close-KV). However if i ensure memory address are farther
away, the perf. improves (Far-KV).
It seems Far-KV has the best performance.
We combine the above optimizations and observe the performance of
OpenMP-based Louvain on a number of graphs. Observe that Static
Louvain completes in 6.2 s on sk-2005
(a graph with 3.8 billion edges).
Larger number of iterations/passes are required for graphs with lower average
degree (road networks such as asia_osm
and europe_osm
, and protein k-mer
graphs such as kmer_A2a
and kmer_V1r
).
Also note how graphs with poor community structure (such as com-LiveJournal
,
com-Orkut
) have a larger time / (n log n)
factor.
Finally note how most of the runtime of the algorithm is spent in the local-moving phase. Further, most of the runtime is spent in the first pass of the algorithm, which is the most expensive pass due to the size of the original graph (later passes work on super-vertex graphs).
The input data used for this experiment is available from the SuiteSparse Matrix Collection. This experiment was done with guidance from Prof. Kishore Kothapalli and Prof. Dip Sankar Banerjee.
- Fast unfolding of communities in large networks; Vincent D. Blondel et al. (2008)
- Community Detection on the GPU; Md. Naim et al. (2017)
- Scalable Static and Dynamic Community Detection Using Grappolo; Mahantesh Halappanavar et al. (2017)
- From Louvain to Leiden: guaranteeing well-connected communities; V.A. Traag et al. (2019)
- CS224W: Machine Learning with Graphs | Louvain Algorithm; Jure Leskovec (2021)
- The University of Florida Sparse Matrix Collection; Timothy A. Davis et al. (2011)