Label propagation is one of the fastest methods for community detection with near linear time complexity. It is a local method where each node interacts with its neighbors to change its own label. Unfortunately, it has two major drawbacks. The first i...
Label propagation is one of the fastest methods for community detection with near linear time complexity. It is a local method where each node interacts with its neighbors to change its own label. Unfortunately, it has two major drawbacks. The first is a bad propagation, sometimes leading to huge communities without meaning (the giant communities problem). The second is related to its instability. Trials of a label propagation algorithm rarely give the same result. We propose to use a more stable variant of label propagation with a core method attached in order to obtain a more deterministic algorithm. This implementation will be done in a parallel and distributed environment on Hadoop using the MapReduce framework in order to apply this method with graphs having millions of nodes and edges. The main contribution of this paper is to model a parallel and distributed algorithm to achieve this purpose. A case study of the algorithm proposed is described at the end of the article along with the comparison of our results with other well‐known algorithms.