Title: Efficient Closest Community Search over Large Graphs
Authors: Mingshen Cai and Lijun Chang
Abstract: This paper studies the closest community search problem. Given a graph G and a set of query vertices Q, the closest community of Q in G is the connected subgraph of G that contains Q, is most cohesive (i.e., with the largest possible minimum vertex degree), is closest to Q, and is maximal. We show that this can be computed via a two-stage approach: (1) compute the maximal connected subgraph g0 ofG that contains Q and is most cohesive, and (2) iteratively remove from g0 the vertex that is furthest to Q and subsequently also other vertices that violate the cohesiveness requirement. The last non-empty subgraph is the closest community of Q in G. We first propose baseline approaches for the two stages that run inO(n + m) andO(n0 x m0) time, respectively, where n (resp. n0) and m (resp. m0) are the number of vertices and edges in G (resp. g0). Then, we develop techniques to improve the time complexities of the two stages into O(n0 + m0) and O(m0 + n0 logn0), respectively. Moreover, we further design an algorithm CCS with the same time complexity as O(m0 +n0 logn0), but performs much better in practice. Extensive empirical studies demonstrate that CCS can efficiently compute the closest community over large graphs.