Community detection is an important task in network analysis, in which we aim to learn a network partition that groups together vertices with similar community-level connectivity patterns. By finding such groups of vertices with similar structural roles, we extract a compact representation of the network's large-scale structure, which can facilitate its scientific interpretation and the prediction of unknown or future interactions. Popular approaches, including the stochastic block model (SBM), assume edges are unweighted, which limits their utility by discarding potentially useful information. We introduce the weighted stochastic block model (WSBM), which generalizes the SBM to networks with edge weights drawn from any exponential family distribution. This model learns from both the presence and weight of edges, allowing it to discover structure that would otherwise be hidden when weights are discarded or thresholded. We describe a Bayesian variational algorithm for efficiently approximating this model's posterior distribution over latent block structures. We then evaluate the WSBM's performance on both edge-existence and edge-weight prediction tasks for a set of real-world weighted networks. In all cases, the WSBM performs as well or better than the best alternatives on these tasks.
Keywords: community detection; weighted relational data; block models; exponential family; variational Bayes
Journal Article. 11546 words. Illustrated.
Subjects: Mathematics ; Computer Science
Full text: subscription required