3:30 PM
4:30 PM
One of the fundamental problems in network data analysis is community detection, which aims to partition nodes into cohesive communities. The stochastic block model, along with its variants, is one of the most studied statistical models for this purpose. Directly maximizing the stochastic block model likelihood on large-scale networks is known to be challenging. In this talk, I will discuss a pseudo-likelihood approach built on a new idea, “label decoupling,” which permits alternating maximization and can efficiently handle networks with up to millions of nodes. The proposed method has a provable convergence guarantee and enjoys good statistical properties. I will also briefly discuss testing for the number of communities in a stochastic block model and, finally, illustrate the usefulness of our methods through an analysis of international trade data.