This talk focuses on Bayesian estimation of finite mixture models in a general setup where the number of components is unknown and allowed to grow with the sample size. An assumption on growing number of components is a natural one as the degree of heterogeneity present in the sample can grow and new components can arise as sample size increases, allowing full flexibility in modeling the complexity of data. This however will lead to a high-dimensional model which poses great challenges for estimation. We novelly employ the idea of data dependent prior in a Bayesian model and establish a number of important theoretical results. We first show that under mild conditions on the prior, the posterior distribution concentrates around the true mixing distribution at a near optimal rate with respect to the Wasserstein distance. Under a separation condition on the true mixing distribution, we further show that a better and adaptive convergence rate can be achieved, and the number of components can be consistently estimated. Furthermore, we derive optimal convergence rates for the higher-order mixture models where the number of components diverges arbitrarily fast. In addition, we suggest a simple recipe for using Dirichlet process (DP) mixture prior for estimating the finite mixture models and provide theoretical guarantees. In particular, we provide a novel solution for adopting the number of a cluster in a DP mixture model as an estimate of the number of components in a finite mixture model. Simulation study and real data applications are carried out demonstrating the utilities of our method.
For more information about Lizhen visit: https://www3.nd.edu/~llin4/