Agent-Based Models (ABMs) are used in several fields to study the evolution of complex systems from micro-level assumptions. Often, some of their micro-level variables cannot be observed in empirical data. Such latent variables make it difficult to initialize an ABM so that it can track and forecast empirical time series. In this paper, we propose a protocol to learn the latent variables of an ABM. We show how a complex ABM can be reduced to a probabilistic model with a computationally tractable likelihood. This reduction can be abstracted into two general design principles: balancing stochasticity and data availability, and replacing unobservable discrete choices with differentiable approximations. We showcase our protocol by applying it to an ABM of the housing market, in which agents with different incomes bid higher prices to live in high-income neighborhoods. We show that the resulting model preserves the general behavior of the ABM while allowing the estimation of latent variables through the optimization of its likelihood. In synthetic experiments, we show that we can learn the latent variables with good accuracy, and that our estimates make out-of-sample forecasting more precise than alternative benchmarks. Our protocol can be seen as an alternative to black-box data assimilation methods: it forces the modeler to lay bare the assumptions of the model, reason about the inferential process, and spot potential identification problems.
Authors: Corrado Monti, Marco Pangallo, Gianmarco De Francisci Morales, Francesco Bonchi
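The second design principle mentioned in the abstract, replacing unobservable discrete choices with differentiable approximations, can be illustrated with a minimal sketch. The code below is a generic, hypothetical example (the function names and the temperature parameter are not from the paper): a hard argmax over agents' utilities, as an ABM might use to pick the best neighborhood, is relaxed into a softmax distribution, which is differentiable and therefore compatible with likelihood optimization.

```python
import numpy as np

def hard_choice(utilities):
    """Non-differentiable discrete choice: pick the option with the
    highest utility (e.g., the neighborhood an agent bids on)."""
    return int(np.argmax(utilities))

def soft_choice(utilities, temperature=1.0):
    """Differentiable relaxation: a softmax distribution over options.
    As temperature -> 0 it concentrates on the argmax; larger values
    keep gradients informative during likelihood optimization."""
    z = np.asarray(utilities, dtype=float) / temperature
    z -= z.max()          # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

# Usage: the soft probabilities sum to 1 and put the most mass
# on the same option the hard choice would select.
utilities = [1.0, 2.0, 0.5]
probs = soft_choice(utilities, temperature=0.5)
```

At low temperature the relaxed model behaves almost like the original discrete ABM, while remaining smooth enough for gradient-based estimation of latent variables.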