A Leader in Astrostatistics

In the 1980s, Jogesh Babu in the Department of Statistics and Eric Feigelson in the Department of Astronomy began collaborating to bring cutting-edge statistical methods to bear on important questions in astronomy. While there had been a few other collaborations between statisticians and astronomers, notably Neyman and Scott in the 1950s, the Babu-Feigelson collaboration is among the earliest and most sustained since the revolution in statistics brought about by computer-intensive methods like the bootstrap.

The Babu-Feigelson collaboration led to a 1996 cross-disciplinary monograph that gave rise to the name Astrostatistics, and later to the establishment of the cross-disciplinary Center for Astrostatistics (CASt) in 2003.

Over the years, the number of astrostatistics faculty has grown. Several faculty members in both departments are now actively engaged with this fast-growing interdisciplinary research area: Hyungsuk Tak (Statistics), Eric Ford (Astronomy), Joel Leja (Astronomy), Derek Fox (Astronomy), David Hunter (Statistics), Donghui Jeong (Astronomy), and Rebekah Dawson (Astronomy). The very popular Penn State Astrostatistics Summer School has, over close to two decades, educated hundreds of astronomers on statistical methods. Penn State has also hosted numerous summer astrostatistics workshops and conferences, bringing together leading researchers from around the world.


Distinguished Senior Scholar and Professor

Distinguished Professor and ICDS Co-Hire

Lee M. Hammarstrom-Dr. Keiko Miwa Ross Early Career Assistant Professor and ICDS Co-Hire

Shaffer Career Development Professor

Faculty and Student Research Collaborations

Identifying Galaxies with Unique Data Analysis

Assistant Professor Hyungsuk Tak's collaboration with graduate student Sarah Shy focuses on developing a new data analytic tool that quantifies classification uncertainties when statistical and machine learning methods, such as random forests and support vector machines, are applied in the unique context of astronomical data. They have successfully applied their method to identify a specific type of galaxy (high-redshift quasars) in a large-scale data set with millions of astronomical objects. The challenge is that the targeted objects are presumed to be dimmer than other objects, and therefore measured with more uncertainty.

Their goal is to scale up the method so that, in the near future, it can be applied to even larger rectangular data sets with billions of rows (astronomical objects) and hundreds of columns (properties). This matters because the Rubin Observatory Legacy Survey of Space and Time (known as LSST) will start monitoring the entire sky in a few years, producing terabytes of data per day.