Research Areas

Astrostatistics & Astroinformatics

Questions

  • How can collaborative research spanning astronomy, statistics, applied mathematics, and/or computer science enhance the capabilities of astronomical and astrophysical research?
  • How can advances in statistical and computational methodology significantly improve the accuracy, precision, or robustness of astronomical measurements and astrophysical models?
  • How can rapidly evolving computer hardware and innovative algorithms from adjacent disciplines like Data Sciences, Machine Learning, and Artificial Intelligence help improve the science coming from ground-based astronomical observatories such as the Atacama Large Millimeter/submillimeter Array (ALMA), the Hobby-Eberly Telescope Dark Energy Experiment (HETDEX), the IceCube Neutrino Observatory, and the Rubin Observatory (formerly LSST), as well as from space missions like the Kepler/K2 mission, the Transiting Exoplanet Survey Satellite (TESS), SPHEREx, the Roman Space Telescope, and the James Webb Space Telescope (JWST)?
  • How can we improve the sensitivity, accuracy, and robustness of exoplanet detections and the precision of mass and radius measurements given astronomical data with correlated "noise" due to stellar variability? CASt faculty are leading research in time series analysis, Gaussian process models, autoregressive modeling, efficient samplers for high-dimensional parameter spaces, and machine learning algorithms for the analysis of photometric time series and high-precision spectra.
  • How can we apply advanced statistical methods to large numbers of high-precision stellar light curves, such as those from Kepler/K2, TESS, and the Roman Space Telescope, to characterize the underlying drivers of stellar photometric variability?
  • How can advanced statistical and machine learning techniques be applied to high-energy neutrino datasets to distinguish bona fide neutrino interactions from cosmic rays, and to distinguish between the different neutrino flavors and interactions?
  • How can we accurately characterize astronomical populations while accounting for selection effects, detection limits, and measurement biases? CASt faculty are incorporating a variety of techniques such as survival analysis, hierarchical Bayesian modeling, Approximate Bayesian Computation (ABC), and emulators for characterizing the luminosity function of young stars, exoplanet populations, black hole populations, and the early universe.
  • What is the optimal method of extending the three-dimensional clustering analysis of galaxies (based only on their spatial distribution) to incorporate other observed properties such as their magnitudes, colors, or spectra?
  • How can we understand the formation history of the galaxy population as a whole using only noisy pictures of individual galaxies scattered across cosmic time?
  • How can modern statistical and computational methods be effectively communicated to the broader astronomical community?


 

The Center for Astrostatistics (CASt) is a leader in cross-disciplinary education and research, promoting the responsible use of modern statistical methodology among astronomers and astrophysicists. 
 

Current Research Projects

  • Professors Dawson and Ford's research groups are characterizing planetary populations from biased astronomical surveys and adapting modern statistical methods to compare exoplanet populations to predictions of planet formation models, using methods such as hierarchical Bayesian models, Approximate Bayesian Computation, and emulation.
     
  • Professor Feigelson’s research group is applying advanced time series methods to detecting transiting planets in the presence of correlated noise using data from Kepler, TESS, and other photometric surveys.
     
  • Professor Ford’s research group is developing and testing modern Scientific Machine Learning (SciML) methods to detect and measure the masses of planets from spectroscopic surveys despite astrophysical jitter.
     
  • Professor Jeong’s research group is developing advanced methods for characterizing the three-dimensional clustering of galaxies to test models for cosmology and galaxy formation.
     
  • Professor Leja’s research group is applying hierarchical Bayesian analysis, neural net emulation, and modern simulation-based inference methods to characterize the physics and history of galaxy formation.
     
  • Professor Tak’s research group is applying modern time-series analysis methods to multi-band photometric time series and adapting Machine Learning (ML) methods to accommodate specific needs of astronomical surveys, such as heterogeneous measurement uncertainties.
     

Education Milestones

  • CASt faculty created and regularly teach advanced undergraduate and graduate courses to train astronomers in data sciences (ASTRO 497), computational methods (ASTRO 527), high-performance scientific computing (ASTRO 528), and astrostatistics (ASTRO 415 & ASTRO 585). The three graduate courses can be counted towards the requirements for Penn State's Computational Science Graduate Minor. Compared to astronomy departments at other universities, Penn State has a remarkably strong curriculum in methodology. 
     
  • CASt faculty authored a leading Astrostatistics textbook, Modern Statistical Methods for Astronomy with R Applications.
  • CASt faculty have organized and developed curriculum for numerous advanced summer schools in Astrostatistics, Bayesian Computing, and Astroinformatics that have trained over 2000 astronomers worldwide. They have also served as leaders for the Statistical Challenges in Modern Astronomy series of international research conferences.
     

Advanced Computing Milestones

  • CASt faculty contributed to a successful NSF proposal that enabled Penn State to become a member of NSF’s Open Science Grid (OSG) Consortium, providing Penn State students with enhanced access to OSG.
     
  • Previously, CASt faculty helped lead a successful NSF proposal to establish a CyberLaboratory for Astronomy, Materials & Physics (CyberLAMP). The grant funded a large hybrid computer cluster, including NVIDIA P100 "Pascal" GPUs and Intel Xeon Phi "Knights Landing" processors. The cluster was deployed at Penn State's new data center in early 2017 and was recently retired, as the Institute for Computational & Data Science has acquired a new generation of GPUs.
     
