Science Journal, Winter 2016

Tip of the Iceberg: Life as an Interdisciplinary Statistical Scientist

8 December 2016
View of icebergs from above. Credit: Annie Spratt, Unsplash

What is happening to the West Antarctic ice sheet? This is a question Murali Haran may never have considered before he became a professor of statistics at Penn State, but it is now a question that consumes much of his time. Haran brings a blend of statistical modeling and computing methods to his teaching and research as a member of the statistics faculty in the Eberly College of Science.

“A lot of what I do is develop new statistical methods to address scientific questions,” said Haran. “Standard statistical methods rarely work when answering complex scientific questions, especially when this involves the analysis of complicated data sets.”

In modern scientific research, statistical modeling and computing play a central role. As a result, Haran is a valued research partner for investigators in fields ranging from climate and atmospheric science to infectious disease dynamics.

Statistics and Science: Linked from the Beginning

Working with scientists has been a goal for Haran since he was a graduate student at the University of Minnesota. His thesis focused on spatial models, which are statistical methods for data tied to locations, and on a class of algorithms called Markov chain Monte Carlo (MCMC), which are commonly used to fit Bayesian models to data. This background in MCMC and spatial models put him in a good position to work with scientists once he became a faculty member.
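
For readers curious about what an MCMC algorithm looks like in practice, the following is a minimal sketch in Python of one of the simplest variants, a random-walk Metropolis sampler. It is an illustration only, not Haran's own code or method. The data values, the prior, the step size, and the function names are all assumptions made up for this example, which estimates the mean of a handful of measurements under a Bayesian model with a normal prior.

```python
import math
import random


def log_posterior(mu, data, sigma=1.0, prior_sd=10.0):
    """Unnormalized log posterior for the mean mu of normally distributed data.

    Assumes a known measurement error sigma and a Normal(0, prior_sd) prior.
    """
    log_prior = -0.5 * (mu / prior_sd) ** 2
    log_lik = -0.5 * sum((y - mu) ** 2 for y in data) / sigma ** 2
    return log_prior + log_lik


def metropolis(data, n_iter=20000, step=0.5, seed=0):
    """Random-walk Metropolis sampler; returns every sampled value of mu."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_posterior(mu, data)
    samples = []
    for _ in range(n_iter):
        # Propose a small random move away from the current value.
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio); otherwise stay put.
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        samples.append(mu)
    return samples


data = [2.1, 1.7, 2.6, 2.3, 1.9, 2.4]  # made-up observations
samples = metropolis(data)
burned = samples[5000:]  # discard early draws taken before the chain settles
posterior_mean = sum(burned) / len(burned)
```

For this toy model the exact posterior mean is known to be 13.0 / 6.01, or about 2.16, so `posterior_mean` should land close to that value. The appeal of MCMC is that the same recipe still works for models where no such exact answer is available.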

“When I came to Penn State, my career began to flourish in terms of interdisciplinary work,” he said. “As a result, now people think of me as an interdisciplinary scientist.”

According to Haran, the field of statistics has always been intrinsically linked with other disciplines. He thinks that famous statistician John Tukey described it best: “The best thing about being a statistician is that you get to play in everyone’s backyard.”

Haran shares Tukey’s sentiment about this being the best part of his work. “I get to be curious,” he said. Also, the scientific problems lead to an endless supply of statistical challenges. “If you really get into the scientific questions, there is always a hard stats problem to be solved. A statistician will also often have some unique insights that can really help with thinking through the design and analysis for a scientific research problem.”

On-Campus Collaborations

Many of the research projects Haran is part of involve collaborations with Penn State scientists. His first project working with a Penn State scientist began more than eight years ago when Klaus Keller, professor of geosciences in the College of Earth and Mineral Sciences, contacted him.

“He was pointed out to me as a ‘smart, nice, and amazingly productive scientist,’” said Keller.

Keller was looking for someone with in-depth expertise in Monte Carlo methods, Bayesian statistics, and spatial statistics to help him with his research on climate change. Luckily for Keller, Haran had expertise in all three of those areas and they were able to establish an enjoyable and mutually beneficial collaboration.

“We analyze how observations of the Earth system can be used to provide mechanistic insights about the climate system, quantify uncertainties, and help to inform decision-making,” said Keller.

Eight years later, Keller and Haran are still working together. They have coadvised doctoral students, and their collaboration has involved graduates and undergraduates, postdoctoral fellows, and other research faculty. “He is careful, insightful, and a great collaborator and mentor,” said Keller.

Much of their early work involved developing statistical methods for using high-dimensional observational data to tune or “calibrate” complex computer models used to study various features of the climate. Haran and Keller have also begun collaborating on a new project studying future storm surges. This research will help decision-makers better plan for storm surge events, particularly in terms of building appropriate infrastructure.

According to Haran, “This is pretty important to assess, because if you can get a sense that in the future you expect to see more of these extreme events, you are more likely to need to build infrastructure to handle it.”

As a new assistant professor at Penn State, Haran was similarly contacted by Bryan Grenfell, a former Alumni Professor of Biology, to help with statistical modeling for infectious disease dynamics. Although Grenfell is no longer at Penn State, he introduced Haran to Distinguished Professor of Entomology and Biology Ottar Bjørnstad and Matthew Ferrari, an associate professor of biology at Penn State. Bjørnstad and Ferrari continue to collaborate with Haran on developing new statistical models for studying infectious disease dynamics.

In addition to Keller, Bjørnstad, and Ferrari, Haran conducts research with Dave Pollard, a senior scientist at Penn State’s Earth and Environmental Systems Institute, and Chris Forest, an associate professor of climate dynamics, both based in the College of Earth and Mineral Sciences.

Modeling the Future of Antarctic Ice Sheets

Haran’s work with Pollard and Forest involves studying changes to Antarctic ice sheets. Haran helps Pollard and Forest answer questions like “What is the future of the West Antarctic ice sheet?” and “How quickly is it melting?” using physical models in combination with statistical methods.

“As the ice sheet melts, it is likely to result in rising sea levels. Also, there are other ways in which the ice sheets affect climate. For instance, the ice sheet reflects sunlight, so if you get rid of it then less sunlight is reflected and we absorb even more heat. It’s a dangerous cycle called the ice-albedo positive feedback,” said Haran.

The team studies historical data about the ice sheet from the last 20,000 years.

“We are applying numerical models of continental ice sheets to Antarctica, simulating the changes from the last glacial maximum about 20,000 years ago to the present day. We perform large ensembles of several hundred runs with different parameter values, and Murali and his group use advanced statistical techniques to compare results with geologic data and improve the model,” said Pollard. 

Pollard and Forest, experts in climate and atmospheric science, can mathematically explain how the ice sheet behaves through this physics-based modeling. That, combined with satellite data, allows the team to create a model that mimics the past behavior of the ice sheet. The past behavior of the ice sheet, called a paleoreconstruction, is itself obtained from an analysis using sonar data and information from shallow sediment cores.

“You essentially kind of tune the model so that it behaves like the ice sheet has behaved in the past as well as the present,” Haran said. “Our challenge is to figure out how to capture all the information possible from paleoreconstructions, satellite data, and the ice sheet model, while carefully accounting for the many uncertainties in our observations and the ice sheet model.”
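
The idea of tuning a model against observations can be illustrated with a toy sketch in Python. This is not the team's ice sheet model or their statistical method, which are far more sophisticated. The one-parameter `toy_model`, the "observations," the assumed error level, and all names below are hypothetical stand-ins chosen only to show the general pattern: run the model at many parameter values, compare each run with the data, and give more weight to runs that match well.

```python
import math


def toy_model(melt_rate, years):
    """Hypothetical stand-in for a physical model: ice volume decays over time."""
    return [100.0 * math.exp(-melt_rate * t) for t in years]


years = [0, 5, 10, 15, 20]

# Made-up "observations": the toy model at melt_rate = 0.03 plus small fixed errors.
offsets = [0.5, -0.8, 0.3, -0.2, 0.6]
observed = [v + e for v, e in zip(toy_model(0.03, years), offsets)]
obs_sd = 1.0  # assumed standard deviation of the observation error

# Ensemble: run the model at 100 candidate parameter values (0.001 to 0.100).
ensemble = [0.001 * k for k in range(1, 101)]


def log_likelihood(melt_rate):
    """Gaussian log likelihood (up to a constant) of the observations."""
    sim = toy_model(melt_rate, years)
    return -0.5 * sum(((o - s) / obs_sd) ** 2 for o, s in zip(observed, sim))


# Turn log likelihoods into normalized weights (a flat prior over the ensemble).
log_w = [log_likelihood(m) for m in ensemble]
max_lw = max(log_w)  # subtract the maximum for numerical stability
weights = [math.exp(lw - max_lw) for lw in log_w]
total = sum(weights)
weights = [w / total for w in weights]

best = ensemble[weights.index(max(weights))]  # best-matching ensemble member
post_mean = sum(m * w for m, w in zip(ensemble, weights))  # weighted estimate
```

Because the weights are spread over several ensemble members instead of a single winner, they also convey how uncertain the estimate is, which is the kind of uncertainty accounting Haran describes.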

To do this, Haran has to work closely with the scientists who are experts in the subject.

Some of the statistical and computational challenges in these research projects come from the size of the data. But Haran says that the focus in the media and popular science on the term “big data” overlooks the fact that many of the major challenges in statistics and science arise from the new types of data being collected, like satellite data, paleo-data, and ice sheet model output.

Pollard enjoys working with Haran on this research. “He has an enthusiasm to understand the underlying physics and motivation for the physical ice-sheet models we use. He also has expert knowledge of the relevant statistical fields, and a great ability to communicate them clearly to non-statisticians.”

Predicting the Spread of Infectious Disease

Haran has also been able to apply his statistics and computing skill sets to help predict the spread of infectious disease. Working with Ottar Bjørnstad and Matthew Ferrari of Penn State’s Center for Infectious Disease Dynamics, Haran has been able to delve into a world of science that differs greatly from the physics and atmospheric science behind climate.

“Infectious disease dynamics is a very different research area from climate science, but the beauty is that because I’m a statistician, I can work with infectious disease modelers on their problems as well,” he said.

The types of questions asked in this research range from “What types of patterns can we expect in disease transmission?” to “What kinds of insights can we get about the number of people infected, when they were infected, and where?” The team is also interested in learning whether the movement rates of infected people change across seasons or other periods of time, such as school terms.

Ferrari first began working with Haran five years ago with a project on meningitis. “We started working together on spatial patterns of meningitis outbreaks in Nigeria,” said Ferrari. “We’re currently working on a project to estimate the burden of rotavirus in Niger.”

“They have ways to study how these diseases spread, and they have data on how the diseases spread, so I help them put these pieces together,” Haran said.

He and Bjørnstad have studied measles dynamics and collaborated on vaccine strategies, contributing to work for a Bill and Melinda Gates Foundation grant on the topic.

The insights obtained from asking questions about disease transmission can lead to a greater understanding of how a particular disease spreads, according to Haran, and these insights can be surprising. For example, geography plays a big role in disease transmission, but perhaps not in the way traditionally assumed.

“For measles transmission, big cities are essentially the most important factor in starting new outbreaks in smaller cities; infected individuals in adjacent small cities are perhaps less important,” said Haran. “This is an example of the kinds of things that would be hard to answer if you did not have a team of infectious disease modelers who also work with statisticians.”

Innovating Statistical Theory and Mentoring

For Haran, the interdisciplinary research projects also contribute new problems to his research on statistical methods and computational algorithms.

“The cool thing about the applied research is that it generates more methodological questions for me to study,” he said. “The applied work drives new methodological work.”

Ferrari sees Haran’s skills in bridging the gap between statistics and science trickle down to Haran’s students: “He’s a great translator. There is the language of biological models and the language of computational statistics. Murali speaks both languages well. That’s great for getting projects started. But it is also crucial for guiding students through their work.”

Haran considers this kind of training in science communication essential to pass on to his students. He encourages his statistics graduate students to take classes with the students and faculty they are collaborating with, so that they gain a better understanding of the research project as a whole.

“It’s pretty important to me that they know what they are talking about, that they use the right terminology and language,” Haran said. “It is crucial that the students not only know about statistical methods but that they know how to talk to scientists. This is a skill that takes a long time and hard work to develop.” 

Haran enjoys seeing his students progress during their graduate education. “I enjoy seeing them learn to translate statistics and mathematical knowledge into something that solves scientific problems. It is also rewarding to see them develop the skills necessary to communicate with scientists throughout the research project.”

Haran says that it is easy to convince undergraduate and graduate students of the excitement of modern interdisciplinary statistics research.

“I tell them, as statisticians, we can team up with scientists to solve really difficult problems. Figuring out how to use available information to answer tough scientific questions requires expertise in statistical modeling and computing, and there is lots of room for creativity. We are unlikely to be out of work! And when we want to try something new, we can switch our area of focus because we can contribute to a huge number of interesting scientific disciplines.”

Never Bored

While Haran never knows what interesting research project will come his way, there’s one thing he knows for sure: he will continue to pursue as much collaborative research with scientists as his time allows.

“Working on scientific problems has clarified a lot of my own thinking in statistics,” he said. “It has affected my thinking about what statistics is and what it can do. I have a much deeper understanding for and appreciation of my field and its power.”

He hasn’t had trouble finding new research topics yet, and it’s unlikely that he will in the future. As scientific research progresses, the data sets are becoming richer, and the scientific questions are becoming more sophisticated. 

“There’s no shortage of problems,” said Haran. “It’s an embarrassment of riches.”