Science Journal

Managing data privacy with math and statistics

Statistics postdoc uses math background to explore algorithms for differential privacy
4 June 2025
Shurong Lin. Credit: Michelle Bixby

Many important insights can be gained through the study of complex datasets, for example from census and medical data, but these datasets often contain sensitive personal information. So, how can scientists use the data while ensuring that personal privacy isn’t compromised? Statisticians like postdoctoral researcher Shurong Lin study data privacy and develop algorithms that allow vital information to be mined from big datasets using techniques that mitigate privacy risks under a framework of statistical privacy.

Lin joined the Department of Statistics in the Penn State Eberly College of Science last September, working with Aleksandra “Seša” Slavković, the college’s associate dean for research, a professor of statistics and public health sciences, and Dorothy Foehr Huck and J. Loyd Huck Chair in Data Privacy and Confidentiality. Lin is just beginning the process of developing a few research projects on data privacy, with a focus on differential privacy.

Differential privacy may sound intuitive, but its mathematical definition is quite specific. It refers to using mathematical methods to process potentially sensitive data so that the information contained in the data (including, but not limited to, summary statistics such as the mean, variance, or range) can be shared and used by researchers while limiting the likelihood that any one individual's data can be inferred. The combination of computer science, theoretical math, and statistics required in the field of differential privacy was a perfect fit for Lin, whose undergraduate degree is in math.
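For readers curious about that definition, the standard textbook statement of "pure" differential privacy is given here for reference. The article itself does not spell it out, so the symbols are assumptions introduced for illustration: a randomized algorithm M, a privacy parameter ε, and two datasets D and D′ that differ in one person's record.

```latex
% Standard definition of \varepsilon-differential privacy (notation assumed,
% not taken from the article). M is a randomized algorithm; D and D' are any
% two datasets differing in a single individual's record; S is any set of
% possible outputs.
\[
  \Pr\bigl[M(D) \in S\bigr] \;\le\; e^{\varepsilon}\,\Pr\bigl[M(D') \in S\bigr]
\]
```

A smaller ε means the output distribution changes less when any one person's record is changed, which is a stronger privacy guarantee. Relaxed variants of this definition exist and are not covered here.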

“As a child, I always loved science, and math was my favorite subject at school,” she said. “I didn’t really know about statistics until it came up in some of my upper-level math courses at university. I was drawn in by the idea of being able to use my love of math to address real-world problems, instead of just working out theories on paper.”

After earning her undergraduate degree at Wuhan University in China, Lin worked briefly in industry as a data scientist, but she knew that she wanted to pursue her education further. She applied to doctoral programs in the US and found a good fit at Boston University.

At Boston University, Lin's research primarily focused on integrating differential privacy into various statistical tasks. In collaboration with her academic adviser, she worked with computer scientists specializing in privacy. Her dissertation, “Statistical Learning with Differential Privacy,” encompassed three projects and was supported by the US Census Bureau as part of its contribution to the initiative to deploy differential privacy for the 2020 Census.

“It’s about finding a balance where the statistical learning is still accurate enough for decision-making, while protecting sensitive information,” she said. “One typical way we do this is by adding noise to the data that is calibrated in such a way that general trends in the data are maintained without the ability to trace the information back to any specific participants who provided their data.”
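The calibrated-noise idea Lin describes can be illustrated with a minimal sketch of the Laplace mechanism, a common textbook technique. This is not Lin's own method or code. The function names, the clipping bounds, and the choice of releasing a mean are all assumptions made for illustration, and the sketch assumes the number of records is public.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng=None):
    """Release a noisy mean of `values` (hypothetical helper, for illustration).

    Assumptions: each value is clipped to [lower, upper], the dataset size
    is treated as public, and neighboring datasets differ by replacing one
    record. Under those assumptions the mean changes by at most
    (upper - lower) / n when one record changes.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if not values:
        raise ValueError("values must be non-empty")
    if lower >= upper:
        raise ValueError("lower must be less than upper")

    rng = rng or random.Random()
    n = len(values)

    # Clip so that no single record can move the mean by more than the bound.
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / n

    # Calibrate the noise: larger sensitivity or smaller epsilon means more noise.
    sensitivity = (upper - lower) / n
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    ages = [34, 45, 29, 61, 50, 38, 42, 57]
    print(private_mean(ages, lower=0, upper=100, epsilon=1.0))
```

The noise scale shrinks as the dataset grows, which is why general trends can survive in large datasets while any single participant's contribution stays masked.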

While at Penn State, Lin continues to work on data privacy and aims to expand her research focus to related areas, contributing to the development of responsible and trustworthy data science.

“For a lot of the work I’ve done in the past, the goal was to be able to release sanitized summary statistics,” she said. “Now, I am beginning to work with Seša on the notion of being able to release synthetic datasets that are a sanitized version of the entire dataset. Hopefully, with these safe versions of datasets, we can begin to ask and answer more questions with privacy guarantees.”

Because Lin is a postdoc, her time at Penn State is finite, and she plans to move on to a faculty position before long. Like graduate students, postdocs are transient at a university, but they are the engines that drive our research endeavors, and their future output is the product of our investment.

“I come from a small town in China and chose to pursue my dream of being a scientist and am very thankful to all those who have helped me along my journey,” she said. “My older brother was my inspiration. I used to sneak into his room to find books about science. Thankfully, he wasn’t too worried about privacy!”