Team R Rusty dominates the field at 2019 DataFest, a 48-hour number-crunching frenzy

6 November 2019

Penn State’s DataFest, an intense annual 48-hour number-crunching competition, saw a whopping 202 registrants in 2019. During the first weekend in April, teams of undergraduate and master’s degree students filled the classrooms in Thomas Building in the Eberly College of Science, ready to work through 3 million lines of code and find the best way to analyze the data.

Since its inaugural year in 2011 at the University of California, Los Angeles, the American Statistical Association’s DataFest has grown to 41 institutions, Penn State among them. Penn State’s DataFest began at University Park in 2013, pioneered by Stefanie Austin, instructor in statistics, and its leadership passed in 2015 to Kathy Smith, web manager for the statistics department.

Every year a different organization or company partners with DataFest, offering data to analyze. The Los Angeles Police Department provided arrest records for the first year of DataFest, while this year’s data were provided by the Canadian Women’s National Rugby Team.

Luwei Lei, Brian Mata, Yuxin Zhang, and Michael Callahan. Credit: Kathy Smith

Team R Rusty, consisting of Michael Callahan (mathematics), Yuxin Zhang (statistics), Luwei Lei (data sciences), and Brian Mata (computer science), took home Penn State’s 2019 title. They analyzed in-game performance data to evaluate point differentials between Canada and opposing teams, as well as the relationship between athletes’ performance and their preparation and training conditions prior to games. Team R Rusty ultimately found that the psychological state of athletes had a greater influence on performance than the measured physical metrics.

“Sleep and sleep quality emerged as highly influential metrics for athlete preparation since it was associated with improved psychological—as well as physiological—outcomes,” said Matthew Beckman, director of undergraduate programs in statistics. “Furthermore, the winning team described how to identify specific athletes for targeted attention to reduce stress, frustration, and other indicators that may adversely affect their performance on the field.”

While sleep is a major factor in the performance of the Canadian rugby team, it was often neglected by the dedicated DataFest participants.

“We’ve had students arrive first thing in the morning, and they don’t leave until after midnight, so even though the building is locked they’re still here working,” noted Statistics Programmer and Analyst Robert P. Carey III.

Carey joked that while the competitors were analyzing their assigned metrics, it was fitting that the statistics department surveyed the competitors themselves. Students from Penn State World Campus, Brandywine campus, and University Park all participated. Of the 12 colleges at University Park, three-quarters had students participating in DataFest, with the majority from the College of Engineering and the Eberly College of Science.

Are you interested in getting involved in the DataFest challenge? Undergraduate and master’s degree students at Penn State and surrounding institutions are encouraged to apply and can sign up in teams of three to five or be assigned to an existing team. Learn more about this free annual event at datafest.psu.edu or follow DataFest on Twitter @PSUdatafest to find out when registration for 2020 DataFest is live.