Training a machine learning model, according to Professor of Statistics and Mathematics Bharath Sriperumbudur, is a lot like teaching a class. An instructor provides practice questions and answers and then, during an exam, only gives questions. With supervised learning, a researcher first provides the model with labeled data, inputs paired with known outputs, that the model trains on. Then, after training, researchers give the model unlabeled data and expect it to produce outputs on its own. The quality of the model, or algorithm, is tied to both its speed and its ability to produce answers with few errors.
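As a minimal illustration of that train-then-test workflow (a generic sketch, not drawn from Sriperumbudur's own work), the Python example below uses the scikit-learn library: the model is fit on labeled examples and then asked to predict a label for data it has never seen. The feature values and labels here are made up purely for illustration.

```python
from sklearn.linear_model import LogisticRegression

# "Practice questions and answers": labeled training data.
X_train = [[2.0, 40], [1.0, 25], [3.5, 60], [0.5, 20]]   # inputs (hypothetical features)
y_train = [1, 0, 1, 0]                                    # outputs (labels) supplied by the researcher

model = LogisticRegression()
model.fit(X_train, y_train)      # training: the model learns from input-output pairs

# "The exam": unlabeled data, where the model must produce the output on its own.
X_new = [[2.5, 45]]
print(model.predict(X_new))
```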
“A lot of my research is spent finding algorithms that work efficiently while reducing the amount of error,” he said.
As the complexity of a model increases, so too do the number of statistical challenges and the possibility for errors. One way model complexity increases is through a growing number of variables of interest, also called dimensions. Using employee data as an example, these could include an employee’s level of education, age, geographic region, gender, or salary. These five variables, however, cannot be plotted on the two-dimensional graph with an x- and y-axis taught in grade school; they would instead require a five-dimensional space. Reducing dimensions by ignoring some of the data or using statistical methods would result in a loss of information. Sriperumbudur is particularly interested in a type of machine learning that relies on a technique called the kernel method, which is one strategy to deal with high dimensions.
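To make the idea of dimensions concrete, here is a small, hypothetical sketch: each employee becomes one point in a five-dimensional space, with the categorical variables encoded as numbers purely for illustration.

```python
import numpy as np

# Each employee is described by five variables, i.e., one point in a 5-dimensional space.
# Columns (hypothetical encoding): education level, age, region code, gender code, salary.
employees = np.array([
    [3, 34, 2, 0, 72000],
    [1, 51, 4, 1, 58000],
    [2, 29, 1, 0, 64000],
])
print(employees.shape)  # (3, 5): three employees, five dimensions each
```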
“The kernel method is a bit counterintuitive, because we actually map the data into a higher dimensional space,” he said. “But there, we can do the linear math that we know so well. It’s the same as doing more complicated, non-linear math in the original space. I can teach these methods to undergraduates because they understand linear mathematics. And this method gives you very powerful algorithms, and the power of the method depends on what sort of dimensional space you map it to.”
In this higher dimensional space, an algorithm essentially looks at the similarity of data points, and the function that computes this similarity is called the kernel.
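A classic, minimal illustration of this idea, assuming a degree-2 polynomial kernel (one of many possible kernels, not necessarily one Sriperumbudur uses): explicitly mapping two points into a higher-dimensional space and taking their inner product there gives exactly the same similarity value as evaluating the kernel directly in the original space.

```python
import numpy as np

def phi(v):
    """Explicit map from a 2-D input to a 6-D feature space."""
    x1, x2 = v
    return np.array([x1**2, x2**2,
                     np.sqrt(2) * x1 * x2,
                     np.sqrt(2) * x1,
                     np.sqrt(2) * x2,
                     1.0])

def poly_kernel(u, v):
    """Degree-2 polynomial kernel: the same similarity, computed in the original 2-D space."""
    return (np.dot(u, v) + 1.0) ** 2

x = np.array([3.0, 1.0])
y = np.array([0.5, 2.0])

print(np.dot(phi(x), phi(y)))   # inner product after mapping to the 6-D space: 20.25
print(poly_kernel(x, y))        # identical value, with no explicit mapping needed: 20.25
```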
“The beauty of this method is that the data that you are going to map into higher dimensional space doesn’t have to be single points,” Sriperumbudur said. “It can be functions—like how someone’s blood pressure has changed over time—or it could be graphs, or it could even be texts. You can do clustering in this space, or regression, and you can even do traditional statistical hypothesis testing in this space. It makes challenging math much more straightforward and powerful.”
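One well-known example of hypothesis testing with kernels is the two-sample test based on the maximum mean discrepancy (MMD), which measures how far apart two data sets look after they are mapped into the higher-dimensional space. The sketch below is a minimal, self-contained estimate of the squared MMD using a Gaussian kernel; the sample sizes and kernel width are arbitrary choices made only for illustration.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Pairwise Gaussian (RBF) kernel values between rows of A and rows of B."""
    sq_dists = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-sq_dists / (2 * sigma**2))

def mmd2(X, Y, sigma=1.0):
    """Biased estimate of the squared maximum mean discrepancy between two samples."""
    K_xx = gaussian_kernel(X, X, sigma)
    K_yy = gaussian_kernel(Y, Y, sigma)
    K_xy = gaussian_kernel(X, Y, sigma)
    return K_xx.mean() + K_yy.mean() - 2 * K_xy.mean()

rng = np.random.default_rng(0)
same = mmd2(rng.normal(0, 1, (200, 2)), rng.normal(0, 1, (200, 2)))       # near zero
different = mmd2(rng.normal(0, 1, (200, 2)), rng.normal(1, 1, (200, 2)))  # clearly positive
print(same, different)
```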
According to Sriperumbudur, machine learning methods based on the kernel method, called support vector machines (SVMs), have been very popular since the 1990s. Although the earliest deep learning models also existed at this time, deep learning was initially less popular because the models were computationally intensive and could not support many layers in their artificial neural networks.
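As a rough illustration of a support vector machine in action (a generic example, not one from the story), the sketch below uses scikit-learn to fit an SVM with a Gaussian RBF kernel on a small synthetic data set whose two classes cannot be separated by a straight line in the original two-dimensional space.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two interleaving half-moons: not separable by a straight line in the original 2-D space.
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An SVM with an RBF kernel separates the classes by implicitly working
# in a higher-dimensional space, as described above.
model = SVC(kernel="rbf")
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # accuracy on held-out data
```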
“SVMs and the kernel method continue to be powerful tools; they are just a different perspective than deep learning,” Sriperumbudur said. “Classical statistics and machine learning are different approaches to an analytical problem. What works best all depends on the data. I tell my students to start out with simpler algorithms to see if they work, and then to move on to more complex algorithms if complexity is needed.”
Editor's Note: This story is part of a larger feature about artificial intelligence developed for the Winter 2026 issue of the Eberly College of Science Science Journal.