Outside of homework assignments and exams, instructors can assess student learning in a variety of ways. One such method, called formative assessment, provides a low-stakes way to share feedback with students during the learning process and to gauge student understanding while there are still opportunities to address knowledge gaps.
“In statistics, we often provide short answer writing prompts, where students can interpret problems, consider multiple paths to a solution, and formulate answers in their own words,” said Matthew Beckman, associate research professor of statistics and director of the national organization CAUSE (Consortium for the Advancement of Undergraduate Statistics Education). “Especially for formative assessments, it is important that we provide students with timely feedback so they can improve their learning while they still remember the thought process behind what they wrote. But this can be particularly challenging in large-enrollment classrooms, which sometimes have two or three hundred students.”
Beckman, who has a background in statistics and in educational psychology, and his colleagues Dennis Pearl, research professor of statistics, and Rebecca Passonneau, professor of computer science and engineering, received a grant from the National Science Foundation in 2023 to develop, test, evaluate, and refine a new tool to help instructors provide timely and detailed feedback in large classrooms. The team is using techniques from natural language processing, a field of computer science and artificial intelligence that uses machine learning to allow computers to analyze, recognize, and generate text.
Beckman said that when grading written assignments, instructors inevitably start to see patterns in how students respond. At some point, for efficiency and consistency, they may copy and paste the same feedback to students who have demonstrated the same level of learning.
“From an algorithmic perspective, there are two filters going on here that an instructor is doing in a manual human way, but that computers can be programmed to do pretty quickly,” he said. “The first is classification, an overall judgment of the quality of the response. Does the response demonstrate a complete understanding of the material, is it partially correct, or is it incorrect? The second part is to take those partially correct responses and cluster them into responses that show similar levels of understanding.”
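To make those two filters concrete, here is a minimal sketch of what such a pipeline could look like in Python using scikit-learn: a classifier assigns each response an overall quality label, and the partially correct responses are then clustered by textual similarity. The TF-IDF features, logistic regression model, k-means clustering, and all of the example answers are illustrative assumptions; the story does not describe the team's actual models.

```python
# Illustrative sketch of the two "filters": classify each response as
# correct / partially correct / incorrect, then cluster the partially
# correct ones so similar misconceptions can receive the same feedback.
# All models, features, and example text here are assumptions, not the
# research team's actual implementation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Hypothetical instructor-labeled sample answers at each quality level.
sample_answers = [
    "The p-value is the probability of data at least this extreme if the null is true.",
    "The p-value tells you the probability that the null hypothesis is true.",
    "The p-value is just the significance level.",
]
sample_labels = ["correct", "partial", "incorrect"]

vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(sample_answers)

# Filter 1: classification into overall quality levels.
classifier = LogisticRegression(max_iter=1000)
classifier.fit(X_train, sample_labels)

student_responses = [
    "It's the chance of seeing results this extreme assuming the null holds.",
    "It measures how likely the null hypothesis is.",
    "It's the cutoff for statistical significance.",
    "The probability that the null is correct given our data.",
]
X_students = vectorizer.transform(student_responses)
predicted = classifier.predict(X_students)
print(list(zip(student_responses, predicted)))

# Filter 2: cluster the partially correct responses by similarity.
partial = [r for r, p in zip(student_responses, predicted) if p == "partial"]
if len(partial) >= 2:
    X_partial = vectorizer.transform(partial)
    clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X_partial)
    for response, cluster in zip(partial, clusters):
        print(f"cluster {cluster}: {response}")
```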
Beckman and his team provide the tool with a particular question, the grading rubric, and sample answers that demonstrate different levels of understanding. The tool translates student responses into data and clusters similar responses together. Human instructors then look at a few examples from each cluster, as well as responses that fall between clusters, and decide what kind of feedback each type of response should receive.
“The objective is to amplify what I would do as an engaged instructor so that I can provide timely, individualized feedback to as many students as possible; it’s never going to be left to run on its own,” Beckman said. “So far, the tools that we’ve been developing are not as consistent as experienced instructors, but they are at least as consistent as undergraduate student graders would be with one another.”
The team has pulled data from a wide range of universities to train and test their tools, but the tools have not yet been deployed in classrooms. Beckman also stressed that this tool is particularly promising for low-stakes formative assessments, where timely feedback is essential, but that it would need considerably more refinement before being used for grading.
“We don’t want to unleash something in the student learning environment until we know that it would contribute meaningfully,” Beckman said. “We started with a focus on statistics education because that is the field I am familiar with and we knew that statistics instructors would be a receptive audience to piloting the tool, but we believe this could be a useful tool in all areas of education.”
Editor's Note: This story is part of a larger feature about artificial intelligence developed for the Winter 2026 issue of the Eberly College of Science's Science Journal.