A team led by researchers at Penn State is working to accelerate drug discovery, with the potential to treat rare diseases, by improving the National Institutes of Health’s (NIH) Biomedical Data Translator (Translator) — a network of computer interfaces that take biomedical research questions and provide fact-based responses.

David Koslicki, associate professor of computer science and engineering and of biology, is leading the team, which includes 24 researchers from Penn State; Oregon State University; the Institute for Systems Biology, a non-profit in Seattle; the Broad Institute of MIT and Harvard; and the Université Grenoble Alpes in France. NIH’s National Center for Advancing Translational Sciences (NCATS) recently awarded them $2.57 million for fiscal year 2025, with an anticipated total of $12.8 million over five years for the project.
“Translator is a clinician- or researcher-facing tool that can currently answer questions related to rare diseases or diseases with no known treatment, but we want Translator to be able to answer any and all biomedical questions,” said Koslicki, who is also affiliated with the Huck Institutes of the Life Sciences at Penn State. He explained that biomedical researchers or clinicians aiming to learn more about potential treatments for a patient can ask Translator for the latest information on a particular disease, learn more about certain genes and gene activity that could be causing illness, or search for drugs that might treat related diseases.
The project aims to expand the capabilities of the system and model that obtain information to produce an answer to researchers’ questions, no matter how complex. In doing so, Koslicki’s team aims to lower barriers to entry into Translator’s system by improving the transparency, organization and provenance of results, as well as to enhance the system’s performance for researchers.
“We want Translator to become what is akin to a question-and-answer agent similar to ChatGPT, but in contrast is actually based in fact where researchers or clinicians who input a question can trace their answers back to when the discovery was made and by whom, what research methods were used and more,” said Koslicki, who also chairs the Bioinformatics and Genomics Intercollege Graduate Degree Program at Penn State.
NCATS’ Cures Acceleration Network, which helps fund the new award, originally started the project in 2017, with the goal of developing a system that could aid researchers in accelerating translational research, such as discovering new relationships between drugs and treatments. Translator’s purpose is to accelerate the pace at which basic science and knowledge are translated into the clinic, including for rare diseases, Koslicki said.
Translator, which is strictly a research exploration tool, is powered by a network of interconnected computer systems that support public interfaces that answer researchers’ questions. Translator’s public user interface can answer questions such as “What chemicals can increase or decrease activities of gene X?”, “What gene’s activity may be increased/decreased by chemical Y?” and “What drugs treat conditions related to disease A/phenotype B?”
Translator’s actual programmatic/computer interface, which is also public, can answer almost any biomedical research question, but takes specialized knowledge to use. According to Koslicki, many computer scientists and programmers use this interface to ask their questions.
The output or fact-based answer on any of the interfaces includes information like symptoms, drugs, existing medical devices, biological pathways and relationships, current treatments that have the potential to treat diseases they weren’t originally intended to treat and scientific literature explaining Translator’s answers.
Koslicki’s team is incorporating and streamlining more data sources into the knowledge bases; improving standards and how users engage with Translator; and helping expand the public user interface’s system to answer arbitrary questions from the larger Translator network’s databases. The team also plans to gather feedback from clinicians and researchers to guide improvements to the performance and accessibility of Translator’s system.
“We are expanding and improving the capabilities of this space, which is geared towards biomedical translational research, to do things like explain how two biomedical concepts might be mechanistically related,” Koslicki said. “How do we translate science into something that will actually help the patients, and do it quickly?”
Researchers in academia and in industry are already making use of Translator. For example, Koslicki pointed to Every Cure, a non-profit startup specifically designed to apply Translator to advance drug repurposing, move those discoveries into clinical trials and, eventually, patient care. Clinicians at the University of Alabama at Birmingham’s Hugh Kaul Precision Medicine Institute also use Translator to identify potential conditions or possible drug repurposing options.
Koslicki’s team also aims to consolidate current and new scientific discoveries and knowledge from diverse fields into one large database, standardizing them not only to increase interdisciplinary collaboration, but also to give Translator the most up-to-date science for answering clinicians’ questions via the interface.
“Data and knowledge continue to be produced at breakneck speeds,” he said. “There’s always new discoveries and new studies happening. We want to continue to bring in more data and incorporate every publicly accessible bit of biomedical knowledge and research article into our system.”
The researchers at Penn State, based in Koslicki’s lab, specialize in developing the machine learning and artificial intelligence approaches to understand and answer questions put into Translator. Other team members are focusing on updating Translator with more knowledge, like information about drug treatments and interactions, and ensuring that it is ready for use.
Additional teams are working on the project, focusing on automating infrastructure, improving the user interface and leveraging scientific literature to explain drugs, diseases or symptoms and potential relationships.
“This has been one of the most exciting projects I have been involved with because it is already realizing this kind of broader impact,” Koslicki said. “We want to make science more rapid to actually help and improve people’s lives and we are building the infrastructure to make that possible.”
Christina Grozinger, director of the Huck Institutes of the Life Sciences, echoed Koslicki’s enthusiasm.
“Our goal at the Huck Institutes of the Life Sciences is to integrate science across scales and disciplines, to accelerate scientific discoveries and channel research into solutions,” said Grozinger, who is also the Publius Vergilius Maro Professor of Entomology in the Penn State College of Agricultural Sciences. “The Translator program is a powerful example of this, where AI is being leveraged to synthesize these massive biomedical data sets to help both researchers and clinicians access knowledge and develop new tools to treat diseases. This type of research is only possible through collaborations that bring together teams of researchers and practitioners.”
In addition to Koslicki, Penn State contributors to this phase of the project include Stephanie Won, graduate student in biology; Mohsen Taheri, Kevin Vizhalil, Venkata Matta and Tejasree Parasa, all of whom are graduate students in computer science; Chunyu Ma and Max Lupey, postdoctoral scholars in computer science; and Shannon Ranio, Strategic Interdisciplinary Research Office program manager.