
The Microbiome Center announces a free bioinformatics resource for Penn State

The Kyoto Encyclopedia of Genes and Genomes now available for Penn State faculty, students, and staff
29 March 2023
David Koslicki (front) and the Penn State Microbiome Center have made the Kyoto Encyclopedia of Genes and Genomes freely available for Penn State faculty, students, and staff. Credit: Kelby Hochreither/Penn State.

The Penn State Microbiome Center, in coordination with the lab of David Koslicki, associate professor of computer science and engineering and of biology and a co-hire of the Huck Institutes of the Life Sciences at Penn State, is offering free access to the Kyoto Encyclopedia of Genes and Genomes (KEGG) to all Penn State faculty, staff, and students. The database can be accessed by logging in with Penn State credentials at the Penn State Microbiome Center KEGG mirror site.

KEGG is a human-curated database used across all major fields in the life sciences to predict and understand the plant, animal, and microbial metabolic pathways that drive the traits and biological impacts seen at the level of the organism, population, or community. It is a comprehensive resource that aims to provide a global view of molecular interactions in cells among genes, proteins, and small molecules. It contains a wide range of biological information, including genomes, biological pathways, diseases, drugs, and chemical substances, and is used in large-scale, big-data analyses ranging from metagenomics to metabolomics, genomics, and transcriptomics.

“I was interested in developing a tool to do metagenomic functional profiling, where we extract DNA from a community of microorganisms and then identify what genes are present in the sample to understand what the community is capable of doing,” said Koslicki. “To do this, I needed a reference database with information on all the genes we know about, how they are related to each other, and what they do. KEGG was the natural one to think about because it’s one of the oldest and most stalwart bioinformatics repositories, dating back to 1995, and it has been used for numerous discoveries.”

The database also includes information on gene expression, protein-protein interactions, and small molecule-protein interactions. It is important for understanding the biosynthesis of toxins and antimicrobial compounds, nutrient cycling, and much more.

“However, KEGG is a subscription-based service and it can be difficult to get funding to support access to these types of repositories,” said Koslicki. “It turned out that an institutional subscription didn’t cost much more than an individual subscription, so I reached out to Seth Bordenstein, the director of the Microbiome Center and Dorothy Foehr Huck and J. Lloyd Huck Endowed Chair in Microbiome Sciences, and he was extremely supportive of the idea of providing this resource not only to other members of the Microbiome Center, but to the broader Penn State community, as well. It is fantastic that Seth was immediately on board with this and could see the value it could potentially add by making it available to so many researchers across Penn State with a relatively small investment.”

In addition to its database resources, KEGG also provides a number of computational tools for analyzing and interpreting biological data. These tools include the KEGG Mapper, which allows users to visualize and analyze pathways and networks; the KEGG Orthology (KO) database, which provides information on the evolutionary relationships between proteins; and the KEGG Automatic Annotation Server (KAAS), which annotates newly sequenced genomes.

“The Microbiome Center is a community that goes to the next level as a standard bearer in research, education, and collaboration on our microbial world,” said Bordenstein, who is also professor of biology and of entomology. “With David’s foresight in the computational microbiome sciences and the Center’s priority to resource innovation, we are now enabling free access to a pre-eminent, digital database for integrated genomics, chemistry, and functional information across life’s biological systems. We are looking forward to the impacts of the large-scale content in KEGG on Penn State’s discoveries, papers, and grants.”