3:30 PM
4:30 PM
Abstract: With the increasing availability of electronic health records (EHR) data, it is important to effectively integrate evidence from multiple data sources to enable reproducible scientific discovery. However, practical challenges in data integration remain, such as protecting data privacy, the high dimensionality of features, and heterogeneity across datasets. Aiming to facilitate efficient multi-institutional data analysis without sharing individual patient data (IPD), we developed a toolbox of Privacy-preserving Distributed Algorithms (PDA) that conduct distributed learning and inference for various models, including association analyses, causal inference, cluster analyses, and counterfactual analyses. Our algorithms do not require iterative communication across sites and can account for heterogeneity across hospitals. The validity and efficiency of PDA are demonstrated with real-world use cases in Observational Health Data Sciences and Informatics (OHDSI), PCORnet networks including PEDSnet and OneFlorida, and the Penn Medicine Biobank (PMBB).
Speaker Bio: Yong Chen is a tenured Professor of Biostatistics and the Founding Director of the Center for Health Analytics and Synthesis of Evidence (CHASE) at the University of Pennsylvania. He is an elected fellow of the American Statistical Association, the International Statistical Institute, the Society for Research Synthesis Methodology, the American College of Medical Informatics, and the American Medical Informatics Association. He founded the Penn Computing, Inference and Learning (PennCIL) lab at the University of Pennsylvania, which focuses on clinical evidence generation and evidence synthesis using clinical and real-world data. During the pandemic, Dr. Chen has been serving as biostatistics core director for a national multi-center study on Post-Acute Sequelae of SARS-CoV-2 infection (PASC), involving more than 9 million pediatric patients across 40 health systems.