Name: Accurate inference of DNA methylation data: statistical challenges lead to biological insights
Start: 2019-02-07T20:30:00
End: 2019-02-07T21:30:00

Submitted by rpc5102 on Tue, 08/06/2019 - 13:58

stat

Accurate inference of DNA methylation data: statistical challenges lead to biological insights

Presented By

Keegan Korthauer, Harvard T.H. Chan School of Public Health

Details

Start Date: Thu, Feb 07, 2019, 3:30 PM

End Date: Thu, Feb 07, 2019, 4:30 PM

Location

Thomas Bldg

Record linkage (also called entity resolution or de-duplication) is the process of merging noisy databases to remove duplicate entities, which often lack a unique identifier. Linking data from multiple databases increases both the size and scope of a dataset, enabling post-processing tasks such as linear regression or capture-recapture to be performed. Any inferential or predictive task performed after linkage can be considered the "downstream task." Although recent advances have improved the flexibility and accuracy of record linkage, the downstream task remains limited by errors that propagate through this two-step process. In this talk, I present a generalized framework, called prototyping, for creating a representative dataset after record linkage for use in the downstream task. Given the representative records, I explore two downstream tasks: linear regression and binary classification via logistic regression. I also discuss how error propagation occurs in both settings. I provide thorough empirical studies of the proposed methodology and conclude with a discussion of practical insights from this work.

More about Andee: http://andeekaplan.com/