Sequence Count Data Analysis: A perspective grounded in model identifiability (Cancelled)
Location: 327 Thomas Building, University Park, PA
Start Date: Fri, Feb 25, 2022, 10:10 AM
End Date: Fri, Feb 25, 2022, 11:00 AM
Presented By
Justin Silverman
Event Series: SMAC Talks

Cancelled due to weather: https://www.psu.edu/news/story/university-park-person-classes-activities-and-campus-work-canceled-0

A common statistical problem is inference from positive-valued multivariate measurements where the scale (e.g., sum) of the measurements is not representative of the scale (e.g., total size) of the system being studied. This situation is common in the analysis of modern sequencing data. The field of Compositional Data Analysis (CoDA) axiomatically states that analyses must be invariant to scale. Yet many scientific questions rely on the unmeasured system scale for identifiability. In practice, many existing tools make a wide variety of assumptions to identify models, often imputing the unmeasured scale. Here, we analyze the theoretical limits on inference given these data and formalize the assumptions required to provide principled scale-reliant inference. Using statistical concepts such as consistency and calibration, we provide guidance on how to make scale-reliant inference from these data. We prove that the frequentist ideal is often unachievable and that existing methods can demonstrate bias and a breakdown of Type-I error control. We introduce scale simulation estimators and scale sensitivity analysis as a rigorous, flexible, and computationally efficient means of performing scale-reliant inference.