A statistical, reference-free algorithm subsumes diverse problems in genome science

14 October 2022

Julia Salzman
Departments of Biomedical Data Science, Biochemistry and Statistics
Stanford University

zoom recording

Abstract

I will discuss a unifying biological and statistical formulation for many fundamental problems in genome science. This formulation allows us to construct an algorithm that performs inference directly on raw reads, avoiding references completely. The talk will focus on the biological and probabilistic problem formulation and the statistical methodology that we have developed to solve it, using the "chalk-talk" style of presentation. I will then discuss the power of this approach for new data-driven biological discovery, with examples of novel single-cell-resolved, cell-type-specific isoform expression, including splicing and expression in the major histocompatibility complex, and of de novo prediction of viral protein adaptation, including in SARS-CoV-2.

References

  1. K Chaung, T Z Baharav, I N Zheludev, J Salzman, "A statistical reference-free genomic algorithm subsumes common workflows and enables novel discovery", bioRxiv 2022.06.24.497555, 2022.
