Alfred Schissler Abstracts

      Alfred Grant Schissler
      Ph.D. Candidate
      Statistics GIDP

     Joint Statistical Meetings
     Chicago, IL
     July 30th, 2016 - August 4th, 2016

Abstract

"Testing for differentially expressed pathways from within-subject matched pairs of RNA-seq data sets"

Existing approaches to personalized medicine rely on molecular data analyses across multiple patients. The path to precision medicine lies with molecular data analytics that can discover interpretable single-subject signals. We previously developed a framework, N-of-1-pathways, for single-subject mRNA expression data analysis. N-of-1-pathways quantifies and tests the statistical significance of differential pathway (gene set) expression using a pair of samples derived from a single patient under two conditions (e.g., unaffected tissue vs. tumor tissue). Here, we study operating characteristics (empirical size and power) for pathway testing using statistical methods pertinent to the N-of-1-pathways scenario. These include a basic, nonparametric, paired-sample test (the Wilcoxon signed-rank test), and also manipulation of the Mahalanobis distance of paired-sample gene expression points from a null-effect response line. We explore the methods for identifying differentially expressed pathways (DEPs) across various effect sizes, pathway sizes, and distributional assumptions on the mRNA expression. Lastly, we illustrate our approach with an application to cancer RNA-seq data sets.
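To make the paired-sample idea concrete, the following is a minimal sketch of a Wilcoxon signed-rank test applied to one pathway, written in plain Python rather than the R tooling mentioned elsewhere on this page. It is an illustration under stated assumptions, not the N-of-1-pathways implementation: it assumes each gene in the pathway contributes one (baseline, case) expression pair from the same patient, it uses the large-sample normal approximation without a continuity correction, and the function name, the simulated expression values, and the pathway size of 40 are all hypothetical.

```python
import math
import random


def wilcoxon_signed_rank(baseline, case):
    """Two-sided Wilcoxon signed-rank test via the normal approximation.

    baseline, case: equal-length sequences of per-gene expression values
    for the two samples. Returns (w_plus, z, p_value). Zero differences
    are dropped; tied absolute differences receive average ranks.
    """
    diffs = [c - b for b, c in zip(baseline, case) if c != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within tied groups.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1

    # Test statistic: sum of ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48  # tie-corrected
    z = (w_plus - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return w_plus, z, p_value


# Hypothetical pathway of 40 genes with simulated log-expression values.
rng = random.Random(0)
pathway_size = 40
baseline = [rng.gauss(8.0, 1.0) for _ in range(pathway_size)]
null_case = [b + rng.gauss(0.0, 0.3) for b in baseline]           # no shift
shifted_case = [b + 0.5 + rng.gauss(0.0, 0.3) for b in baseline]  # shifted

print("null pathway:   ", wilcoxon_signed_rank(baseline, null_case))
print("shifted pathway:", wilcoxon_signed_rank(baseline, shifted_case))
```

Repeating the null draw many times and recording how often the p-value falls below the nominal level gives an empirical size; doing the same with a nonzero shift gives empirical power. The simulation here draws genes independently, so it does not capture correlation between genes within a pathway.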

Abstract for Lay Audience

I am submitting a manuscript and presenting the work entitled "Testing for differentially expressed pathways from within-subject matched pairs of RNA-seq data sets". This work focuses on developing statistical methods for testing gene expression dysregulation within patients. Gene expression from two tissues is measured via RNA sequencing and compiled into data sets for each patient. For example, gene expression is quantified from a lung tumor and also from unaffected lung tissue derived from the same patient. Our methods test for differences (dysregulation) between these samples to understand disease pathogenesis, prognosis, and treatment. We gain statistical power (the ability to make discoveries) and biological interpretability by grouping genes together into functional groups called pathways. Our goal is to provide methods to enable precision (personalized, individualized) medicine based on molecular information. In developing these methods, we perform large-scale, elaborate simulation studies to understand highly variable and complicated RNA-seq data. For example, we model correlation (relationships) between genes, which is largely ignored in pathway analysis. Our work represents the state of the art in single-subject ("N-of-1") genomic studies. This problem has not been explored deeply by statisticians, as working with an "N-of-1" is problematic from a statistical standpoint (traditional statistics operate at the population level, where more patients provide more power and information). We plan to develop specific software tools implemented in R (a widely used statistical computing environment) to enable researchers and clinicians to perform N-of-1 studies.