
Carnegie Mellon University (BS ’15)
MIT, Computational & Systems Biology (PhD ’25)
Welcome to my personal website! I’m a computational biologist focused on applying statistical methods and machine learning to genomics data to uncover biological insights in human development and disease.
Featured Work

Single-cell phylogenies for the study of cancer drug persistence potential
Drug-tolerant persister (DTP) cells survive therapy without classic resistance mutations. We investigated the intrinsic persistence potential that enables some drug-naïve cells to survive first-line treatment. Using a high-resolution CRISPR lineage-tracing assay paired with single-cell RNA-seq in EGFR-mutant (PC9) lung cancer, we found that the persistence trait is heritable across clades of the lineage tree. We derived per-cell persistence-potential scores and linked them to gene-expression programs, revealing a coordinated balance between oxidative phosphorylation, protein synthesis, and proteasomal degradation that associates with persistence potential and mirrors signatures of poor prognosis and post-EGFRi progression in patients. This study presents phylogenetic strategies for studying phenotypic potential, enabling the identification of novel therapeutic targets that can be missed by less granular lineage-tracing assays that study effects post-intervention. (manuscript in prep)

Somatic mutation denoising from full-length single-cell RNA sequencing reveals known cancer-associated mutational signatures and clonal markers
Single-cell genomic technologies enable the identification and molecular characterization of clonal expansion in both normal and disease contexts. Robust profiling of somatic mutations in single cells is essential for characterizing evolution in diseases such as cancer that are riddled with mutations. However, somatic mutation profiling remains an underexplored area of single-cell genomics because of various technical challenges. Full-length scRNA-seq assays, such as Smart-seq2, are better suited to this task than other single-cell genomic assays, owing to more even transcript read coverage and transcription-mediated amplification of the mutated loci. These data still contain several technical artifacts that need to be filtered, such as RNA-editing events and loci recurrently prone to sequencing errors. We employ a number of statistical and unsupervised machine learning (anomaly detection) filters to remove these artifacts. This method enables the de novo detection of somatic mutations in single cells using full-length scRNA-seq. Many of the mutations are validated using expected cancer mutational signatures as well as matched WES data. We also demonstrate that mutations detected in scRNA-seq align with identified CNV states, as well as with known clonal states in HSCs. Furthermore, by implementing a novel single-cell mutation variational autoencoder, we identified de novo clusters of cells with shared mutations. Lastly, we demonstrate that signatures detected in single cells can be used to study the relationship between cell states in recurrent tumors and their exposure to particular mutational processes. (manuscript in prep)

The dynamics of hematopoiesis over the human lifespan
Over a lifetime, hematopoietic stem cells (HSCs) adjust their lineage output to support age-aligned physiology. In model organisms, stereotypic waves of hematopoiesis have been observed corresponding to defined age-biased HSC hallmarks. However, how the properties of hematopoietic stem and progenitor cells change over the human lifespan remains unclear. To address this gap, we profiled individual transcriptome states of human hematopoietic stem and progenitor cells spanning gestation, maturation and aging. Here we define the gene expression networks dictating age-specific differentiation of HSCs and the dynamics of fate decisions and lineage priming throughout life. We additionally identify and functionally validate a fetal-specific HSC state with robust engraftment and multilineage capacity. Furthermore, we observe that classification of acute myeloid leukemia against defined transcriptional age states demonstrates that utilization of early life transcriptional programs associates with poor prognosis. Overall, we provide a disease-relevant framework for heterochronic orientation of stem cell ontogeny along the real time axis of the human lifespan.
https://www.nature.com/articles/s41592-024-02495-0