Center for Public Health Genomics
University of Virginia
Modeling Gene Regulation with Public Genomic Data: From Integration to Prediction
Epigenetic regulation of gene expression plays a critical role in many biological processes, including cancer formation and progression. Predicting the enhancers and transcription factors that regulate differentially expressed genes is an essential problem in functional genomics research. In this talk I will present a series of computational methods for modeling gene regulation using massive, publicly available human and mouse data. We develop MARGE, an approach based on logistic regression and semi-supervised learning that predicts the genomic cis-regulatory profiles regulating a given gene set by leveraging a compendium of public H3K27ac ChIP-seq datasets. We develop BART to predict the transcription factors associated with MARGE-predicted cis-regulatory profiles using thousands of public transcription factor ChIP-seq datasets. Applying these approaches to The Cancer Genome Atlas (TCGA) molecular profiling data, we reconstruct the functional enhancer profiles and predict active transcription factor targets for each TCGA cancer type. Our work demonstrates the power of utilizing public data for computational studies of epigenomics.
One Capitol Square