Date of Award

6-2025

Degree Name

MS in Computer Science

Department/Program

Computer Science

College

College of Engineering

Advisor

Borislav Hristov

Advisor Department

Computer Science

Advisor College

College of Engineering

Abstract

Current computational tools for analyzing chromatin organization focus mainly on intrachromosomal interactions, despite growing evidence that long-range interactions across chromosomes contribute to transcriptional regulation and disease development. This thesis addresses this gap in interchromosomal genome analysis by presenting a robust computational pipeline that identifies a clique (i.e., a subgraph) of highly interacting trans-chromosomal regions anchored at a user-specified seed genomic locus. A weighted interaction network is constructed from an input contact matrix produced by Hi-C, a widely used experimental assay for measuring genome-wide chromatin interactions. We model this contact matrix as a graph and devise three strategies to computationally find biologically important cliques: (1) a greedy heuristic for efficient local exploration, (2) a simulation-based random walk with restarts, and (3) an analytical formulation of the same random walk process. To validate the performance of this pipeline, we focus on TTN, a key muscle gene whose splicing is essential for human heart development. Hi-C data from wild-type and TTN promoter knockout cardiomyocytes are used to compare structural differences in TTN's long-range interactors. Though sparse contacts in the knockout data limit definitive comparison, cliques built from the wild-type matrix reveal loci with strong gene correlation. We further design several background models to statistically assess the significance of these interactions. Our results highlight the effectiveness of network-based methods in uncovering functionally relevant interchromosomal interactions and lay the groundwork for future analyses.
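As a rough illustration of strategies (2) and (3), the sketch below scores every node of a small weighted graph by a random walk with restart from a seed node, once through the standard closed-form linear system and once through fixed-point iteration. This is not the thesis's implementation: the function names, the restart probability of 0.3, and the toy 4x4 matrix are all assumptions made for the example, and the iterative version is a deterministic stand-in for a stochastic simulation.

```python
import numpy as np


def column_normalize(contacts):
    """Convert a non-negative contact matrix into a column-stochastic transition matrix."""
    contacts = np.asarray(contacts, dtype=float)
    col_sums = contacts.sum(axis=0)
    if np.any(col_sums == 0):
        raise ValueError("every node needs at least one contact")
    # Broadcasting divides column j by col_sums[j], so each column sums to 1.
    return contacts / col_sums


def rwr_closed_form(contacts, seed, restart=0.3):
    """Stationary restart-walk scores from solving (I - (1 - c) W) p = c e."""
    W = column_normalize(contacts)
    n = W.shape[0]
    e = np.zeros(n)
    e[seed] = 1.0  # the walker always restarts at the seed node
    return np.linalg.solve(np.eye(n) - (1.0 - restart) * W, restart * e)


def rwr_iterative(contacts, seed, restart=0.3, tol=1e-12, max_iter=10_000):
    """Same scores by iterating p <- (1 - c) W p + c e until the update is tiny."""
    W = column_normalize(contacts)
    n = W.shape[0]
    e = np.zeros(n)
    e[seed] = 1.0
    p = e.copy()
    for _ in range(max_iter):
        nxt = (1.0 - restart) * (W @ p) + restart * e
        if np.abs(nxt - p).sum() < tol:
            return nxt
        p = nxt
    return p


# Hypothetical symmetric contact counts between four loci (illustrative only).
toy_contacts = np.array(
    [
        [0, 5, 1, 0],
        [5, 0, 2, 1],
        [1, 2, 0, 4],
        [0, 1, 4, 0],
    ],
    dtype=float,
)

p_exact = rwr_closed_form(toy_contacts, seed=0)
p_iter = rwr_iterative(toy_contacts, seed=0)
```

Both functions return a probability vector over nodes; ranking nodes by that vector is one plausible way to pick candidates for a seed-anchored subgraph, though the thesis's actual selection rule is not stated in the abstract.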
