CHPC - Research Computing Support for the University

In addition to deploying and operating high performance computational resources and providing advanced user support and training, CHPC serves as an expert team to broadly support the increasingly diverse research computing needs on campus. These needs include support for big data, big data movement, data analytics, security, virtual machines, Windows science application servers, protected environments for data mining and analysis of protected health information, and advanced networking. Visit our Getting Started page for more information.

Uncertainty Quantification of RNA-Seq Co-expression Networks

By Lance Pflieger and Julio Facelli, Department of Biomedical Informatics

Systems biology uses the complex and copious data originating from the “omics” fields to increase understanding of biology by studying interactions among biological entities. Gene co-expression network analysis is a systems biology technique, derived from graph theory, that uses RNA expression data to infer functionally similar genes or regulatory pathways. It is a computationally intensive process that requires matrix operations on tens of thousands of genes or transcripts. The technique has been useful in drug discovery, in the functional annotation of genes, and in gaining insight into disease pathology.
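As a rough illustration of the matrix operations involved, the sketch below builds an unsigned co-expression adjacency matrix by raising absolute pairwise correlations to a soft-threshold power, the general form used by WGCNA-style analyses. The function names, the toy data, and the choice of `beta=6` are assumptions made for illustration; this is not the authors' pipeline.

```python
import numpy as np

def coexpression_adjacency(expr, beta=6):
    """Build an unsigned co-expression adjacency matrix.

    expr: array of shape (samples, genes) holding expression estimates.
    beta: soft-threshold power (illustrative value, not taken from the article).
    """
    corr = np.corrcoef(expr, rowvar=False)   # genes x genes Pearson correlation
    adj = np.abs(corr) ** beta               # soft-thresholded edge weights
    np.fill_diagonal(adj, 0.0)               # ignore self-connections
    return adj

def connectivity(adj):
    """Sum of edge weights per gene; highly connected genes are hub candidates."""
    return adj.sum(axis=1)

# Toy data (hypothetical): 20 samples, 5 genes, where genes 0 and 1 share a signal.
rng = np.random.default_rng(0)
signal = rng.normal(size=20)
expr = rng.normal(size=(20, 5))
expr[:, 0] = signal + 0.1 * rng.normal(size=20)
expr[:, 1] = signal + 0.1 * rng.normal(size=20)

adj = coexpression_adjacency(expr)
k = connectivity(adj)
```

With tens of thousands of genes, the correlation matrix alone has hundreds of millions of entries, which is why even a single network can be costly to compute.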

To assess the effect of the uncertainty inherent in gene expression data, our group used CHPC resources to characterize variation in gene expression estimates and to simulate a large number of co-expression networks based on this variation. The figure shown is a representation of a network generated using WGCNA and expression data from the disease Spinocerebellar Ataxia Type 2 (SCA2). The colors represent highly connected subnetworks of genes, which are used to correlate similar gene clusters with a phenotypic trait. Our results show that uncertainty has a large effect on downstream results, including subnetwork structure, hub gene identification, and enrichment analysis. For instance, we find that the number of subnetworks correlating with the SCA2 phenotype varies from 1 to 6. While a small gene co-expression network analysis can be performed using only modest computational resources, uncertainty quantification (UQ) using Monte Carlo ensemble methods requires resources several orders of magnitude larger, which are available only at CHPC.
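The Monte Carlo ensemble idea can be sketched as follows: draw many perturbed expression matrices from an assumed uncertainty model, rebuild the network for each draw, and record how often a downstream result (here, the top hub gene) changes. The Gaussian noise model, the function names, the toy data, and the parameter values are all assumptions for illustration; the article does not describe the authors' actual sampling scheme.

```python
import numpy as np

def hub_gene(expr, beta=6):
    """Index of the most connected gene in a soft-thresholded correlation network."""
    corr = np.corrcoef(expr, rowvar=False)
    adj = np.abs(corr) ** beta
    np.fill_diagonal(adj, 0.0)
    return int(np.argmax(adj.sum(axis=1)))

def monte_carlo_hubs(mean_expr, std_expr, n_draws=200, seed=0):
    """Fraction of ensemble members in which each gene is the top hub.

    mean_expr, std_expr: (samples, genes) arrays giving the expression
    estimate and its assumed standard error for every entry.
    """
    rng = np.random.default_rng(seed)
    counts = np.zeros(mean_expr.shape[1], dtype=int)
    for _ in range(n_draws):
        draw = rng.normal(mean_expr, std_expr)  # one perturbed expression matrix
        counts[hub_gene(draw)] += 1             # rebuild network, record its hub
    return counts / n_draws

# Toy inputs (hypothetical): 30 samples, 8 genes, uniform standard error of 0.5.
rng = np.random.default_rng(1)
mean_expr = rng.normal(size=(30, 8))
std_expr = np.full_like(mean_expr, 0.5)

freq = monte_carlo_hubs(mean_expr, std_expr)
```

Each ensemble member repeats the full network construction, so the cost grows linearly with the number of draws on top of the already large per-network cost. Because the draws are independent, they can be distributed across many cluster cores.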

System Status

General Environment

Last update: 2018-05-23 03:01:26

General Nodes
system      cores (in use/total)   % util.
ember       948/948                100%
kingspeak   748/880                85%
notchpeak   416/576                72.22%
lonepeak    1100/1100              100%

Owner/Restricted Nodes
system      cores (in use/total)   % util.
ash         5584/6224              89.72%
notchpeak   256/512                50%
ember       1052/1208              87.09%
kingspeak   6288/6672              94.24%
lonepeak    400/400                100%

Protected Environment

Last update: 2018-05-23 03:00:42

General Nodes
system      cores (in use/total)   % util.
redwood     28/500                 5.6%

Owner/Restricted Nodes
system      cores (in use/total)   % util.
redwood     252/1680               15%

Cluster Utilization

Last Updated: 5/18/18