Robust gene coexpression networks using signed distance correlation.
Pardo-Diaz J., Bozhilova LV., Beguerisse-Díaz M., Poole PS., Deane CM., Reinert G.
MOTIVATION: Even within well-studied organisms, many genes lack useful functional annotations. One way to generate such functional information is to infer biological relationships between genes or proteins, using a network of gene coexpression data that includes functional annotations. However, the lack of trustworthy functional annotations can impede the validation of such networks. Hence, there is a need for a principled method to construct gene coexpression networks that capture biological information and are structurally stable even in the absence of functional information.
RESULTS: We introduce the concept of signed distance correlation as a measure of dependency between two variables, and apply it to generate gene coexpression networks. Distance correlation offers a more intuitive approach to network construction than commonly used methods such as Pearson correlation and mutual information. We propose a framework to generate self-consistent networks using signed distance correlation purely from gene expression data, with no additional information. We analyse data from three different organisms to illustrate that networks generated with our method are more stable and capture more biological information than networks obtained from Pearson correlation or mutual information.
SUPPLEMENTARY INFORMATION: Supplementary Information and code are available at Bioinformatics online and at https://github.com/javier-pardodiaz/sdcorGCN.
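To make the idea of a signed dependency measure concrete, the following Python sketch computes the standard sample distance correlation between two expression profiles and attaches a sign to it. The abstract does not say how the sign is chosen; the sketch assumes it is taken from the Pearson correlation, and the function names are illustrative rather than taken from the authors' sdcorGCN code.

```python
import numpy as np


def _double_centered_distances(x):
    """Pairwise absolute distances of a 1-D sample, double-centered."""
    x = np.asarray(x, dtype=float)
    d = np.abs(x[:, None] - x[None, :])
    row_means = d.mean(axis=1, keepdims=True)
    col_means = d.mean(axis=0, keepdims=True)
    return d - row_means - col_means + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation between two 1-D samples, in [0, 1]."""
    a = _double_centered_distances(x)
    b = _double_centered_distances(y)
    dcov2 = (a * b).mean()      # squared sample distance covariance
    dvar2_x = (a * a).mean()    # squared sample distance variance of x
    dvar2_y = (b * b).mean()    # squared sample distance variance of y
    denom = np.sqrt(dvar2_x * dvar2_y)
    if denom == 0:
        # At least one sample is constant, so no dependence can be measured.
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def signed_distance_correlation(x, y):
    """Distance correlation with a sign attached.

    Assumption (not stated in the abstract): the sign is that of the
    Pearson correlation between x and y.
    """
    dcor = distance_correlation(x, y)
    if dcor == 0.0:
        return 0.0
    pearson = np.corrcoef(x, y)[0, 1]
    return float(np.sign(pearson) * dcor)


if __name__ == "__main__":
    gene_a = np.array([1.0, 2.0, 4.0, 7.0, 11.0])
    # A perfectly anticorrelated profile gives a negative value.
    print(signed_distance_correlation(gene_a, -gene_a))
    # A nonlinear (quadratic) relationship still gives a nonzero distance correlation.
    centered = gene_a - gene_a.mean()
    print(distance_correlation(centered, centered ** 2))
```

Under this assumption, the magnitude reflects linear and nonlinear dependence alike, while the sign separates positively from negatively coexpressed gene pairs. A coexpression network could then be obtained by keeping only gene pairs whose value exceeds a chosen threshold.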