This repository provides an R package for multi-view data analysis. Consider two types of high-dimensional measurements taken on the same samples. CBCE (Correlation Bi-Community Extraction) finds a set of features A from the first measurement type and a set of features B from the second measurement type such that the features in A and B are correlated with each other in aggregate.

Formally, the pair (A, B) is called a bimodule. The algorithm that searches for bimodules, the Bimodule Search Procedure (BSP), is introduced in [1]. We have used this method to analyze multi-view data in areas such as genomics and climate science.
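To make "correlated in aggregate" concrete, here is a minimal base-R sketch on simulated data. The helper `aggregate_cor` and the choice of mean squared cross-correlation are illustrative assumptions, not necessarily the statistic that CBCE or BSP uses; see [1] for the formal definition.

```r
# Simulate two views measured on the same n samples (hypothetical data).
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 10), nrow = n)
Y <- matrix(rnorm(n * 15), nrow = n)

# Plant a shared signal z in X[, 1:3] and Y[, 1:4].
# Adding a length-n vector to an n-row matrix adds z to every column.
z <- rnorm(n)
X[, 1:3] <- X[, 1:3] + z
Y[, 1:4] <- Y[, 1:4] + z

# Hypothetical helper (not part of cbce): mean squared sample correlation
# between the X features in A and the Y features in B.
aggregate_cor <- function(X, Y, A, B) {
  mean(cor(X[, A, drop = FALSE], Y[, B, drop = FALSE])^2)
}

planted    <- aggregate_cor(X, Y, 1:3, 1:4)    # the planted pair (A, B)
background <- aggregate_cor(X, Y, 4:10, 5:15)  # unrelated features

print(c(planted = planted, background = background))
```

The planted pair should show a clearly larger aggregate correlation than the unrelated features, which is the kind of structure a bimodule is meant to capture.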

Features of CBCE

  • Rcpp implementation of the iterative testing framework; multicore when run under Microsoft R Open.
  • Multiple backends to calculate p-values. It is also easy to use your own backend.
  • Code tested using testthat.
  • A simple GUI to monitor progress and terminate early.
  • Documented using Roxygen and pkgdown.
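
As a rough illustration of what a p-value computation in such a testing framework might look like, the sketch below runs a generic permutation test in base R. It is not one of the package's backends and does not use the cbce backend interface; the helper `perm_pvalue` and the sum-of-squared-correlations statistic are assumptions made for this example only.

```r
set.seed(2)
n <- 60

# Features in a candidate set A (hypothetical data).
XA <- matrix(rnorm(n * 3), nrow = n)

y_null <- rnorm(n)               # a feature unrelated to A
y_alt  <- rowSums(XA) + rnorm(n) # a feature driven by the features in A

# Hypothetical helper (not part of cbce): permutation p-value for the
# sum of squared correlations between y and the columns of XA.
perm_pvalue <- function(XA, y, n_perm = 999) {
  stat <- function(v) sum(cor(XA, v)^2)
  observed <- stat(y)
  # Permuting y breaks any association with XA, giving a null reference.
  permuted <- replicate(n_perm, stat(sample(y)))
  (1 + sum(permuted >= observed)) / (1 + n_perm)
}

p_alt  <- perm_pvalue(XA, y_alt)
p_null <- perm_pvalue(XA, y_null)

print(c(p_alt = p_alt, p_null = p_null))
```

A permutation test is simple but slow. A faster backend would presumably replace the `replicate` step with an analytical approximation.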

How to install CBCE

You can install the latest version of cbce directly from the GitHub repository using devtools. The snippet below installs devtools first if it is missing.

# Install devtools only if it is not already available
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
devtools::install_github("miheerdew/cbce")

Example usage

library(cbce)

# Sample size
n <- 40
# Dimension of measurement 1
dx <- 20
# Dimension of measurement 2
dy <- 50

# Correlation strength
rho <- 0.5

set.seed(1245)

# Assume the first measurement is Gaussian
X <- matrix(rnorm(dx*n), nrow=n, ncol=dx)

# Features 3:6 in set 2 (Y) are correlated with features 4:5 in set 1 (X)
Y <- matrix(rnorm(dy*n), nrow=n, ncol=dy)
Y[, 3:6] <- sqrt(1-rho)*Y[, 3:6] + sqrt(rho)*rowSums(X[, 4:5])

res <- cbce(X, Y)

# Ideally recovers the indices 4:5 for X and 3:6 for Y.
# If the correlation were stronger, all of these
# indices could be recovered.
res$comms
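
Before interpreting the output, it can help to confirm that the simulated data really contain the planted signal. The following sketch uses base R only, with no cbce calls, and rebuilds X and Y with the same seed and parameters as the example above. The variables `planted` and `background` are introduced here for illustration.

```r
# Recreate the simulated data from the example (same seed and parameters).
n <- 40; dx <- 20; dy <- 50; rho <- 0.5
set.seed(1245)
X <- matrix(rnorm(dx * n), nrow = n, ncol = dx)
Y <- matrix(rnorm(dy * n), nrow = n, ncol = dy)
Y[, 3:6] <- sqrt(1 - rho) * Y[, 3:6] + sqrt(rho) * rowSums(X[, 4:5])

# Sample cross-correlations: rows index X features, columns index Y features.
C <- cor(X, Y)

# Average absolute correlation inside the planted block and outside it.
planted    <- mean(abs(C[4:5, 3:6]))
background <- mean(abs(C[-(4:5), -(3:6)]))

print(round(c(planted = planted, background = background), 2))
```

With only 40 samples the background correlations are not negligible, which is why a formal testing procedure is needed rather than a fixed threshold on the correlation matrix.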

Documentation

More information is available on the software webpage.

Acknowledgement

This project has been funded by NIH grant R01 HG009125-01.

References

[1] Dewaskar, Miheer, John Palowitch, Mark He, Michael I. Love, and Andrew Nobel. “Finding Stable Groups of Cross-Correlated Features in Multi-View data.” arXiv preprint arXiv:2009.05079 (2020).