
Katsevich-Lab/spacrt-manuscript

Computationally efficient and statistically accurate conditional independence testing with spaCRT

This repository reproduces the results reported in arXiv version 2 of the following paper:

Z. Niu, J. Ray Choudhury, E. Katsevich. “Computationally efficient and statistically accurate conditional independence testing with spaCRT.” (arXiv)

Get started

First, clone the spacrt-manuscript repository onto your machine.

git clone git@github.com:Katsevich-Lab/spacrt-manuscript.git

There are two routes to the figures: rerun the simulations and real data analysis from scratch, or download our results from Dropbox and run only the plotting code. The two routes are described separately below.

Download results data and create the figures

The results are stored in .rds format. Download the simulation results and the real data results from the Dropbox simulation results repository and the Dropbox real data results repository, respectively. The next two subsections give the commands that reproduce the plots for the real data analysis and for the simulations.

Create the figures for real data analysis

Set data_dir in realdata-code/plotting-code.R to the directory that contains the downloaded results. The value of max_cutoff should be set to 100.

Rscript realdata-code/plotting-code.R $max_cutoff

Create the figures for simulation results

Use the following commands to reproduce the plots for the simulation results. The path_rds variable in each of these R scripts should be set to the path of the downloaded simulation results.

Rscript -e 'source("simulation-code/plotting-code/assemble-plots-NB-disp-5e-2.R")'
Rscript -e 'source("simulation-code/plotting-code/assemble-plots-NB-disp-1.R")'
Rscript -e 'source("simulation-code/plotting-code/assemble-plots-NB-disp-10.R")'
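The three commands above differ only in the dispersion suffix of the script name, so they can be generated in a loop. The following is a minimal sketch; the helper name plot_commands is hypothetical and not part of this repository, and it only prints the commands (a dry run) so they can be inspected before being executed.

```shell
# Hypothetical dry-run helper: print one Rscript command per dispersion
# setting used in the plotting scripts listed above.
plot_commands() {
  for disp in 5e-2 1 10; do
    script="simulation-code/plotting-code/assemble-plots-NB-disp-${disp}.R"
    printf "Rscript -e 'source(\"%s\")'\n" "$script"
  done
}
```

Running plot_commands prints the three commands; piping its output to sh (plot_commands | sh) from the spacrt-manuscript directory would execute them in order.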

If you would like to rerun the simulations from scratch, do not download the results and instead follow the steps in the next section.

Reproduce the results and figures for simulation and real data analysis

First, install the spacrt package from the Katsevich-Lab GitHub organization using the following R code.

library(devtools)
install_github("katsevich-lab/spacrt")

We use a config file to make the code portable across machines. Create a config file called .research_config in your home directory.

cd
touch ~/.research_config

Define the following variable within this file:

  • LOCAL_SPACRT_DATA_DIR: the location of the directory in which to store results.

The contents of the .research_config file should look something like the following.

LOCAL_INTERNAL_DATA_DIR="/Users/ziangniu/Documents/Projects/HPCC/data/projects/"
LOCAL_SPACRT_DATA_DIR=$LOCAL_INTERNAL_DATA_DIR"spacrt/"
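Before launching any jobs, it can help to confirm that the config file resolves to the directory you expect. The sketch below assumes the file uses plain shell assignment syntax, as in the example above, so that it can be sourced directly. The helper name load_research_config is hypothetical and not part of this repository.

```shell
# Hypothetical helper: source a research config file (default:
# ~/.research_config) and print the resolved LOCAL_SPACRT_DATA_DIR.
load_research_config() {
  config_file="${1:-$HOME/.research_config}"
  if [ ! -f "$config_file" ]; then
    echo "config file not found: $config_file" >&2
    return 1
  fi
  # Assumption: the file contains only shell variable assignments.
  . "$config_file"
  if [ -z "${LOCAL_SPACRT_DATA_DIR:-}" ]; then
    echo "LOCAL_SPACRT_DATA_DIR is not set in $config_file" >&2
    return 1
  fi
  echo "$LOCAL_SPACRT_DATA_DIR"
}
```

Calling load_research_config with no argument reads ~/.research_config. It returns a non-zero status if the file is missing or does not define LOCAL_SPACRT_DATA_DIR.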

Navigate to the spacrt-manuscript directory. All scripts below must be executed from this directory. The commands below reproduce the results and create the figures automatically.
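Because the scripts must be run from the spacrt-manuscript directory, a quick check of the working directory can catch a misplaced submission early. This is a minimal sketch. The helper name check_repo_root is hypothetical, and it assumes that run_all_simulation.sh and run_all_realdata.sh both sit at the top level of the repository.

```shell
# Hypothetical helper: succeed only if the given directory (default: the
# current one) contains both submission scripts named in this README.
check_repo_root() {
  dir="${1:-.}"
  for f in run_all_simulation.sh run_all_realdata.sh; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing $f in $dir; run from the spacrt-manuscript directory" >&2
      return 1
    fi
  done
}
```

For example, check_repo_root && qsub run_all_simulation.sh submits the job only when the check passes.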

Run simulation and create figures

Depending on the limits of your cluster, you may need to change the max_gb and max_hours parameters. In run_all_simulation.sh they are set to 16 and 4, respectively.

qsub run_all_simulation.sh

Run real data analysis and create figures

One can use the following command to reproduce the real data analysis results.

qsub run_all_realdata.sh

Table 3 in the paper can be created using realdata-code/sparsity_dataset.R.
