SCSR_UsersGuide

Shao, Xin edited this page Jul 14, 2020 · 4 revisions

Key function 1: cell_signaling()

Description

Computes “autocrine” or “paracrine” interactions between cell clusters.

Usage

cell_signaling(data,
               genes,
               cluster,
               int.type = c("paracrine",  "autocrine"),
               c.names = NULL,
               s.score = 0.5,
               logFC = log2(1.5),
               species = c("homo sapiens", "mus musculus"),
               tol = 0,
               write = TRUE,
               verbose = TRUE)

Arguments

data a data frame of n rows (genes) and m columns (cells) of read or UMI counts (note: rownames(data) = genes)

genes a character vector of HUGO official gene symbols of length n

cluster a numeric vector of length m

int.type “autocrine” or “paracrine”

c.names (optional) cluster names

s.score a number between 0 and 1, the LRscore threshold

logFC a number, the log fold-change threshold for differentially expressed genes

species “homo sapiens” or “mus musculus”

tol a tolerance parameter for balancing between “autocrine” and “paracrine” interactions

write a logical

verbose a logical
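
The argument shapes listed above can be checked on a toy data set before running the function. The following is a minimal sketch in base R: the gene symbols are real HUGO symbols, but the counts and cluster assignments are random and purely illustrative. The final call is left as a comment because it requires the package to be loaded.

```r
# Toy inputs matching the documented shapes (all values are illustrative).
set.seed(1)
genes <- c("TGFB1", "TGFBR1", "EGF", "EGFR")   # n = 4 gene symbols
m <- 6                                         # m = 6 cells
counts <- matrix(rpois(length(genes) * m, lambda = 5),
                 nrow = length(genes), ncol = m)
data <- data.frame(counts)
rownames(data) <- genes                        # rownames(data) = genes
colnames(data) <- paste0("cell", seq_len(m))
cluster <- c(1, 1, 1, 2, 2, 2)                 # one cluster index per cell

# Sanity checks on the shapes expected by cell_signaling()
stopifnot(nrow(data) == length(genes),
          ncol(data) == length(cluster),
          identical(rownames(data), genes))

# With the package loaded, the call would then look like:
# inter <- cell_signaling(data = data, genes = genes, cluster = cluster,
#                         int.type = "paracrine", write = FALSE)
```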

Details

int.type must be either “paracrine” or “autocrine”. The “paracrine” option looks for ligands expressed in cluster A and their associated receptors, according to LRdb, that are expressed in any cluster other than A. These interactions are labelled “paracrine”. Interactions in which both the ligand and the receptor are differentially expressed in their respective cell clusters, according to the edgeR analysis performed by the cluster_analysis() function, are labelled “specific”.

The “autocrine” option searches for ligands expressed in cell cluster A and their associated receptors also expressed in A. These interactions are labelled “autocrine”. Additionally, it searches for those associated receptors in the other cell clusters (not A) to cover the part of the signaling that is “autocrine” and “paracrine” simultaneously. These interactions are labelled “autocrine/paracrine”.
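
The labelling rules of the two options can be illustrated with a small base R sketch. It is one reading of the description above, not the package's implementation: the function label_interactions() and the expression calls are hypothetical, a single ligand-receptor pair is considered, and the “specific” label is ignored.

```r
# Hypothetical expression calls for one ligand-receptor pair:
# TRUE if the gene is considered expressed in the cluster.
ligand_expr   <- c(A = TRUE, B = FALSE, C = TRUE)
receptor_expr <- c(A = TRUE, B = TRUE,  C = FALSE)

# Helper building rows of the result table
mk <- function(from, to, type) {
  data.frame(from = from, to = to, type = type, stringsAsFactors = FALSE)
}

label_interactions <- function(ligand_expr, receptor_expr, int.type) {
  out <- mk(character(0), character(0), character(0))
  for (a in names(ligand_expr)[ligand_expr]) {      # clusters expressing the ligand
    others <- setdiff(names(receptor_expr)[receptor_expr], a)
    if (int.type == "paracrine") {
      # ligand in a, receptor absent from a but present in other clusters
      if (!receptor_expr[[a]] && length(others) > 0) {
        out <- rbind(out, mk(a, others, "paracrine"))
      }
    } else if (receptor_expr[[a]]) {
      # ligand and receptor both expressed in a
      out <- rbind(out, mk(a, a, "autocrine"))
      # the same receptor found in other clusters as well
      if (length(others) > 0) {
        out <- rbind(out, mk(a, others, "autocrine/paracrine"))
      }
    }
  }
  out
}

para <- label_interactions(ligand_expr, receptor_expr, "paracrine")
auto <- label_interactions(ligand_expr, receptor_expr, "autocrine")
print(para)   # C -> A and C -> B, both "paracrine"
print(auto)   # A -> A "autocrine", A -> B "autocrine/paracrine"
```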

The tol argument lets the user tolerate a fraction of the cells in cluster A expressing the receptors when int.type="paracrine", that is, to call interactions paracrine when they are predominantly, though not exclusively, paracrine. Conversely, when int.type="autocrine", it lets the user reject interactions involving receptors that are expressed by only a small fraction of cluster A cells. By construction, the union of these two options covers all possible interactions, and increasing tol moves interactions from “autocrine” to “paracrine”.
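
The effect of tol can be pictured with a toy rule. The sketch below assumes that the decision depends on the fraction of cluster A cells expressing the receptor and that the boundary is inclusive (fraction <= tol). Both are assumptions made for illustration, and classify_with_tol() is a hypothetical helper, not a package function.

```r
# frac_in_A: hypothetical fraction of cluster A cells expressing the receptor
classify_with_tol <- function(frac_in_A, tol = 0) {
  ifelse(frac_in_A <= tol, "paracrine", "autocrine")
}

frac_in_A <- c(0, 0.05, 0.30)

strict  <- classify_with_tol(frac_in_A, tol = 0)
relaxed <- classify_with_tol(frac_in_A, tol = 0.1)
print(strict)    # "paracrine" "autocrine" "autocrine"
print(relaxed)   # "paracrine" "paracrine" "autocrine"
```

Raising tol from 0 to 0.1 moves the second interaction from “autocrine” to “paracrine”, which is the behaviour described above.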

If the user does not set c.names, the clusters will be named from 1 to the maximum number of clusters (cluster 1, cluster 2, …). The user can exploit the c.names vector in the list returned by the cell_classifier() function for this purpose. The user can also provide her own cluster names.
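
As a rough illustration of the default naming, the base R sketch below reproduces the “cluster 1, cluster 2, …” pattern for a toy cluster vector and checks that user-supplied names have one entry per cluster. The custom names are made up, and the exact string format used by the package is assumed from the text above.

```r
cluster <- c(1, 1, 2, 3, 3, 2)   # toy cluster assignments
c.names <- NULL                  # not set by the user

# Default names: one per cluster, numbered from 1 to the largest index
if (is.null(c.names)) {
  c.names <- paste("cluster", seq_len(max(cluster)))
}
print(c.names)   # "cluster 1" "cluster 2" "cluster 3"

# User-supplied names (hypothetical) must cover every cluster
my.names <- c("T-cells", "B-cells", "Macrophages")
stopifnot(length(my.names) == max(cluster))
```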

s.score is the threshold on the LRscore. The value must lie in the [0, 1] interval; the default is 0.5 to ensure confident ligand-receptor pair identifications (see our publication). Lower values increase the number of putative interactions at the cost of more false positives. Higher values do the opposite.
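
The trade-off can be seen on a toy table. In the sketch below the LRscore values are invented, filter_by_score() is a hypothetical helper, and the inclusive comparison (>=) is an assumption.

```r
# Toy ligand-receptor table with invented LRscore values
pairs <- data.frame(ligand   = c("TGFB1", "EGF", "CXCL12"),
                    receptor = c("TGFBR1", "EGFR", "CXCR4"),
                    LRscore  = c(0.82, 0.45, 0.61),
                    stringsAsFactors = FALSE)

# Keep the pairs whose score reaches the threshold
filter_by_score <- function(pairs, s.score = 0.5) {
  stopifnot(s.score >= 0, s.score <= 1)
  pairs[pairs$LRscore >= s.score, ]
}

n_default <- nrow(filter_by_score(pairs, s.score = 0.5))
n_lenient <- nrow(filter_by_score(pairs, s.score = 0.4))
print(n_default)   # 2
print(n_lenient)   # 3
```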

logFC is a threshold applied to the log fold-change (logFC) computed for each gene during the differential gene expression analysis. Its default value is log2(1.5). It further selects the differentially expressed genes (>logFC) after the p-value threshold imposed in the function cluster_analysis() below.
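
To make the threshold concrete, the sketch below applies log2(1.5), about 0.585, to a toy differential expression table. The column names and values are hypothetical and do not reflect the actual output of cluster_analysis().

```r
# Toy differential expression results (invented values)
de <- data.frame(gene  = c("TGFB1", "EGFR", "CXCR4"),
                 logFC = c(1.2, 0.3, 0.7),
                 stringsAsFactors = FALSE)

threshold <- log2(1.5)                 # about 0.585
selected  <- de[de$logFC > threshold, ]
print(selected$gene)   # "TGFB1" "CXCR4"
```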

species must be either “homo sapiens” or “mus musculus”; the default is “homo sapiens”. For mouse data, the function converts mouse genes into their human orthologs (according to Ensembl) so that LRdb can be used, and the output genes are finally converted back to mouse.
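
The round trip between mouse and human symbols can be sketched with a small lookup table. The three-gene mapping below is a hand-written stand-in for the Ensembl ortholog table and is only meant to show the principle, not how the function performs the conversion internally.

```r
# Hand-written ortholog lookup (stand-in for the Ensembl mapping)
mouse_to_human <- c(Tgfb1 = "TGFB1", Tgfbr1 = "TGFBR1", Egfr = "EGFR")
human_to_mouse <- setNames(names(mouse_to_human), mouse_to_human)

mouse_genes <- c("Egfr", "Tgfb1")

# Mouse -> human, so that a human ligand-receptor database can be queried
human_genes <- unname(mouse_to_human[mouse_genes])
print(human_genes)   # "EGFR" "TGFB1"

# Human -> mouse, to report results with the original symbols
back <- unname(human_to_mouse[human_genes])
print(back)          # "Egfr" "Tgfb1"
```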

If write is TRUE, then the function writes a text file that reports the interactions in the cell-signaling folder. This file is a 4-column table: ligands, receptors, interaction types (“paracrine”, “autocrine”, “autocrine/paracrine” and “specific”), and the associated LRscore.

Remarks: this function can be used with any data table that has matching genes and cluster vectors, meaning that advanced users can perform their own data normalization and cell clustering upfront. If the function cluster_analysis() has not been executed, this function still works, but “specific” interactions are not annotated as such.

Value

The function returns “autocrine” or “paracrine” interaction lists.

Key function 2: visualize()

Description

Creates chord diagrams from the interaction tables.

Usage

visualize(inter,
          show.in = NULL,
          write.in = NULL,
          write.out = FALSE,
          method = "default",
          limit = 30)

Arguments

inter a list of data frames returned by the cell_signaling() function

show.in a vector indicating which elements of inter must be shown

write.in a vector indicating which elements of inter must be written

write.out a logical

method a string (usually relative to the experiment)

limit a value between 1 and number of interactions

Details

show.in gives the elements of inter to be displayed in the plot window.

write.in gives the elements of inter to be written as pdf files in the images folder.

If write.out is TRUE, then the function writes a pdf file with a summary of all the interactions of inter as a chord diagram.

limit is the maximum number of interactions displayed on one chord diagram. Raising this limit over 30 may decrease the visibility.
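
A table with more rows than limit has to be truncated before plotting. The sketch below keeps the highest-scoring rows of a toy table. Ranking by LRscore is an assumption made for illustration; the text above does not state which interactions are kept.

```r
# Toy interaction table with 40 invented ligand-receptor pairs
set.seed(42)
inter_ab <- data.frame(ligand   = paste0("L", 1:40),
                       receptor = paste0("R", 1:40),
                       LRscore  = runif(40, min = 0.5, max = 1),
                       stringsAsFactors = FALSE)

limit <- 30

# Sort by decreasing score and keep at most `limit` rows
ranked <- inter_ab[order(inter_ab$LRscore, decreasing = TRUE), ]
top    <- head(ranked, limit)
print(nrow(top))   # 30
```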

Value

The function returns images in the plot window of RStudio and images in pdf format in the images folder.