Comparison is done across columns, i.e., how similar the columns of the two datasets are. For gene expression data, format the data so that genes are in rows and samples are in columns.
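
As a minimal sketch of the expected layout, the base R snippet below builds two small matrices with genes in rows and samples in columns, then correlates their columns. The object names, gene names, and random values are hypothetical, and only cor() from base R is used; this illustrates the data format and is not the package's internal code.

```r
set.seed(1)
genes <- paste0("gene", 1:5)

# Hypothetical reference data: 5 genes (rows) x 4 samples (columns)
refData <- matrix(rnorm(20), nrow = 5,
                  dimnames = list(genes, paste0("ref", 1:4)))

# Hypothetical simulated data: the same genes in rows, 6 models in columns
simData <- matrix(rnorm(30), nrow = 5,
                  dimnames = list(genes, paste0("sim", 1:6)))

# Align rows by gene name so that columns are comparable
simData <- simData[rownames(refData), , drop = FALSE]

# Column-by-column similarity: one row per reference sample,
# one column per simulated sample
simMat <- cor(refData, simData, method = "spearman")
dim(simMat)
```

Passing method = "kendall" to cor() instead gives the Kendall variant mentioned under corMethod.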

Usage

sracipeHeatmapSimilarity(
  dataReference,
  dataSimulation,
  clusterCut = NULL,
  nClusters = 3,
  pValue = 0.05,
  permutedVar,
  permutations = 1000,
  corMethod = "spearman",
  clusterMethod = "ward.D2",
  method = "pvalue",
  buffer = 0.001,
  permutMethod = "simulation",
  returnData = FALSE
)

Arguments

dataReference

Matrix. The reference data matrix, for example, the experimental gene expression values.

dataSimulation

Matrix. The data matrix to be compared.

clusterCut

(optional) Integer vector. Cluster numbers assigned to reference data. If clusterCut is missing, hierarchical clustering using ward.D2 and distance = (1-cor(x, method = "spear"))/2 will be used to cluster the reference data.
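
The default clustering described above can be approximated in base R as follows. This is a sketch under assumptions: the matrix refData and its random values are hypothetical, and whether the function expects exactly the vector that cutree() returns is not confirmed by this page.

```r
set.seed(2)
# Hypothetical reference matrix: 10 genes (rows) x 12 samples (columns)
refData <- matrix(rnorm(120), nrow = 10)
colnames(refData) <- paste0("s", 1:12)

# Distance between samples (columns): (1 - Spearman correlation) / 2,
# which lies in [0, 1]
refCor <- cor(refData, method = "spearman")
refDist <- as.dist((1 - refCor) / 2)

# Ward clustering, then cut the tree into nClusters groups
nClusters <- 3
tree <- hclust(refDist, method = "ward.D2")
clusterCut <- cutree(tree, k = nClusters)

# Number of reference samples per cluster
table(clusterCut)
```

A vector built this way has one integer label per reference column, which is the shape the clusterCut description suggests.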

nClusters

(optional) Integer. The number of clusters in which the reference data should be clustered for comparison. Not needed if clusterCut is provided.

pValue

(optional) Numeric. p-value threshold for considering two gene expression sets as belonging to the same cluster. Ward's method with Spearman correlation is used to determine if a model belongs to a specific cluster.

permutedVar

(optional) Similarity scores computed after permutations.

permutations

(optional) Integer. Default 1000. Number of gene permutations to generate the null distribution.
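
To illustrate what a gene-permutation null distribution looks like, the sketch below shuffles the gene order of one sample and recomputes its correlation with a reference sample. The vectors, the noise level, and the one-sided empirical p-value are assumptions made for illustration; the package's actual permutation scheme may differ.

```r
set.seed(3)
nGenes <- 20
refSample <- rnorm(nGenes)                        # hypothetical reference sample
simSample <- refSample + rnorm(nGenes, sd = 0.5)  # hypothetical simulated sample

permutations <- 1000
observed <- cor(refSample, simSample, method = "spearman")

# Null distribution: shuffle the gene order of the simulated sample
nullCor <- replicate(
  permutations,
  cor(refSample, sample(simSample), method = "spearman")
)

# Empirical one-sided p-value with the usual +1 correction
pEmp <- (sum(nullCor >= observed) + 1) / (permutations + 1)
pEmp
```

More permutations give a finer-grained p-value at the cost of run time, since the smallest attainable value here is 1 / (permutations + 1).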

corMethod

(optional) Correlation method. Default method is "spearman". For single-cell data, use "kendall".

clusterMethod

(optional) Character. Default ward.D2; other options include complete. Clustering method to be used to cluster the experimental data. See hclust for other options.

method

(optional) Character. Method to compare the gene expressions. Default pvalue. Alternatively, variance assigns each simulated sample to the cluster whose samples have minimum variance with it.

buffer

(optional) Numeric. Default 0.001. The fraction of models to be assigned to clusters to which no samples could be assigned. For example, a minimum of 1 ghost sample in reference is assigned to NULL cluster.

permutMethod

(optional) Character. "sample" or "reference". The usage above lists "simulation" as the default.

returnData

(optional) Logical. Default FALSE. Whether to return the sorted and clustered data.

Value

A list containing the Kullback-Leibler (KL) distance of the new cluster distribution from the reference data, and the probability of each cluster in the reference and simulated data.
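
For readers unfamiliar with the KL distance, the sketch below computes it between two cluster probability vectors. The helper klDistance, the example probabilities, and the direction of the divergence (simulated relative to reference) are assumptions for illustration; this page does not state which convention the function uses.

```r
# Hypothetical helper: Kullback-Leibler divergence of p from q,
# sum(p * log(p / q)), skipping clusters where p is zero
klDistance <- function(p, q) {
  stopifnot(length(p) == length(q))
  p <- p / sum(p)  # normalize to probabilities
  q <- q / sum(q)
  keep <- p > 0
  sum(p[keep] * log(p[keep] / q[keep]))
}

# Hypothetical cluster probabilities for three clusters
refProb <- c(0.5, 0.3, 0.2)
simProb <- c(0.4, 0.4, 0.2)

klDistance(simProb, refProb)  # positive when the distributions differ
klDistance(refProb, refProb)  # zero for identical distributions
```

The divergence becomes infinite if q is zero for a cluster where p is positive, which is one reason a function of this kind might keep a small nonzero fraction in every cluster.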

sracipeSimulate, sracipeKnockDown, sracipeOverExp, sracipePlotData, sracipeHeatmapSimilarity