Introduction

We discuss the design of sRACIPE in this article. As our primary aim is to develop a robust framework for simulating gene expression profiles that can capture the experimental observations, we utilized the well established data structures from BioConductor to store the modeling results and parameters. We extend the SummarizedExperiment S4 class to create a new class RacipeSE to store and access the circuit/network, simulated gene expressions, parameters, intial conditions and other meta information. As for any S4 class objects, the slots should not be accessed directly and the getter and accessor functions should be used instead.

Configuration

When a RacipeSE object is constructed, a SummarizedExperiment object is constructed and the default configuration is added as metadata. The default configuration is stored as a configData data object. These configuration parameters can also be passed as argumnets to sracipeSimulate function. The getter/accessor function for configuration is sracipeConfig.

suppressWarnings(suppressPackageStartupMessages(library(sRACIPE)))

racipe <- RacipeSE() # Construct an empty RacipeSE object
racipe  # View the object. Notice the config in metadata. 
## class: RacipeSE 
## dim: 0 0 
## metadata(1): config
## assays(0):
## rownames: NULL
## rowData names(0):
## colnames: NULL
## colData names(0):
sracipeConfig(racipe) # access the config
## $simParams
##         numModels    simulationTime integrateStepSize        printStart 
##              0.00             50.00              0.02             50.00 
##               nIC   outputPrecision       rkTolerance        paramRange 
##              1.00             12.00              0.01            100.00 
##     printInterval 
##             10.00 
## 
## $stochParams
##             nNoise noiseScalingFactor       initialNoise        scaledNoise 
##                0.0                0.5               50.0                0.0 
##          shotNoise 
##                0.0 
## 
## $hyperParams
##      prodRateMin      prodRateMax       degRateMin       degRateMax 
##           1.0000         100.0000           0.1000           1.0000 
##    foldChangeMin    foldChangeMax      hillCoefMin      hillCoefMax 
##           1.0000         100.0000           1.0000           6.0000 
## interactionTypes  thresholdModels         sdFactor 
##           3.0000        5000.0000           0.5658 
## 
## $options
##      anneal scaledNoise       genIC   genParams   integrate 
##       FALSE       FALSE        TRUE        TRUE        TRUE

The config list includes other lists or vectors like simParams, stochParams, hyperParams, options, thresholds etc. The list simParams contains values for parameters like the number of models (numModels), simulation time (simulationTime), step size for simulations (integrateStepSize), when to start recording the gene expressions (printStart), time interval between recordings (printInterval), number of initial conditions (nIC), output precision (outputPrecision), tolerance for adaptive runge kutta method (rkTolerance), parametric variation (paramRange). The list stochParams contains the parameters for stochastic simulations like the number of noise levels to be simulated (nNoise), the ratio of subsequent noise levels (noiseScalingFactor), maximum noise (initialNoise), whether to use same noise for all genes or to scale it as per the median expression of the genes (scaledNoise), ratio of shot noise to additive noise (shotNoise). The list hyperParams contains the parameters like the minimum and maximum production and degration of the genes, fold change, hill coefficient etc. The list options includes logical values like annealing (anneal), scaling of noise (scaledNoise), generation of new initial conditions (genIC), parameters (genParams) and whether to integrate or not (integrate).

Metadata information also includes annotation, nInteraction (number of interactions in the circuit), normalized (whether the data is normalized or not), data analysis lists like pca, umap, cluster assignment of the models etc. These are added to the object in subsequent steps in the pipeline.

Circuit

SummarizedExperiment slot rowData is used to store the circuit topology. It is a square matrix with dimension equal to the number of genes in the circuit. The values of the matrix represent the type of interaction in the gene pair given by row and column. 1 represents activation, 2 inhibition and 0 for no interaction. The sracipeCircuit getter and accessor should be used to access this information. The annotation getter and setter set/get the name. Setting the circuit also adds nInteractions and annotation to the RacipeSE object.

data("demoCircuit")
sracipeCircuit(racipe) <- demoCircuit
## circuit file successfully loaded
annotation(racipe) <- "demoCircuit"
annotation(racipe)
## [1] "demoCircuit"

Parameters and initial conditions

SummarizedExperiment slot colData contains the parameters and initial conditions for each model. Each gene in the circuit has two parameters, namely, its production rate and its degradation rate. Each interaction has three parameters, namely, threshold of activation, the hill coefficient, and the fold change. Each gene has one or more initial gene expression values as specified by nIC. The corresponding getter and setter are sracipeParams and sracipeIC.

Simulated gene expression

As such, sRACIPE can be used to simulate the trajectory for a single model with specific kinetic parameters or a large large number of models with randomized parameters. These sRACIPE models can be considered as separate samples/cells differing from each other in terms of the parameters reflecting the cellular heterogeneity. The SummarizedExperiment slot assays is used for storing simulated gene expressions. The rows of these matrix-like elements correspond to various genes in the circuit and columns correspond to models. The first element of the List is used for unperturbed deterministic simulations. The subsequent elements are used for stochastic simulations at different noise levels and/or knockout simulations. The name for each assay contains the information on the noise level (for stochastic simulations), knocked out gene (knockout simulations), or time (temporal simulations). The assay or assays accessor and getters from the SummarizedExperiment class should be used to access or modify these slots.

Time Series

If one is interested in trajectories of a single model, then sRACIPE provides an argument timeSeries in the sracipeSimulate funcion. If enabled, sRACIPE will generate the trajectory for a single model and the corresponding plotting functions will also show time series plots. In such case, the trajectories are stored in metadata instead of assays. The use of sracipeGetTS function is recommended in this case.

Usually, sRACIPE only records the steady states or the simulated gene expression at the end of the simulation. But the model trajectories can also be recorded and analyzed using sRACIPE using the variables printInterval and printStart. These variables define how often the simulated gene expressions are recorded. Thus, one can record multiple time points during the model simulations. If intermediate time points are also recorded, then the first element will contain the last (steady-state) solutions and the other slots will contain the values at the intermediate time points. For more details on temporal simulations of large number of models, please refer to the corresponding article.