Load the library
library(CRMetrics)
library(magrittr)
library(dplyr)
There are two ways to initialize a new object of class
CRMetrics, either by providing data.path or cms.
data.path is the path to a directory
containing sample-wise directories with the Cell Ranger count outputs.
Optionally, it can be a vector of multiple paths.
cms is a
(named, optional) list of (sparse, optional) count matrices.
Please note, if
data.path is not provided, some
functionality is lost, e.g. ambient RNA removal.
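As a rough illustration of what a cms input could look like, the sketch below builds a named list of two small sparse count matrices with the Matrix package. The gene names, barcodes, sample names, and counts are invented, and the genes-by-cells orientation is an assumption rather than something stated above.

```r
library(Matrix)

# Hypothetical toy data: 3 genes x 2 cells per sample (all values invented).
# Orientation (genes in rows, cells in columns) is an assumption.
make_toy_cm <- function(counts) {
  Matrix::Matrix(
    matrix(counts, nrow = 3,
           dimnames = list(c("geneA", "geneB", "geneC"),
                           c("AAAC-1", "AAAG-1"))),
    sparse = TRUE
  )
}

# A named list: the names act as sample IDs.
cms <- list(
  sample1 = make_toy_cm(c(0, 2, 0, 1, 0, 0)),
  sample2 = make_toy_cm(c(3, 0, 0, 0, 0, 5))
)

# Confirm that every element is stored as a sparse matrix.
all.sparse <- all(vapply(cms, function(m) methods::is(m, "sparseMatrix"), logical(1)))
all.sparse
```

A list built this way could then be passed as the cms argument; whether unnamed lists or dense matrices behave identically is not covered by this sketch.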
Optionally, metadata can be provided, either as a file or as a
data.frame. For a file, the separator can be set with the parameter
sep.meta (most often, either
, (comma) or
\t (tab) is used). In either format, the columns must be
named and one column must be named
sample and contain
sample names. In combination with
data.path, the sample
names must match the sample directory names. Unmatched directory names
are dropped. If cms is provided, it is recommended to add summary
metrics afterwards:
crm <- CRMetrics$new(cms = cms, n.cores = 10, pal = grDevices::rainbow(8), theme = ggplot2::theme_bw())
crm$addSummaryFromCms()
Please note, some functionality depends on aggregation of sample and
cell IDs using the
sep.cell parameter. The default is
!! which creates cell names in the format of
<sampleID>!!<cellID>. If another separator is
used, this needs to be provided in relevant function calls.
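To make the naming scheme concrete, here is a minimal base R sketch of how names of the form <sampleID>!!<cellID> can be assembled and taken apart again. It does not use CRMetrics itself; the cell barcodes are hypothetical, and the variable names are chosen only for this illustration.

```r
sep.cell <- "!!"  # the default separator described above

# Sample IDs with hypothetical cell barcodes, for illustration only.
sample.ids <- c("SRR15054421", "SRR15054421", "SRR15054422")
cell.ids   <- c("AAACCCAAGAAACTAG-1", "AAACCCAAGAAATTCG-1", "AAACCCAAGAAACTAG-1")

# Combine into <sampleID>!!<cellID>.
cell.names <- paste(sample.ids, cell.ids, sep = sep.cell)

# Split back; fixed = TRUE treats the separator literally, not as a regex.
parts <- strsplit(cell.names, sep.cell, fixed = TRUE)
recovered.samples <- vapply(parts, `[`, character(1), 1)
recovered.cells   <- vapply(parts, `[`, character(1), 2)
```

The round trip only works if the separator occurs neither in the sample IDs nor in the cell IDs, which is one reason an unusual string such as !! is a sensible default.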
Here, the folder with our test data is stored in
/data/ExtData/CRMetrics_testdata/ and we provide metadata
in a comma-separated file.
crm <- CRMetrics$new(data.path = "/data/ExtData/CRMetrics_testdata/",
                     metadata = "/data/ExtData/CRMetrics_testdata/metadata.csv",
                     sep.meta = ",",
                     n.cores = 10,
                     verbose = FALSE,
                     pal = grDevices::rainbow(8),
                     theme = ggplot2::theme_bw())
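Before initializing, it can help to confirm that a metadata file meets the requirements described earlier: a column named sample whose entries correspond to the sample directory names. The following base R sketch does this on a throwaway directory created under tempdir(); the directory layout, file contents, and variable names are hypothetical stand-ins, and the checks are not part of CRMetrics.

```r
# Build a throwaway directory layout and metadata file (hypothetical stand-ins
# for a real data path and metadata.csv).
data.path <- file.path(tempdir(), "crmetrics_sketch")
unlink(data.path, recursive = TRUE)
samples <- c("SRR15054421", "SRR15054422", "SRR15054423")
for (s in samples) dir.create(file.path(data.path, s), recursive = TRUE)

meta.file <- file.path(data.path, "metadata.csv")
write.csv(data.frame(sample = samples[1:2], sex = c("male", "male")),
          meta.file, row.names = FALSE)

# Read the metadata with the same separator that would be passed as sep.meta.
metadata <- read.table(meta.file, header = TRUE, sep = ",", stringsAsFactors = FALSE)

# Check 1: a column named "sample" must exist.
stopifnot("sample" %in% colnames(metadata))

# Check 2: compare sample names with the sample directory names.
dir.names <- basename(list.dirs(data.path, recursive = FALSE))
missing.dirs   <- setdiff(metadata$sample, dir.names)  # in metadata, no directory
unmatched.dirs <- setdiff(dir.names, metadata$sample)  # directory, not in metadata
```

In this toy setup, missing.dirs is empty and unmatched.dirs contains the one directory that has no metadata row.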
We can review our metadata
##        sample age    sex type    RIN
## 1 SRR15054421  43   male RRMS medium
## 2 SRR15054422  57   male RRMS   high
## 3 SRR15054423  52   male SPMS   high
## 4 SRR15054424  66 female SPMS medium
## 5 SRR15054425  50 female SPMS   high
## 6 SRR15054426  58 female RRMS   high
## 7 SRR15054427  56 female SPMS    low
## 8 SRR15054428  61   male SPMS   high
Plot summary statistics
We can investigate which metrics are available and choose the ones we would like to plot
##    no metrics
## 1   1 estimated number of cells
## 2   2 mean reads per cell
## 3   3 median genes per cell
## 4   4 number of reads
## 5   5 valid barcodes
## 6   6 sequencing saturation
## 7   7 q30 bases in barcode
## 8   8 q30 bases in rna read
## 9   9 q30 bases in umi
## 10 10 reads mapped to genome
## 11 11 reads mapped confidently to genome
## 12 12 reads mapped confidently to intergenic regions
## 13 13 reads mapped confidently to intronic regions
## 14 14 reads mapped confidently to exonic regions
## 15 15 reads mapped confidently to transcriptome
## 16 16 reads mapped antisense to gene
## 17 17 fraction reads in cells
## 18 18 total genes detected
## 19 19 median umi counts per cell
Samples per condition
First, we can plot the number of samples per condition. Here, we investigate how the distribution of sex differs between the MS types of the samples, where RRMS is short for relapsing-remitting MS and SPMS is short for secondary progressive MS.
crm$plotSummaryMetrics(comp.group = "sex", metrics = "samples per group", second.comp.group = "type", plot.geom = "bar")
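The counts behind such a plot can be cross-checked in base R. This sketch rebuilds the sex and type columns from the metadata printed above and cross-tabulates them; it is independent of CRMetrics and makes no claim about how plotSummaryMetrics computes its bars internally.

```r
# Metadata columns copied from the metadata table printed earlier.
metadata <- data.frame(
  sample = paste0("SRR1505442", 1:8),
  sex    = c("male", "male", "male", "female", "female", "female", "female", "male"),
  type   = c("RRMS", "RRMS", "SPMS", "SPMS", "SPMS", "RRMS", "SPMS", "SPMS")
)

# Cross-tabulate: rows are sex, columns are MS type.
counts <- table(sex = metadata$sex, type = metadata$type)
counts
```

For these eight samples, the table gives one female and two male RRMS samples, and three female and two male SPMS samples.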
Metrics per sample
In one plot, we can illustrate summary statistics for selected metrics. If no
comparison group is set, it defaults to sample.
metrics.to.plot <- crm$selectMetrics(ids = c(1:4,6,18,19))
crm$plotSummaryMetrics(comp.group = "sample", metrics = metrics.to.plot, plot.geom = "bar")
Metrics per condition
We can do the same, but set the comparison group to
type. This will add statistics to the plots. Additionally,
we can add a second comparison group for coloring.
crm$plotSummaryMetrics(comp.group = "type", metrics = metrics.to.plot, plot.geom = "point", stat.test = "non-parametric", second.comp.group = "sex")