Main use
Load the library
library(CRMetrics)
library(magrittr)
library(dplyr)
There are two ways to initialize a new object of class
CRMetrics
, either by providing data.path
or
cms
. data.path
is the path to a directory
containing sample-wise directories with the Cell Ranger count outputs.
Optionally, it can be a vector of multiple paths. cms
is a
(named, optional) list of (sparse, optional) count matrices.
Please note, if data.path
is not provided, some
functionality is lost, e.g. ambient RNA removal.
Optionally, metadata can be provided, either as a file or as a
data.frame. For a file, the separator can be set with the parameter
sep.meta
(most often, either ,
(comma) or
\t
(tab) is used). In either format, the columns must be
named and one column must be named sample
and contain
sample names. In combination with data.path
, the sample
names must match the sample directory names. Unmatched directory names
are dropped.
If cms
is provided, it is recommended to add summary
metrices afterwards:
crm <- CRMetrics$new(cms = cms,
n.cores = 10,
pal = grDevices::rainbow(8),
theme = ggplot2::theme_bw())
crm$addSummaryFromCms()
Please note, some functionality depends on aggregation of sample and
cell IDs using the sep.cell
parameter. The default is
!!
which creates cell names in the format of
<sampleID>!!<cellID>
. If another separator is
used, this needs to be provided in relevant function calls.
Here, the folder with our test data is stored in
/data/ExtData/CRMetrics_testdata/
and we provide metadata
in a comma-separated file.
crm <- CRMetrics$new(data.path = "/data/ExtData/CRMetrics_testdata/",
metadata = "/data/ExtData/CRMetrics_testdata/metadata.csv",
sep.meta = ",",
n.cores = 10,
verbose = FALSE,
pal = grDevices::rainbow(8),
theme = ggplot2::theme_bw())
We can review our metadata
crm$metadata
## sample age sex type RIN
## 1 SRR15054421 43 male RRMS medium
## 2 SRR15054422 57 male RRMS high
## 3 SRR15054423 52 male SPMS high
## 4 SRR15054424 66 female SPMS medium
## 5 SRR15054425 50 female SPMS high
## 6 SRR15054426 58 female RRMS high
## 7 SRR15054427 56 female SPMS low
## 8 SRR15054428 61 male SPMS high
Plot summary statistics
We can investigate which metrics are available and choose the ones we would like to plot
crm$selectMetrics()
## no metrics
## 1 1 estimated number of cells
## 2 2 mean reads per cell
## 3 3 median genes per cell
## 4 4 number of reads
## 5 5 valid barcodes
## 6 6 sequencing saturation
## 7 7 q30 bases in barcode
## 8 8 q30 bases in rna read
## 9 9 q30 bases in umi
## 10 10 reads mapped to genome
## 11 11 reads mapped confidently to genome
## 12 12 reads mapped confidently to intergenic regions
## 13 13 reads mapped confidently to intronic regions
## 14 14 reads mapped confidently to exonic regions
## 15 15 reads mapped confidently to transcriptome
## 16 16 reads mapped antisense to gene
## 17 17 fraction reads in cells
## 18 18 total genes detected
## 19 19 median umi counts per cell
Samples per condition
First, we can plot the number of samples per condition. Here, we investigate how the distribution of the sex differs between the type of MS of the samples where RRMS is short for relapsing remitting MS, and SPMS is short for secondary progressive MS.
crm$plotSummaryMetrics(comp.group = "sex",
metrics = "samples per group",
second.comp.group = "type",
plot.geom = "bar")
Metrics per sample
In one plot, we can illustrate selected metric summary stats. If no
comparison group is set, it defaults to sample
.
metrics.to.plot <- crm$selectMetrics(ids = c(1:4,6,18,19))
crm$plotSummaryMetrics(comp.group = "sample",
metrics = metrics.to.plot,
plot.geom = "bar")
Metrics per condition
We can do the same, but set the comparison group to
type
. This will add statistics to the plots. Additionally,
we can add a second comparison group for coloring.
crm$plotSummaryMetrics(comp.group = "type",
metrics = metrics.to.plot,
plot.geom = "point",
stat.test = "non-parametric",
second.comp.group = "sex")