Single-cell sequencing analysis: learning the scater package (v1.18.6) (2/2)

Author: 北欧森林 | Published: 2021-03-18 07:20

scater package home page: Single-Cell Analysis Toolkit for Gene Expression Data in R.
The current version is 1.18.6 (Revised: February 4, 2020).

4 Dimensionality reduction

4.1 Principal components analysis (PCA)

example_sce <- runPCA(example_sce)
str(reducedDim(example_sce, "PCA"))

##  num [1:3005, 1:50] -15.4 -15 -17.2 -16.9 -18.4 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : chr [1:3005] "1772071015_C02" "1772071017_G12" "1772071017_A05" "1772071014_B06" ...
##   ..$ : chr [1:50] "PC1" "PC2" "PC3" "PC4" ...
##  - attr(*, "varExplained")= num [1:50] 478 112.8 51.1 47 33.2 ...
##  - attr(*, "percentVar")= num [1:50] 39.72 9.38 4.25 3.9 2.76 ...
##  - attr(*, "rotation")= num [1:500, 1:50] 0.1471 0.1146 0.1084 0.0958 0.0953 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:500] "Plp1" "Trf" "Mal" "Apod" ...
##   .. ..$ : chr [1:50] "PC1" "PC2" "PC3" "PC4" ...

By default, runPCA() uses the top 500 genes with the highest variances to compute the first PCs. This can be tuned by specifying subset_row to pass in an explicit set of genes of interest, and by using ncomponents to determine the number of components to compute. The name argument can also be used to change the name of the result in the reducedDims slot.

example_sce <- runPCA(example_sce, name="PCA2",
    subset_row=rownames(example_sce)[1:1000],
    ncomponents=25)
str(reducedDim(example_sce, "PCA2"))

##  num [1:3005, 1:25] -20 -21 -23 -23.7 -21.5 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : chr [1:3005] "1772071015_C02" "1772071017_G12" "1772071017_A05" "1772071014_B06" ...
##   ..$ : chr [1:25] "PC1" "PC2" "PC3" "PC4" ...
##  - attr(*, "varExplained")= num [1:25] 153 35 23.5 11.6 10.8 ...
##  - attr(*, "percentVar")= num [1:25] 22.3 5.11 3.42 1.69 1.58 ...
##  - attr(*, "rotation")= num [1:1000, 1:25] -2.24e-04 -8.52e-05 -2.43e-02 -5.92e-04 -6.35e-03 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:1000] "Tspan12" "Tshz1" "Fnbp1l" "Adamts15" ...
##   .. ..$ : chr [1:25] "PC1" "PC2" "PC3" "PC4" ...
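
To make the default behaviour described above more concrete, here is a minimal base-R sketch of the idea: pick the most variable genes, then run a PCA on the cells. It is not scater's actual implementation (runPCA() delegates to other packages internally), and the simulated matrix and the variable names logexpr, ntop and ncomponents are assumptions made up for this illustration.

```r
set.seed(42)

# Simulated log-expression matrix (assumption): 2000 genes x 100 cells.
logexpr <- matrix(rnorm(2000 * 100), nrow = 2000,
    dimnames = list(paste0("gene", 1:2000), paste0("cell", 1:100)))
# Inflate the variance of the first 50 genes so they rank among the top genes.
logexpr[1:50, ] <- logexpr[1:50, ] * 5

ntop <- 500        # number of high-variance genes to keep
ncomponents <- 50  # number of PCs to retain

# Per-gene variance across cells, then the indices of the ntop largest.
gene_var <- apply(logexpr, 1, var)
top <- order(gene_var, decreasing = TRUE)[seq_len(ntop)]

# prcomp() expects observations (cells) in rows, hence the transpose.
# Genes are centred but not scaled in this sketch.
pca <- prcomp(t(logexpr[top, ]), center = TRUE, scale. = FALSE,
    rank. = ncomponents)

pcs <- pca$x                     # cells x PCs coordinate matrix
total_var <- sum(gene_var[top])  # total variance of the selected genes
percent_var <- pca$sdev[seq_len(ncomponents)]^2 / total_var * 100

dim(pcs)
head(round(percent_var, 2))
```

The pcs matrix plays the role of reducedDim(example_sce, "PCA"), and percent_var corresponds to the "percentVar" attribute shown in the str() output above.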

4.2 Other dimensionality reduction methods

t-SNE is run with runTSNE(). We strongly recommend generating plots with different random seeds and perplexity values, to ensure that any conclusions are robust to different visualizations.

# Perplexity of 10 just chosen here arbitrarily.
set.seed(1000)
example_sce <- runTSNE(example_sce, perplexity=10)
head(reducedDim(example_sce, "TSNE"))

##                     [,1]       [,2]
## 1772071015_C02 -51.11740 -11.311505
## 1772071017_G12 -54.09988 -10.315597
## 1772071017_A05 -50.74363 -11.246481
## 1772071014_B06 -53.65486 -10.098620
## 1772067065_H06 -53.25346  -9.751226
## 1772071017_E02 -54.22199  -9.824003
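
One possible way to follow the advice about seeds and perplexity is to store several t-SNE runs side by side under different names. The sketch below assumes that scater and scuttle are installed; it uses simulated data from scuttle::mockSCE() in place of example_sce so that it runs on its own, and the grid of perplexities and seeds is arbitrary.

```r
library(scater)

# Simulated data standing in for example_sce (assumption for this sketch).
sce <- scuttle::mockSCE(ncells = 200, ngenes = 1000)
sce <- scuttle::logNormCounts(sce)

# Arbitrary grid of settings to compare.
perplexities <- c(5, 20, 50)
seeds <- c(1, 2)

for (p in perplexities) {
    for (s in seeds) {
        set.seed(s)
        # The name argument keeps each run in its own reducedDims entry.
        sce <- runTSNE(sce, perplexity = p,
            name = sprintf("TSNE_p%d_s%d", p, s))
    }
}

reducedDimNames(sce)
```

Each stored result can then be inspected with, for example, plotReducedDim(sce, dimred = "TSNE_p20_s1"). Structure that appears across all runs is more trustworthy than structure that appears in only one.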

A more common pattern involves using the pre-existing PCA results as input into the t-SNE algorithm.

set.seed(1000)
example_sce <- runTSNE(example_sce, perplexity=50, 
    dimred="PCA", n_dimred=10)
head(reducedDim(example_sce, "TSNE"))

UMAP is computed in the same way with runUMAP().

example_sce <- runUMAP(example_sce)
head(reducedDim(example_sce, "UMAP"))

##                     [,1]      [,2]
## 1772071015_C02 -12.04324 -2.072196
## 1772071017_G12 -12.11942 -2.135925
## 1772071017_A05 -11.93046 -2.005134
## 1772071014_B06 -12.08696 -2.123133
## 1772067065_H06 -12.11039 -2.148298
## 1772071017_E02 -12.10755 -2.138869

4.3 Visualizing reduced dimensions

plotReducedDim(example_sce, dimred = "PCA", colour_by = "level1class")

plotTSNE(example_sce, colour_by = "Snap25")

plotPCA(example_sce, colour_by="Mog")

example_sce <- runPCA(example_sce, ncomponents=20)
plotPCA(example_sce, ncomponents = 4, colour_by = "level1class")

Note: the resulting figure differs slightly from the one in the official tutorial.

5 Utilities for custom visualization

ggcells(example_sce, mapping=aes(x=level1class, y=Snap25)) + 
    geom_boxplot() +
    facet_wrap(~tissue)

ggcells(example_sce, mapping=aes(x=TSNE.1, y=TSNE.2, colour=Snap25)) +
    geom_point() +
    stat_density_2d() +
    facet_wrap(~tissue) +
    scale_colour_distiller(direction=1)

ggcells(example_sce, mapping=aes(x=sizeFactor, y=Actb)) +
    geom_point() +
    geom_smooth()
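
The ggcells() calls above refer to column metadata (level1class, sizeFactor), gene expression (Snap25, Actb) and reduced dimensions (TSNE.1, TSNE.2) in a single aes() mapping. A rough way to see what such a per-cell table looks like is to build it explicitly with scuttle::makePerCellDF() and pass it to ggplot2. This is only a sketch: it assumes scuttle is installed, uses simulated data from scuttle::mockSCE() rather than example_sce, and picks the first gene arbitrarily.

```r
library(scater)  # attaches ggplot2 as well

# Simulated data standing in for example_sce (assumption for this sketch).
sce <- scuttle::mockSCE(ncells = 100, ngenes = 500)
sce <- scuttle::logNormCounts(sce)
sce <- runPCA(sce, ncomponents = 5)

# Arbitrary gene chosen for illustration.
gene <- rownames(sce)[1]

# One row per cell: colData fields, the requested gene's log-expression,
# and the reduced dimensions as columns named PCA.1, PCA.2, ...
df <- scuttle::makePerCellDF(sce, features = gene)
head(colnames(df))

# Plain ggplot2 on the resulting data.frame.
p <- ggplot(df, aes(x = PCA.1, y = PCA.2, colour = .data[[gene]])) +
    geom_point()
p
```

Working from an explicit data.frame can be handy when the table needs to be filtered or joined with other data before plotting.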

# Tabulate the feature types stored in the featureType column of rowData.
table(rowData(example_sce)$featureType)

# endogenous       mito 
#     19972         34

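# The cell names start with a digit; make.names() prefixes them with "X"
# so that they can be used as column names in aes().
colnames(example_sce) <- make.names(colnames(example_sce))
# ggfeatures() works per feature (gene), so a single cell can be mapped to an axis.
ggfeatures(example_sce, mapping=aes(x=featureType, y=X1772062111_E06)) + 
    geom_violin()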

example_sce

# class: SingleCellExperiment 
# dim: 20006 3005 
# metadata(0):
# assays(2): counts logcounts
# rownames(20006): Tspan12 Tshz1 ... mt-Rnr1 mt-Nd4l
# rowData names(1): featureType
# colnames(3005): X1772071015_C02 X1772071017_G12 ... X1772066098_A12 X1772058148_F03
# colData names(27): tissue group ... total sizeFactor
# reducedDimNames(4): PCA PCA2 TSNE UMAP
# altExpNames(2): ERCC repeat

altExpNames(example_sce)
# [1] "ERCC"   "repeat"