10x Genomics PBMC (4): Cluster Analysis

Author: 程凉皮儿 | Published: 2020-06-07 22:25

Cluster Analysis

clp

07 June, 2020

Finding differentially expressed features (cluster biomarkers)

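Seurat can find the markers that define clusters through differential expression. By default, it identifies positive and negative markers of a single cluster (specified in ident.1) compared to all other cells. FindAllMarkers automates this process for all clusters, but you can also test groups of clusters against each other, or against all cells.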

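The min.pct argument requires a feature to be detected in at least a minimum fraction of cells in either of the two groups being compared. The logfc.threshold argument (called thresh.test in older tutorial text) requires a feature to differ, on average, by at least a given amount between the two groups. You can set both to 0, but expect a dramatic increase in run time, because a large number of features that are unlikely to be highly discriminatory (biologically meaningful) will then be tested. As another option to speed up these computations, you can set max.cells.per.ident, which downsamples each identity class to at most the given number of cells. This generally costs some statistical power, but the speed gain can be significant, and the most strongly differentially expressed features will likely still rise to the top.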
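The effect of these three arguments can be sketched on a small object. The example below is only an illustration: it assumes the toy `pbmc_small` dataset bundled with Seurat rather than the article's `pbmc` object, and the cluster label `2` and the cap of 20 cells are arbitrary choices.

```r
library(Seurat)

# Assumption: use the small demo object shipped with Seurat, not the article's pbmc.
data("pbmc_small")

# Pre-filtered test: only features detected in >= 25% of cells in either group
# and passing the average log fold-change threshold are tested.
markers_filtered <- FindMarkers(pbmc_small, ident.1 = 2,
                                min.pct = 0.25, logfc.threshold = 0.25)

# No pre-filtering: every feature is tested, which is much slower on real data.
markers_unfiltered <- FindMarkers(pbmc_small, ident.1 = 2,
                                  min.pct = 0, logfc.threshold = 0)

# Downsample each identity class to at most 20 cells to trade power for speed.
markers_downsampled <- FindMarkers(pbmc_small, ident.1 = 2,
                                   max.cells.per.ident = 20)

# Compare how many features each setting tested.
c(filtered = nrow(markers_filtered), unfiltered = nrow(markers_unfiltered))
```

On a dataset of realistic size the unfiltered call is the one that dominates run time, so it is usually worth keeping the thresholds unless weakly expressed markers are of specific interest.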

Load the previously saved data and the required R packages

library(Seurat)
library(ggplot2)
library(data.table)

if (!('pbmc' %in% ls())) {
    load("filtered_gene_bc_matrices/hg19/02_pbmc3k_cluster.rd")
}
#check which assay set to default!
DefaultAssay(pbmc)
#> [1] "SCT"

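Let us compare each cluster with all remaining cells to find the genes that distinguish it, that is, genes expressed at a higher or lower level than in the other clusters.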

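# Find markers for every cluster compared to all remaining cells.
# only.pos = TRUE reports positive markers only (set it to FALSE to also get negative ones).
# The output DGE table is converted to a data.table for easier handling.
pbmc.markers <- as.data.table(FindAllMarkers(pbmc, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25))
pbmc.markers <- pbmc.markers[!is.na(gene),]

# Collect the top 10 genes for each cluster, ranked by average log fold change.
# Note: the column is avg_logFC in Seurat v3; Seurat v4 and later name it avg_log2FC.
top10 <- pbmc.markers[,.SD[order(-avg_logFC)][1:10],by='cluster']
# [1:10] pads clusters that have fewer than 10 markers with NA rows, so drop them.
top10 <- top10[!is.na(gene),]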

View(top10)
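The per-cluster "top N" selection used for top10 is a plain data.table idiom and can be tried on toy data. The sketch below uses an invented table (the gene names and values are made up for illustration) to show why the `!is.na(gene)` filter is needed after `[1:N]`, and that `head()` is an alternative that never produces padding rows.

```r
library(data.table)

# Hypothetical marker table: cluster "0" has three genes, cluster "1" only two.
toy <- data.table(
  cluster   = c("0", "0", "0", "1", "1"),
  gene      = c("A", "B", "C", "D", "E"),
  avg_logFC = c(1.2, 2.5, 0.7, 0.9, 1.8)
)

# Same pattern as in the article: indexing 1:3 pads short groups with NA rows.
top3_idx <- toy[, .SD[order(-avg_logFC)][1:3], by = "cluster"]

# head() returns at most 3 rows per group, so no NA rows appear.
top3_head <- toy[, head(.SD[order(-avg_logFC)], 3), by = "cluster"]

print(top3_idx)   # 6 rows, the last one with gene == NA
print(top3_head)  # 5 rows, genes in order B, A, C, E, D
```

Either form gives the same genes once the NA rows are dropped; `head()` simply removes the need for the second filtering step.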

Slots vs. DGE method

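During the normalization step, SCTransform creates an SCT assay with three available slots (counts, data, and scale.data). FindAllMarkers offers several DGE methods through its test.use option, and depending on the method chosen we must supply the appropriate slot.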
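As a rough illustration of what "slot" means here, the sketch below pulls the three matrices out of an assay and passes a slot explicitly to a test. It assumes the bundled `pbmc_small` object (whose default assay is RNA, not SCT) and the `slot` argument naming of Seurat v3/v4; newer Seurat versions call these matrices layers and may emit a deprecation warning for `slot`.

```r
library(Seurat)

# Assumption: demo object shipped with Seurat, standing in for the article's pbmc.
data("pbmc_small")

# The three matrices stored in an assay.
counts_mat <- GetAssayData(pbmc_small, slot = "counts")      # raw (or corrected) counts
data_mat   <- GetAssayData(pbmc_small, slot = "data")        # log-normalized values
scaled_mat <- GetAssayData(pbmc_small, slot = "scale.data")  # scaled values or residuals

dim(counts_mat)
dim(data_mat)
dim(scaled_mat)  # often fewer rows: typically only the variable features are scaled

# Rank-based tests such as the default Wilcoxon test work on the 'data' slot.
# According to the Seurat documentation, count-based tests ("negbinom",
# "poisson", "DESeq2") read the 'counts' slot instead.
markers_wilcox <- FindMarkers(pbmc_small, ident.1 = 2,
                              test.use = "wilcox", slot = "data")
head(markers_wilcox)
```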

Task: Explore the function, FindAllMarkers to see what the function is and which options are available!

#?FindAllMarkers()

Finds markers (differentially expressed genes) for each of the identity classes in a dataset

Visualizing gene expression levels

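Seurat includes several tools for visualizing marker expression. VlnPlot (which shows expression probability distributions across clusters) and FeaturePlot (which visualizes feature expression on a dimensional reduction plot such as tSNE or PCA) are the most commonly used. We also suggest RidgePlot, CellScatter, and DotPlot as additional ways to view the dataset.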

# Note that the default slot is 'data' (log normalized)
Idents(pbmc) <- "seurat_clusters"
goi1 <- c("CD8A", "GZMK", "CCL5", "S100A4", "ANXA1", "CCR7", "ISG15", "CD3D")
VlnPlot(pbmc, features = goi1, ncol=4)


# Gene expression on UMAP
goi2 <- sort(c("MS4A1", "GNLY", "CD3E", "CD14", "FCER1A", "FCGR3A", "LYZ", "PPBP", "CD8A"))
FeaturePlot(pbmc, features = goi2)
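RidgePlot and DotPlot, mentioned above as alternatives, follow the same calling pattern as VlnPlot. A minimal sketch, assuming the bundled `pbmc_small` object and three genes chosen only because they are present in that demo dataset:

```r
library(Seurat)

# Assumption: demo object shipped with Seurat, standing in for the article's pbmc.
data("pbmc_small")

# Illustrative gene set; replace with markers of interest for a real dataset.
genes <- c("CD247", "CD3E", "CD9")

# Ridge plot: per-cluster expression densities, one panel per gene.
p_ridge <- RidgePlot(pbmc_small, features = genes, ncol = 3)

# Dot plot: dot size is the fraction of expressing cells per cluster,
# dot color is the scaled average expression.
p_dot <- DotPlot(pbmc_small, features = genes)

print(p_ridge)
print(p_dot)
```

DotPlot tends to scale better than violin or ridge plots when many genes and clusters have to fit in one figure.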


# Use ROC model for DGE testing,
pbmc.markers.roc <- as.data.table(FindAllMarkers(pbmc, only.pos = TRUE, test.use='roc', min.pct = 0.25, logfc.threshold = 0.25))
pbmc.markers.roc <- pbmc.markers.roc[!is.na(gene),]

top1 <- pbmc.markers.roc[,.SD[order(-myAUC)][1],by='cluster']
FeaturePlot(pbmc, features = sort(top1$gene))
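With test.use = 'roc', the result table is organized around classification performance rather than p-values: myAUC is the area under the ROC curve for each gene used as a single-feature classifier, and the Seurat documentation describes the accompanying power column as abs(AUC - 0.5) * 2. The sketch below inspects these columns on the bundled `pbmc_small` object, which is an assumption made so the example runs without the article's data.

```r
library(Seurat)

# Assumption: demo object shipped with Seurat, standing in for the article's pbmc.
data("pbmc_small")

# ROC-based marker test for one cluster against all remaining cells.
roc_markers <- FindMarkers(pbmc_small, ident.1 = 2, test.use = "roc")

# myAUC close to 1 (or 0) means the gene separates the cluster well;
# 0.5 means no better than chance.
head(roc_markers[order(-roc_markers$myAUC), c("myAUC", "power")])
```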

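DoHeatmap generates an expression heatmap for the given cells and features. In this case, we plot the top 10 markers of each cluster (or all markers, if a cluster has fewer than 10).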

DoHeatmap(pbmc, features = top10$gene) + NoLegend()

Save DGE table.

save(pbmc.markers,top10,top1,file='filtered_gene_bc_matrices/hg19/03_pbmc3k_clusterAnalysis.rd',compress=TRUE)

Key points

Differential gene expression analysis to describe clusters
Choosing the appropriate matrix (slot) for the DGE method
Handling DGE tables with data.table

Article link: https://www.haomeiwen.com/subject/moghtktx.html
