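Pseudotime analysis requires taking a subset of cells first. In the literature I have seen, for example, fibroblasts selected as the cells positive for the PDGFRA gene. PBMC monocytes, on the other hand, do not seem to be subset by any single gene. So different cell populations call for different subsetting methods.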
I. Subsetting by gene positivity
"PDGFRα+ stromal fibroblasts from the colon were distinguishable in multiple cell clusters in all mice (Figure 3A)."
Source: Skin inflammation activates intestinal stromal fibroblasts and promotes colitis - PMC (nih.gov)
"UMAP plot after subclustering of Pdgfra+ fibroblasts, showing 10 distinct subtypes (0 to 9) colored by cluster."
Source: Antimicrobial production by perifollicular dermal preadipocytes is essential to the pathophysiology of acne - PMC (nih.gov)
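Reposted from: 玩转单细胞 (Playing with Single Cells), "Analyzing a cell population extracted by expression of a specific gene (a small question)"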
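While reading the paper from an earlier post, I noticed something: in the downstream subpopulation analysis, the authors extracted the cells that express a specific gene instead of taking an already-annotated cell cluster. It is a small point, but worth writing down, because it is a useful approach: sometimes analyzing only the cells that express a particular gene is more meaningful.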
These are the cells after clustering and annotation:
library(Seurat)
library(dplyr)
DimPlot(mouse_data, label = TRUE) + NoLegend()
FeaturePlot(mouse_data, features = 'Il1b', pt.size = 1)
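The feature plot shows that Il1b expression is not confined to one cluster but is scattered across several. To study these cells, we need to extract them. The idea is simple: get the IDs (barcodes) of the cells that express Il1b, then subset the object by those IDs: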
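# Normalized expression matrix (genes x cells), read directly from the RNA assay's data slot
expr <- mouse_data@assays$RNA@data
# Il1b expression of every cell, as a one-column data frame
gene_expression <- expr %>% .["Il1b",] %>% as.data.frame()
colnames(gene_expression) <- "Il1b"
gene_expression$cell <- rownames(gene_expression)
# Select the positive cells (Il1b > 0)
gene_expression_sel <- gene_expression[which(gene_expression$Il1b>0),]
# Subset the Seurat object by the barcodes of the positive cells
mouse_sel_sce <- mouse_data[,rownames(gene_expression_sel)]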
Plot the result to check:
DimPlot(mouse_sel_sce, label = TRUE, pt.size = 1) + NoLegend()
FeaturePlot(mouse_sel_sce, features = 'Il1b', pt.size = 1)
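As the plots show, these are exactly the cells we wanted to select. From here, the subpopulation analysis follows the standard workflow. It is a small point, but quite a useful one. If you found it helpful, please like and follow!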
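The same "find the positive barcodes, then subset by them" idea can be tried without a Seurat object. The following is a minimal sketch on a made-up matrix; the gene values and the `cell1` to `cell5` barcodes are invented for illustration. The last comment shows how the filter is usually written with Seurat's `subset()`, which accepts an expression on a feature name; it is not executed here.

```r
# Toy genes x cells matrix standing in for the normalized expression data
# (values and barcodes are made up for illustration)
expr <- matrix(
  c(0.0, 1.2, 0.0, 2.5, 0.3,   # Il1b
    1.0, 0.0, 0.7, 0.0, 0.0),  # Cd3e
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Il1b", "Cd3e"), paste0("cell", 1:5))
)

# Barcodes of the cells in which Il1b is detected (expression > 0)
positive_cells <- colnames(expr)[expr["Il1b", ] > 0]
print(positive_cells)

# Subset the columns by name, the same way the Seurat object is subset by barcode
expr_sel <- expr[, positive_cells, drop = FALSE]
print(dim(expr_sel))

# On a real Seurat object the same filter can be written as (not run here):
# mouse_sel_sce <- subset(mouse_data, subset = Il1b > 0)
```

Note that a threshold of `> 0` treats any detected transcript as positive; whether a stricter cutoff is appropriate depends on the dataset.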
II. Subsetting the ordinary way (by cluster or cell identity)
Source: scRNA-Seq | Merging and extracting single-cell subpopulations - 简书 (jianshu.com)
levels(sce)
# [1] "Naive CD4 T" "CD14+ Mono" "Memory CD4 T" "B" "CD8 T" "FCGR3A+ Mono"
# [7] "NK" "DC" "Platelet"
genes_to_check = c('PTPRC', 'CD3D', 'CD3E',
'CD4','IL7R','NKG7','CD8A')
p1=DimPlot(sce, reduction = 'umap', group.by = 'seurat_clusters',
label = TRUE, pt.size = 0.5) + NoLegend()
p2=DotPlot(sce, group.by = 'seurat_clusters',
features = unique(genes_to_check)) + RotatedAxis()
p1+p2
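Suppose we now want to extract the CD4 T cells. According to the clustering above, clusters 0, 2, 4, and 6 are all T cells, and clusters 0 and 2 express CD4 at a higher level than clusters 4 and 6. In this example dataset the CD4 T cells do not actually express CD4 very strongly, but we will not dig into that here and simply move on.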
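For readers who want to put numbers on "higher than", the two quantities a dot plot displays (fraction of expressing cells and average expression per cluster) can be computed by hand. This is only a sketch on invented data: the `cd4` values, the cluster labels, and the 0.5 cutoff are all assumptions for illustration, not values from the example dataset.

```r
# Made-up normalized CD4 expression for 8 cells and their cluster labels
cd4 <- c(0.0, 1.1, 0.9, 0.0, 0.0, 0.0, 1.4, 0.0)
cluster <- factor(c("0", "0", "2", "2", "4", "4", "0", "6"))

# Fraction of cells per cluster with detectable CD4 (the dot size in a dot plot)
pct_expressed <- tapply(cd4 > 0, cluster, mean)
# Mean CD4 expression per cluster (roughly what the dot color encodes)
avg_expression <- tapply(cd4, cluster, mean)

summary_tbl <- data.frame(
  cluster = names(pct_expressed),
  pct_expressed = as.numeric(pct_expressed),
  avg_expression = as.numeric(avg_expression),
  stringsAsFactors = FALSE
)
print(summary_tbl)

# Clusters in which at least half of the cells express CD4 (0.5 is an arbitrary cutoff)
cd4_clusters <- summary_tbl$cluster[summary_tbl$pct_expressed >= 0.5]
print(cd4_clusters)
```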
To extract the chosen subpopulations, any of the following three approaches works:
cd4_sce1 = sce[,sce@meta.data$seurat_clusters %in% c(0,2)]
cd4_sce2 = sce[, Idents(sce) %in% c( "Naive CD4 T" , "Memory CD4 T" )]
cd4_sce3 = subset(sce,seurat_clusters %in% c(0,2))
## Fairly simple: the core idea is R's three subsetting strategies (logical values, positions, names)
Re-run dimensionality reduction and clustering on the subset:
sce=cd4_sce1
sce <- NormalizeData(sce, normalization.method = "LogNormalize", scale.factor = 1e4)
sce <- FindVariableFeatures(sce, selection.method = 'vst', nfeatures = 2000)
sce <- ScaleData(sce, vars.to.regress = "percent.mt")
sce <- RunPCA(sce, features = VariableFeatures(object = sce))
sce <- FindNeighbors(sce, dims = 1:10)
sce <- FindClusters(sce, resolution = 1 )
head(Idents(sce), 5)
table(sce$seurat_clusters)
sce <- RunUMAP(sce, dims = 1:10)
DimPlot(sce, reduction = 'umap')
genes_to_check = c('PTPRC', 'CD3D', 'CD3E', 'FOXP3',
'CD4','IL7R','NKG7','CD8A')
DotPlot(sce, group.by = 'seurat_clusters',
features = unique(genes_to_check)) + RotatedAxis()
# Subcluster level
p1=DimPlot(sce, reduction = 'umap', group.by = 'seurat_clusters',
label = TRUE, pt.size = 0.5) + NoLegend()
p2=DotPlot(sce, group.by = 'seurat_clusters',
features = unique(genes_to_check)) + RotatedAxis()
p1+p2
Source: Subsetting - 简书 (jianshu.com)
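In R, a subset can be taken either by logical values or by positions (indices).
A logical value decides whether each element is kept; a position tells R which element to take.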
cd4_sce1 = sce[, sce@meta.data$seurat_clusters %in% c(0,2)] # logical values
cd4_sce2 = sce[, Idents(sce) %in% c("Naive CD4 T","Memory CD4 T")] # same effect as the line above
# The subset() function works too
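The subsetting strategies can be compared side by side on a small base R matrix. This is a sketch under stated assumptions: `mat` and `labels` are invented stand-ins for the expression matrix and the cell identities, not objects from the example above.

```r
# Toy "object": a genes x cells matrix plus one label per cell (all values made up)
mat <- matrix(1:12, nrow = 2,
              dimnames = list(c("geneA", "geneB"), paste0("cell", 1:6)))
labels <- c("Naive CD4 T", "B", "Memory CD4 T", "NK", "Naive CD4 T", "B")

# Logical vector with one entry per cell: TRUE means "keep this cell"
keep <- labels %in% c("Naive CD4 T", "Memory CD4 T")

by_logical  <- mat[, keep, drop = FALSE]                 # logical values
by_position <- mat[, which(keep), drop = FALSE]          # column positions
by_name     <- mat[, colnames(mat)[keep], drop = FALSE]  # column names

# All three routes select the same columns
print(identical(by_logical, by_position))
print(identical(by_logical, by_name))
```

Subsetting by name is usually the safest choice when the cells are selected in one object and extracted from another, because it does not depend on the column order being the same.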