美文网首页
单细胞数据的二次分群

单细胞数据的二次分群

作者: 小洁忘了怎么分身 | 来源:发表于2024-05-23 10:54 被阅读0次

    0.背景知识

    Seurat里的FindClusters函数设置的resolution数值越大,分群的数量就越多,但是当单细胞数量太多的时候,会遇到resolution再变大,分群的数量也不再增加的情况。一次分群分不开时就会需要二次分群。

    1.示例数据

    这里的示例数据seu.obj.Rdata是GSE218208降维聚类分群注释的结果,在生信星球聊天框回复“955gsva”获取(和昨天的数据一样,里面的gmt今天是用不到的)。

    rm(list = ls())
    library(Seurat)
    library(dplyr)
    load("../24.5.23scGSVA/seu.obj.Rdata")
    p1 = DimPlot(seu.obj, reduction = "umap",label=T)+NoLegend()
    p1
    

    2.二次分群

    这里以树突细胞(DC)为例进行二次分群

    核心就是提取感兴趣的亚群的细胞,后面就是标准流程和可视化了,没有区别

    sub.cells <- subset(seu.obj, idents = "DC")
    f = "obj.Rdata"
    if(!file.exists(f)){
      sub.cells = sub.cells %>%
      NormalizeData() %>%
      FindVariableFeatures() %>%
      ScaleData(features = rownames(.)) %>%
      RunPCA(features = VariableFeatures(.))  %>%
      FindNeighbors(dims = 1:15) %>%
      FindClusters(resolution = 0.5) %>%
      RunUMAP(dims = 1:15) 
      save(sub.cells,file = f)
    }
    load(f)
    DimPlot(sub.cells, reduction = 'umap',label = T)+NoLegend()
    

    3.marker基因及其可视化

    sub.cells.markers <- FindAllMarkers(sub.cells, only.pos = TRUE,  
                                min.pct = 0.25, logfc.threshold = 0.25)
    
    top10 <- sub.cells.markers %>% 
      group_by(cluster) %>% 
      top_n(n = 10, wt = avg_log2FC) %>% 
      pull(gene);top10
    
    ##  [1] "JCHAIN"  "IGKC"    "MZB1"    "PACSIN1" "WNT10A"  "MAP1A"   "VASH2"  
    ##  [8] "NIBAN3"  "SMPD3"   "TNFRSF4" "LYZ"     "TIMP1"   "GPAT3"   "ITGAX"  
    ## [15] "SAMSN1"  "OLR1"    "FPR3"    "EREG"    "FCN1"    "AKAP12"
    
    VlnPlot(sub.cells, features = top10)
    
    RidgePlot(sub.cells, features = top10)
    
    FeaturePlot(sub.cells, features = top10)
    
    DotPlot(sub.cells,features = top10)+ RotatedAxis()
    
    DoHeatmap(sub.cells, features = top10) + NoLegend()
    

    4.放回原来的Seurat对象里面

    上面的umap图是感兴趣的单独的展示,也可以把它放回原来的seurat对象里。

    sub.cells@meta.data$celltype = paste0("M",sub.cells$seurat_clusters)
    
    seu.obj$celltype = as.character(Idents(seu.obj))
    seu.obj$celltype = ifelse(seu.obj$celltype=="DC",
           sub.cells$celltype[match(colnames(seu.obj),colnames(sub.cells))],
           seu.obj$celltype) 
    Idents(seu.obj) = seu.obj$celltype
    p2 = DimPlot(seu.obj,label = T)+NoLegend()
    p1+p2
    

    相关文章

      网友评论

          本文标题:单细胞数据的二次分群

          本文链接:https://www.haomeiwen.com/subject/uyjtqjtx.html