Which function should you actually use to find differentially expressed genes? FindMarkers, Findall

Author: oceanandshore | Published 2023-05-10 09:44

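    Q: Between the differentially expressed genes and the conserved genes, which make better marker genes? Put another way, how do the results of FindAllMarkers and FindConservedMarkers differ?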

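    A: With a single sample, use each cluster's differentially expressed genes as its marker genes. If the data were integrated, for example a treatment group and a control group, first run FindConservedMarkers to find the differentially expressed genes that are conserved across the groups, and use those as markers to annotate the clusters. Once a cluster is annotated, run FindMarkers to find the genes differentially expressed between that cluster's treatment cells and its control cells.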
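The two-step workflow in the answer above can be sketched as a small helper. This is only a sketch under stated assumptions: `obj` is taken to be an integrated Seurat object whose active identities are the cluster labels, and the metadata column name `group` and its levels `"treated"` and `"control"` are hypothetical placeholders, not names from the original post. FindConservedMarkers additionally needs the metap package installed.

```r
# Sketch only: `obj`, `group`, "treated" and "control" are assumed names.
conserved_then_condition_markers <- function(obj, cluster,
                                             group_col = "group",
                                             treated = "treated",
                                             control = "control",
                                             assay = "RNA") {
  # Step 1: markers of `cluster` versus all other cells, tested within each
  # level of `group_col` and then combined. Use these to annotate the cluster.
  conserved <- Seurat::FindConservedMarkers(
    obj,
    ident.1 = cluster,
    grouping.var = group_col,
    assay = assay,
    verbose = FALSE
  )

  # Step 2: restrict to the cells of `cluster`, then compare the treated
  # cells with the control cells inside that cluster.
  condition_de <- Seurat::FindMarkers(
    obj,
    ident.1 = treated,
    ident.2 = control,
    group.by = group_col,
    subset.ident = cluster,
    assay = assay
  )

  list(conserved = conserved, condition_de = condition_de)
}
```

A call such as `conserved_then_condition_markers(obj, cluster = "3")` would return both tables for cluster 3. Keeping the two steps in one function makes it harder to mix up which table is meant for annotation and which for the treatment effect.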

    Still to be answered: what if there are more than two groups, say several? Which function should be used then? Solution: ___

    Source: 如何使用 Seurat 分析单细胞测序数据( Q&A [How to analyze single-cell sequencing data with Seurat, Q&A] - 简书 (jianshu.com)

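    FindConservedMarkers examples:
    Seurat之整合分析(1) [Integration analysis with Seurat (1)] - 简书 (jianshu.com)
    2020-05-18 seurat 包 Stimulated vs Control PBMCs [the seurat package: Stimulated vs Control PBMCs] - 简书 (jianshu.com)
    单细胞测序数据整合分析示例 [Worked example of integrated analysis of single-cell sequencing data] - 简书 (jianshu.com)
    10x Genomics PBMC(七):整合数据后的聚类分析 [10x Genomics PBMC (7): clustering after data integration] - 简书 (jianshu.com)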

    2023.05.11

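    Yesterday I found a post, 【细胞通讯】PlantPhoneDB(2) [Cell communication: PlantPhoneDB (2)]. It works with plant samples, an experimental group and a control group, normalizes with SCT, and removes batch effects with FindIntegrationAnchors. After FindClusters, the author's two key steps are DefaultAssay(objs) <- 'SCT' followed by objs <- PrepSCTFindMarkers(objs, assay = "SCT", verbose = TRUE). The author's code is pasted below; I will run it as written first and see what happens.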

    library(Seurat)
    library(tidyverse)
    library(ggplot2)
    library(ggsci)
    library(ggpubr)
    library(pheatmap)
    library(RColorBrewer)
    library(patchwork)
    library(lsa)
    library(viridis)
    library(hrbrthemes)
    library(circlize)
    library(chorddiag)
    library(ggplotify)
    library(data.table)
    library(parmigene)
    library(readxl)
    library(infotheo)
    library(igraph)
    library(muxViz)
    library(rgl)
    library(dplyr)
    
    # Parameter choices match those used in the author's paper

    pbmc <- readRDS("pbmc3k_final.rds")
    pbmc@meta.data$labels <- Idents(pbmc)

    # Control sample: load, filter low-quality cells, normalize with SCTransform
    control.data <- Read10X(data.dir = "control/filtered_feature_bc_matrix/")
    control <- CreateSeuratObject(counts = control.data, project = "control", min.cells = 3, min.features = 200)
    control <- subset(control, subset = nFeature_RNA > 200 & nCount_RNA > 1000)
    control <- SCTransform(control, verbose = FALSE)

    # Heat-treated sample: same steps
    heat.data <- Read10X(data.dir = "heat/filtered_feature_bc_matrix/")
    heat <- CreateSeuratObject(counts = heat.data, project = "heat", min.cells = 3, min.features = 200)
    heat <- subset(heat, subset = nFeature_RNA > 200 & nCount_RNA > 1000)
    heat <- SCTransform(heat, verbose = FALSE)

    # SCT-based integration; the control sample (index 1) is the reference
    datasets <- list(control, heat)
    features <- SelectIntegrationFeatures(object.list = datasets, nfeatures = 8000)
    datasets <- PrepSCTIntegration(object.list = datasets, anchor.features = features, verbose = TRUE)
    datasets <- lapply(X = datasets, FUN = RunPCA, verbose = FALSE, features = features)
    anchors <- FindIntegrationAnchors(object.list = datasets, normalization.method = "SCT",
                                      anchor.features = features, verbose = TRUE,
                                      reference = 1, reduction = "cca")
    objs <- IntegrateData(anchorset = anchors, normalization.method = "SCT", verbose = TRUE)

    # Dimensionality reduction and clustering on the integrated assay
    objs <- RunPCA(objs, verbose = FALSE, approx = FALSE, npcs = 50)
    # umap.method = "umap-learn" requires the Python umap-learn package
    objs <- RunUMAP(objs, reduction = "pca", dims = 1:50, umap.method = "umap-learn", metric = "correlation")
    objs <- RunTSNE(objs, reduction = "pca", dims = 1:50, tsne.method = "Rtsne")
    objs <- FindNeighbors(objs, reduction = "pca", dims = 1:50)
    objs <- FindClusters(objs, resolution = 0.5, algorithm = 2)

    # The two key steps: switch to the SCT assay and re-correct its counts
    # across the per-sample SCT models before testing for markers
    DefaultAssay(objs) <- 'SCT'
    objs <- PrepSCTFindMarkers(objs, assay = "SCT", verbose = TRUE)

    # Positive markers of every cluster versus all remaining cells
    DEG <- FindAllMarkers(objs,
                          logfc.threshold = 0.25,
                          min.diff.pct = 0.25,
                          max.cells.per.ident = 10000,
                          only.pos = TRUE)

    # Keep markers that are significant after multiple-testing correction
    mark_gene <- DEG %>% mutate(avg_logFC = avg_log2FC) %>% filter(p_val_adj < 0.05)

    # Known root cell-type markers used as the annotation signature
    signature <- readxl::read_excel('../ath_doi_202104.xlsx')
    sig_gene <- signature %>% as.data.frame() %>% filter(Tissue == "Root") %>%
      mutate(V1 = `Cell Type`, V2 = Cell_Marker) %>% unique(.) %>% select(V1, V2)
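The DEG table produced by FindAllMarkers above is usually too long to read cluster by cluster, so one common follow-up is to keep only the strongest few significant markers per cluster. The sketch below does this in base R. The small `deg` data frame is an invented stand-in with the same column names as a FindAllMarkers result (its gene names and values are made up purely for illustration), and `top_markers` is a hypothetical helper, not part of Seurat.

```r
# Toy stand-in for a FindAllMarkers() result; all values are invented.
deg <- data.frame(
  gene       = c("g1", "g2", "g3", "g4", "g5"),
  cluster    = factor(c("0", "0", "0", "1", "1")),
  avg_log2FC = c(2.0, 0.8, 1.5, 1.2, 3.1),
  p_val_adj  = c(0.001, 0.20, 0.01, 0.04, 0.0001)
)

# Hypothetical helper: keep the n significant markers with the largest
# log2 fold change in each cluster.
top_markers <- function(deg, n = 2, alpha = 0.05) {
  # Drop markers that are not significant after adjustment
  sig <- deg[deg$p_val_adj < alpha, , drop = FALSE]
  # Sort by cluster, then by decreasing fold change within each cluster
  sig <- sig[order(sig$cluster, -sig$avg_log2FC), , drop = FALSE]
  # Rank rows within each cluster and keep the first n
  rank_in_cluster <- ave(seq_len(nrow(sig)), sig$cluster, FUN = seq_along)
  sig[rank_in_cluster <= n, , drop = FALSE]
}

top <- top_markers(deg, n = 1)
print(top)
```

On a real result, `top_markers(DEG, n = 10)` would give ten candidate genes per cluster to compare against a known marker list.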
    

    Here is the console output with its warnings. I searched for the first one, "ggrepel: 7 unlabeled data points (too many overlaps). Consider increasing max.overlaps", and it is reportedly harmless.

    > DefaultAssay(integrated) <- 'SCT'
    > 
    > integrated<- PrepSCTFindMarkers(integrated,assay = "SCT", verbose = TRUE)
    Found 6 SCT models. Recorrecting SCT counts using minimum median counts: 2575
    > 
    > DEG <- FindAllMarkers(integrated,
    +                       
    +                       logfc.threshold=0.25,
    +                       
    +                       min.diff.pct = 0.25,
    +                       
    +                       only.pos=T)
    Calculating cluster 0
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=02m 41s
    Calculating cluster 1
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=33s  
    Calculating cluster 2
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=13s  
    Calculating cluster 3
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=39s  
    Calculating cluster 4
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=20s  
    Calculating cluster 5
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=04s  
    Calculating cluster 6
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=50s  
    Calculating cluster 7
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=21s  
    Calculating cluster 8
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=02m 24s
    Calculating cluster 9
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=52s  
    Calculating cluster 10
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=29s  
    Calculating cluster 11
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=30s  
    Calculating cluster 12
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=04m 21s
    Calculating cluster 13
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=19s  
    Calculating cluster 14
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=16s  
    Calculating cluster 15
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=02m 02s
    Calculating cluster 16
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=01m 47s
    Calculating cluster 17
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=03m 22s
    Calculating cluster 18
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=48s  
    Calculating cluster 19
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=28s  
    Calculating cluster 20
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=01m 53s
    Calculating cluster 21
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=54s  
    Calculating cluster 22
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=01m 11s
    Calculating cluster 23
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=01m 05s
    Warning messages:
    1: ggrepel: 7 unlabeled data points (too many overlaps). Consider increasing max.overlaps 
    2: In UseMethod("depth") :
      no applicable method for 'depth' applied to an object of class "NULL"
    3: In UseMethod("depth") :
      no applicable method for 'depth' applied to an object of class "NULL"
    4: In UseMethod("depth") :
      no applicable method for 'depth' applied to an object of class "NULL"
    > 
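The first warning in the log comes from ggrepel, which skips labels when too many of them would overlap. If the missing labels matter, one way to get them drawn is to raise ggrepel's overlap limit through its global option before redrawing the plot. This is a minimal sketch; the log does not show which plot triggered the warning, so where to redraw is an assumption left to the reader.

```r
# Let ggrepel place every label, however many overlaps it has to resolve.
# ggrepel reads this option as the default for max.overlaps (normally 10).
options(ggrepel.max.overlaps = Inf)

# Confirm the setting that later ggrepel layers will pick up
print(getOption("ggrepel.max.overlaps"))
```

The same limit can be set for a single layer through the `max.overlaps` argument of `geom_text_repel()` or `geom_label_repel()`.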
    


        Link to this article: https://www.haomeiwen.com/subject/jxkdsdtx.html