美文网首页
单细胞转录组分析Monocle

单细胞转录组分析Monocle

作者: 新欣enjoy | 来源:发表于2020-01-14 11:05 被阅读0次

    引言: 在学习并实践了单细胞转录组分析的Seurat工具包之后,下一步分析就是细胞发育轨迹重建的Monocle包了,在此记录下学习过程。因为服务器系统不能安装Monocle3, 因此所有的学习都是基于Monocle2。

    Monocle

    Monocle2 官网链接

    Monocle 拟时分析

    主要步骤

    1. 创建CellDataSet数据对象,官方建议不能使用已经标准化之后的数据,(或者使用归一化后的数据,也服从正态分布?)注意选择合适的数据分布函数
    2. Estimate size factors and dispersions,仅当数据符合negbinomial() or negbinomial.size() 分布时
    3. 选择能定义细胞发育过程的基因
    4. 降维
    5. 根据轨迹排序细胞

    主要分析步骤代码

    library(monocle)
    library(Seurat)
    library(clusterProfiler)
    library(dplyr)
    #1. 从Seurat 对象中提取monocle 对象基本元素
    load("Rdata/subara_integrated_seurat.RData") #subara_integrated
    
    head(subara_integrated[[]])
    levels(subara_integrated)
    ara.m.pd = subara_integrated@meta.data[,c(1,4,7)]
    ara.m.expr = subara_integrated@assays$RNA@counts[,rownames(ara.m.pd)]
    ara.m.fd = rownames(ara.m.expr) %>% bitr(,fromType = "TAIR", toType =c("SYMBOL") ,OrgDb= "org.At.tair.db")
    m.fd = ara.m.fd[!duplicated(ara.m.fd$TAIR),]
    m.expr = left_join(m.fd, data.frame(TAIR = rownames(ara.m.expr),as.data.frame(ara.m.expr)),by = "TAIR")
    # 几点注意:expr 的行与 fd 的行数一致,expr的列和pd的行数一致。且fd, pd 数据集的行名对应到expr的行名和列名,且fd 中必须有一列“gene_short_name”,  pd 数据集中信息越多,后续分析点越多
    
    #2. 创建monocle数据对象并 Estimate size factors and dispersions
    counts.data <- as(ara.m.matrix, 'sparseMatrix')  # sparseMatrix is suitable for large dataset 
    pheno.data <- new('AnnotatedDataFrame', data = sub.m.pd)
    feature.data <- new('AnnotatedDataFrame', data = sub.m.fd)
    cds <- newCellDataSet(counts.data, 
                          into mRNA counts,
                          phenoData = pheno.data, 
                          featureData = feature.data, 
                          lowerDetectionLimit = 0.5, 
                          expressionFamily = negbinomial.size()) #negbinomial.size(),  tobit(), gaussianff()
    
    cds <- estimateSizeFactors(cds)   # give Size_Factor values
    cds <- estimateDispersions(cds)   # time costly
    
    #3. 构建轨迹,拟时分析
    # 选择基因
    # second methods: using algorithm that based on average expression level
    disp_table <- dispersionTable(cds)  # based on average expression level,
    ordering_genes <- subset(disp_table, mean_expression >= 0.1)
    cds <- setOrderingFilter(cds, ordering_genes)  #none gene can be used for ordering?????
    save_plot("subara_integrated_out_results/ordering_genes_dispersion.png",plot_ordering_genes(cds),base_height = 8,base_asp = 1.3)
    
    # 降维
    cds <- reduceDimension(cds, max_components = 2,method = 'DDRTree')  # run very long time 
    
    # 排序,构建轨迹
    cds <- orderCells(cds)  # out of memory
    head(pData(cds))
    save_plot("subara_integrated_out_results/cell_trajectory.png",plot_cell_trajectory(cds, color_by = "Hours"),base_height = 8,base_asp = 1.3)
    
    ## branch point analysis
    BEAM_res <- BEAM(cds, branch_point = 1, cores = 1)
    save(BEAM_res, file = "Rdata/list61_sub_cluster_seurat/BEMA_res_object.Rdata")
    BEAM_res <- BEAM_res[order(BEAM_res$qval),]
    BEAM_res <- BEAM_res[,c("gene_short_name", "pval", "qval")]
    cds.symbol.unique = cds[row.names(subset(BEAM_res,qval < 1e-4 & !duplicated(BEAM_res$gene_short_name))),]
    save_plot("subara_integrated_out_results/branch_gene_heatmap_symbol.png",plot_genes_branched_heatmap(cds.symbol.unique,
                                branch_point = 1,
                                num_clusters = 3,
                                cores = 1,
                                use_gene_short_name = T,
                                show_rownames = T),base_height = 8,base_asp = 1.3)
    
    
    

    其他数据结构查看代码

    ### 主要函数
    head(pData(cds))
    head(fData(cds))
    

    相关文章

      网友评论

          本文标题:单细胞转录组分析Monocle

          本文链接:https://www.haomeiwen.com/subject/wdwoactx.html