美文网首页单细胞测序
考虑空间位置的通讯分析手段---CellphoneDB(V3.0

考虑空间位置的通讯分析手段---CellphoneDB(V3.0

作者: 单细胞空间交响乐 | 来源:发表于2022-01-24 10:19 被阅读0次

    新的一周,新的开始,今天我们来分享一个cellphoneDB的升级版,CellphoneDB的V3.0版本,升级的最大改进在于服务于空间转录组的通讯分析,会结合空间位置和生态位进行有效通讯的识别和判断,我们来看一下。

    关于CellphoneDB,大家应该都不陌生,目前做通讯分析引用率最高的软件,之前的版本升级到2之前都是服务于单细胞的数据分析的,但是现在作者将软件升级到3,就是将空间转录组纳入分析,非常棒的想法和运用,参考文章在Mapping the temporal and spatial dynamics of the human endometrium in vivo and in vitro,2021年12月份发表于Nature Genetics,IF38.330,文献的内容我们下一篇再分享,这一片我们专注于CellphoneDB(V3.0)。

    CellphoneDB(V3.0)的创新点

    • 合并空间信息 CellPhoneDB 现在允许通过微环境文件合并细胞的空间信息。 这是一个两列文件,指示哪种细胞类型在哪个空间微环境中(参见示例见下图)。 CellphoneDB 将使用此信息来定义可能的交互细胞对(即在微环境中共享/共存的集群对)。 可以使用 cell2location(关于cell2location分享了多次,大家可以参考文章 10X单细胞和空间联合分析的方法---cell2location,10X单细胞空间联合分析之再次解读cell2location 定义具有先验知识、成像或 Visium 分析的微环境。
    cell_type microenviroment
    epi_Ciliated Proliferative
    epi_Pre-ciliated Proliferative
    epi_SOX9_LGR5 Proliferative
    epi_SOX9_prolif Proliferative
    epi_SOX9 Proliferative
    Fibroblast eS Proliferative
    Lymphoid Proliferative
    Myeloid Proliferative
    Fibroblast C7 Proliferative
    epi_Ciliated Secretory
    epi_Lumenal 1 Secretory
    epi_Lumenal 2 Secretory
    epi_Glandular Secretory
    epi_Glandular_secretory Secretory
    Fibroblast dS Secretory
    Lymphoid Secretory
    epi_SOX9 Secretory
    Fibroblast C7 Secretory
    Myeloid Secretory
    • 添加了新的分析方法,使用差异表达基因 (DEG) 而不是 random shuffling(cellphonedb 方法 degs_analysis)。 这种方法将选择所有基因都由高于--阈值的一小部分细胞表达并且至少一个基因是 DEG 的相互作用。 可以使用喜欢的工具识别 DEG,并通过文本文件将信息提供给 CellphoneDB。 第一列应该是细胞类型/cluster,第二列应该是相关的基因 id。 其余列将被忽略(参见示例见下表)。 这里提供了为 Seurat 和 Scanpy 的参考示例。
    图片.png
    • Database update WNT pathway has been further curated.
    图片.png

    先来看看R版本的分析(联合Seurat)

    library(Seurat)
    library(SeuratObject)
    library(Matrix)
    so = readRDS('endometrium_example_counts_seurat.rds')
    writeMM(so@assays$RNA@counts, file = 'endometrium_example_counts_mtx/matrix.mtx')
    # save gene and cell names
    write(x = rownames(so@assays$RNA@counts), file = "endometrium_example_counts_mtx/genes.tsv")
    write(x = colnames(so@assays$RNA@counts), file = "endometrium_example_counts_mtx/barcodes.tsv")
    
    so@meta.data$Cell = rownames(so@meta.data)
    df = so@meta.data[, c('Cell', 'cell_type')]
    write.table(df, file ='endometrium_example_meta.tsv', sep = '\t', quote = F, row.names = F)
    
    
    ## OPTION 1 - compute DEGs for all cell types
    ## Extract DEGs for each cell_type
    # DEGs <- FindAllMarkers(so, 
    #                        test.use = 'LR', 
    #                        verbose = F, 
    #                        only.pos = T, 
    #                        random.seed = 1, 
    #                        logfc.threshold = 0.2, 
    #                        min.pct = 0.1, 
    #                        return.thresh = 0.05)
    
    
    # OPTION 2 - optional - Re-compute  hierarchical (per lineage) DEGs for Epithelial and Stromal lineages
    DEGs = c()
    for( lin in c('Epithelial', 'Stromal') ){
        message('Computing DEGs within linage ', lin)
        so_in_lineage = subset(so, cells = Cells(so)[ so$lineage == lin ] )
        celltye_in_lineage = unique(so$cell_type[ so$lineage == lin ])
        DEGs_lin = FindAllMarkers(so_in_lineage,
                           verbose = F, 
                           only.pos = T, 
                           random.seed = 1, 
                           logfc.threshold = 0, 
                           min.pct = 0.1, 
                           return.thresh = 0.05)
        DEGs = rbind(DEGs_lin, DEGs)
    }
    
    
    fDEGs = subset(DEGs, p_val_adj < 0.05 & avg_logFC > 0.1)
    
    # 1st column = cluster; 2nd column = gene 
    fDEGs = fDEGs[, c('cluster', 'gene', 'p_val_adj', 'p_val', 'avg_logFC', 'pct.1', 'pct.2')] 
    write.table(fDEGs, file ='endometrium_example_DEGs.tsv', sep = '\t', quote = F, row.names = F)
    
    

    Run cellphoneDB

    cellphonedb method degs_analysis  \
        endometrium_example_meta.tsv  \
        endometrium_example_counts_mtx  \
        endometrium_example_DEGs.tsv  \
        --microenvs endometrium_example_microenviroments.tsv  \ #optional
        --counts-data hgnc_symbol  \
        --database database/database/cellphonedb_user_2021-06-29-11_41.db \
        --threshold 0.1
    

    python版本的分析

    import numpy as np
    import pandas as pd
    import scanpy as sc
    import anndata
    import os
    import sys
    from scipy import sparse
    
    sc.settings.verbosity = 1  # verbosity: errors (0), warnings (1), info (2), hints (3)
    sys.executable
    adata = sc.read('endometrium_example_counts.h5ad')
    df_meta = pd.DataFrame(data={'Cell':list(adata.obs.index),
                                 'cell_type':[ i for i in adata.obs['cell_type']]
                                })
    df_meta.set_index('Cell', inplace=True)
    df_meta.to_csv('endometrium_example_meta.tsv', sep = '\t')
    
    # Conver to dense matrix for Seurat
    adata.X = adata.X.toarray()
    
    import rpy2.rinterface_lib.callbacks
    import logging
    # Ignore R warning messages
    #Note: this can be commented out to get more verbose R output
    rpy2.rinterface_lib.callbacks.logger.setLevel(logging.ERROR)
    import anndata2ri
    anndata2ri.activate()
    %load_ext rpy2.ipython
    
    %%R -o DEGs
    
    library(Seurat)
    so = as.Seurat(adata, counts = "X", data = "X")
    Idents(so) = so$cell_type
    
    ## OPTION 1 - compute DEGs for all cell types
    ## Extract DEGs for each cell_type
    # DEGs <- FindAllMarkers(so, 
    #                        test.use = 'LR', 
    #                        verbose = F, 
    #                        only.pos = T, 
    #                        random.seed = 1, 
    #                        logfc.threshold = 0.2, 
    #                        min.pct = 0.1, 
    #                        return.thresh = 0.05)
    
    
    # OPTION 2 - optional - Re-compute  hierarchical (per lineage) DEGs for Epithelial and Stromal lineages
    DEGs = c()
    for( lin in c('Epithelial', 'Stromal') ){
        message('Computing DEGs within linage ', lin)
        so_in_lineage = subset(so, cells = Cells(so)[ so$lineage == lin ] )
        celltye_in_lineage = unique(so$cell_type[ so$lineage == lin ])
        DEGs_lin = FindAllMarkers(so_in_lineage, 
                           test.use = 'LR', 
                           verbose = F, 
                           only.pos = T, 
                           random.seed = 1, 
                           logfc.threshold = 0.2, 
                           min.pct = 0.1, 
                           return.thresh = 0.05)
        DEGs = rbind(DEGs_lin, DEGs)
    }
    
    cond1 = DEGs['p_val_adj'] < 0.05 
    cond2 = DEGs['avg_log2FC'] > 0.1
    mask = [all(tup) for tup in zip(cond1, cond2)]
    fDEGs = DEGs[mask]
    
    
    # 1st column = cluster; 2nd column = gene 
    fDEGs = fDEGs[['cluster', 'gene', 'p_val_adj', 'p_val', 'avg_log2FC', 'pct.1', 'pct.2']] 
    fDEGs.to_csv('endometrium_example_DEGs.tsv', index=False, sep='\t')
    

    Run cellphoneDB

    cellphonedb method degs_analysis  \
        endometrium_example_meta.tsv  \
        endometrium_example_counts.h5ad  \
        endometrium_example_DEGs.tsv  \
        --microenvs endometrium_example_microenviroments.tsv  \
        --counts-data hgnc_symbol  \
        --database database/database/cellphonedb_user_2021-06-29-11_41.db \
        --threshold 0.1
    
    图片.png

    至于cellphoneDB的绘图操作,大家可以参考文章空间通讯分析章节2,软件链接在CellphoneDB.

    生活很好,有你更好

    相关文章

      网友评论

        本文标题:考虑空间位置的通讯分析手段---CellphoneDB(V3.0

        本文链接:https://www.haomeiwen.com/subject/qgqbhrtx.html