美文网首页单细胞
非模式物种的RNA速率分析(二)

非模式物种的RNA速率分析(二)

作者: 三点水的番薯 | 来源:发表于2022-07-05 17:16 被阅读0次

    loom文件生成后就是Seurat结果和loom文件的整合了
    可以分为在R中Seurat提取分组及坐标信息,以及python中整合loom文本和Seurat的信息以及可视化。

    高亮:这里主要是要注意loom文件中barcode样本名称前缀的和Seurat中Cell ID的前缀要吻合

    参考:https://www.jianshu.com/p/70c19748ac1a

    (一)提取Seruat中的信息

    table(seurat_obj$celltype)
    # 获得每个细胞的UMAP或TSNE坐标,使用 Embeddings函数
    write.csv(Embeddings(seurat_obj, reduction = "umap"), file = "cell_embeddings.csv")
    # 获取每个细胞的barcode
    write.csv(Cells(seurat_obj), file = "cellID_obs.csv", row.names = FALSE)
    # 提取每个细胞的cluster信息
    write.csv(seurat_obj@meta.data[, 'cluster', drop = FALSE], file = "cell_clusters.csv")
    # 提取每个细胞的celltype信息
    write.csv(seurat_obj@meta.data[, 'celltype', drop = FALSE], file = "cell_celltype.csv")
    # 获取celltype的颜色信息
    hue_pal()(length(levels(seurat_obj$celltype)))
    # 获取cluster的颜色信息
    hue_pal()(length(levels(seurat_obj$cluster)))
    

    (二)多个loom文件整合

    import loompy
    files=['sample_one','sample_two']
    output_filename='combined.loom'
    loompy.combine(files, output_filename, key="Accession")
    

    (三) loom文件和Seurat合并

    import scvelo as scv
    import pandas as pd
    import numpy as np
    import os
    
    loom_data = scv.read('combined.loom', cache=False)
    #这里就是给loom文件中的barcode做替换,使其与Seurat中一致
    loom_data.obs = loom_data.obs.rename(index = lambda x: x.replace(':', '_').replace('x', '-1').replace('*****','*****').replace('******','*****'))
    loom_data.obs.head()
    #读取seurat中的meta信息
    meta_path = "/path/"
    sample_obs = pd.read_csv(os.path.join(meta_path, "cellID_obs.csv"))
    cell_umap= pd.read_csv(os.path.join(meta_path, "cell_embeddings.csv"), header=0, names=["CellID", "UMAP_1", "UMAP_2"])
    cell_clusters = pd.read_csv(os.path.join(meta_path, "cell_clusters.csv"), header=0, names=["CellID", "cluster"])
    cell_celltype = pd.read_csv(os.path.join(meta_path, "cell_celltype.csv"), header=0, names=["CellID", "celltype"])
    #对细胞文件和RNA剪切速率文件取交集,保留关注的细胞类型
    sample_one = loom_data[np.isin(loom_data.obs.index, sample_obs)]
    sample_one.obs.head()
    #构建umap坐标,cluster,celltype信息数据框
    sample_one_index = pd.DataFrame(sample_one.obs.index)
    sample_one_index = sample_one_index.rename(columns = {0:'CellID'})
    
    umap_ordered = sample_one_index.merge(cell_umap, on = "CellID")
    umap_ordered.head()
    celltype_ordered = sample_one_index.merge(cell_celltype, on = "CellID")
    celltype_ordered.head()
    clusters_ordered = sample_one_index.merge(cell_clusters, on = "CellID")
    clusters_ordered.head()
    #将umap坐标与cluster信息加入sample_one
    umap_ordered = umap_ordered.iloc[:,1:]
    clusters_ordered = clusters_ordered.iloc[:,1:]
    celltype_ordered = celltype_ordered.iloc[:,1:]
    sample_one.obsm['X_umap'] = umap_ordered.values
    sample_one.uns['clusters'] = clusters_ordered.values
    sample_one.obs['celltype'] = celltype_ordered.values
    
    adata = sample_one
    # some gene labels are duplicated (Ensembl IDs are still unique!!)
    adata.var_names_make_unique()
    
    # save model to file
    adata.write('bl_celltype_dynamicModel.h5ad', compression = 'gzip')
    
    # read h5ad file
    adata= scv.read('mbc_celltype_dynamicModel.h5ad')
    #可视化
    # Running RNA Velocity
    scv.pp.filter_and_normalize(adata,min_shared_counts=30, n_top_genes=2000)
    scv.pp.moments(adata, n_pcs=30, n_neighbors=30)
    scv.tl.velocity(adata, mode = "stochastic")
    scv.tl.velocity_graph(adata)
    
    scv.pl.velocity_embedding(adata, basis='X_umap', arrow_size=5)
    ident_colours = ["#B2DF8A", "#FB9A99", "#33A02C", "#A6CEE3", "#1F78B4"]
    scv.pl.velocity_embedding_stream(adata, basis='X_umap',color = "celltype", palette = ident_colours, size = 20,alpha =0.8, save="UMAP_stream.pdf", figsize=(7,5), legend_fontsize = 9, show=False, title='')
    

    Python比R对于RNA速率分析的软件更兼容,虽然我python基础非常之薄弱,但是基本代码原理都是相同的,但之后还是好好学习python了。

    相关文章

      网友评论

        本文标题:非模式物种的RNA速率分析(二)

        本文链接:https://www.haomeiwen.com/subject/dridbrtx.html