最近在做一个分析的时候,需要使用到python,而我的单细胞数据是seurat对象,所以需要转化一下,变成h5ad。关于seurat转h5ad的方法,我们在PAGA那一部分进行过总结,还没有看的可以在打包代码PAGA里面找找。那么这次,我依然时使用之前的代码,选择用sceasy进行转化,这个我认为是最简洁的一个方法。可是,这次我用的是V5,报错了!首先我们安装下sceasy,它的使用需要借助python环境,首先我们在终端创建环境,R里面调用,安装sceasy,需要用到的有loompy。
#创建环境(终端)
conda create -y -n sceasy python=3.7
conda activate sceasy
conda install anndata -c bioconda
#R中操作
#安装R包,sceasy可以实现多种数据形式的转化,https://github.com/cellgeni/sceasy
devtools::install_github("cellgeni/sceasy")
library(sceasy)
library(reticulate)
use_condaenv('sceasy')
loompy <- reticulate::import('loompy')
sceasy转化只需一句代码,可是出现了错误,很明显是V5的事!
objV5 <- sce_cca
sceasy::convertFormat(objV5, from="seurat", to="anndata", outFile='objV5.h5ad')
#运行日志报错
Error in ncol(df) :
no slot of name "meta.features" for this object of class "Assay5"
那么一种方法就是等待相关的包更新以适用于V5,这其实很坐以待毙,那么可行的方式就是转化降低,还比如别人的V5对象,你想操作,但是不熟悉,那么就需要转化为V4可能更好用一点。V4、V5区别没有达到“天差地别”,不像monocle2和monocle3那样的转折。操作很简单,as一句代码。转化后,sceay的转阿虎操作也没有问题了!
objV5[["RNA"]] <- as(objV5[["RNA"]], "Assay")
sceasy::convertFormat(objV5, from="seurat", to="anndata", outFile='objV5.h5ad')
#运行日志
# ... storing 'orig.ident' as categorical
# ... storing 'celltype' as categorical
# ... storing 'metacell_group' as categorical
# ... storing 'var.features' as categorical
# AnnData object with n_obs × n_vars = 7490 × 23700
# obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'percent.mt', 'percent.hb', 'percent.rb', 'RNA_snn_res.0.1', 'RNA_snn_res.0.2', 'RNA_snn_res.0.3', 'RNA_snn_res.0.4', 'RNA_snn_res.0.5', 'RNA_snn_res.0.6', 'RNA_snn_res.0.7', 'RNA_snn_res.0.8', 'RNA_snn_res.0.9', 'RNA_snn_res.1', 'seurat_clusters', 'celltype', 'metacell_group'
# var: 'vf_vst_counts.WT_mean', 'vf_vst_counts.WT_variance', 'vf_vst_counts.WT_variance.expected', 'vf_vst_counts.WT_variance.standardized', 'vf_vst_counts.WT_variable', 'vf_vst_counts.WT_rank', 'vf_vst_counts.GO_mean', 'vf_vst_counts.GO_variance', 'vf_vst_counts.GO_variance.expected', 'vf_vst_counts.GO_variance.standardized', 'vf_vst_counts.GO_variable', 'vf_vst_counts.GO_rank', 'var.features', 'var.features.rank'
# obsm: 'X_pca', 'X_integrated.cca', 'X_umap'
我们将转化后的数据读入python,看看有没有问题!
sce = anndata.read_h5ad('objV5.h5ad')
#plot
sc.pl.umap(sce, color="celltype")
sc.pl.DotPlot(sce, ["Pparg", "Myh11", "Mrc1", "Flt1", "Col11a1", "Mymk", "Pax7", "Pdgfra","Ttn","Sox2"],
log = True, groupby='celltype').style(cmap='PRGn',dot_edge_color='black', dot_edge_lw=1).swap_axes(False).show(True)
image.png
做一下差异基因分析,也是没有毛病:
sc.pp.log1p(sce)
sc.pp.scale(sce)
sc.tl.rank_genes_groups(sce, 'orig.ident', method='wilcoxon')
那么最后,检验一下seurat V5转4之后做分析有没有问题,我选择的是用scMetabolism进行测试,我们知道,scMetabolism是不适用于V5的。但是转化之后,分析这个scMetabolism流程就没有问题了!
library(scMetabolism)
V5_metabolism<-sc.metabolism.Seurat(obj = sce_cca,
method = "AUCell",
imputation = F,
ncores = 2,
metabolism.type = "KEGG")
#运行日志
Error in sc.metabolism.Seurat(obj = sce_cca, method = "AUCell", imputation = F, :
no slot of name "counts" for this object of class "Assay5"
V4_metabolism<-sc.metabolism.Seurat(obj = objV5_trans,
method = "AUCell",
imputation = F,
ncores = 2,
metabolism.type = "KEGG")
#运行日志
# Your choice is: KEGG
# Start quantify the metabolism activity...
# Genes in the gene sets NOT available in the dataset:
# Glycolysis / Gluconeogenesis: 8 (12% of 68)
# Citrate cycle (TCA cycle): 1 (3% of 30)
# Pentose phosphate pathway: 4 (13% of 30)
# Pentose and glucuronate interconversions: 13 (38% of 34)
# Fructose and mannose metabolism: 2 (6% of 33)
# Galactose metabolism: 3 (10% of 31)
# Ascorbate and aldarate metabolism: 13 (48% of 27)
# Starch and sucrose metabolism: 8 (22% of 36)
# Amino sugar and nucleotide sugar metabolism: 2 (4% of 48)
# Pyruvate metabolism: 2 (5% of 39)
# Glyoxylate and dicarboxylate metabolism: 3 (11% of 28)
# Propanoate metabolism: 2 (6% of 32)
# Butanoate metabolism: 5 (18% of 28)
# Inositol phosphate metabolism: 2 (3% of 73)
# Oxidative phosphorylation: 35 (26% of 133)
# Nitrogen metabolism: 1 (6% of 17)
# Fatty acid elongation: 1 (3% of 30)
# Fatty acid degradation: 4 (9% of 44)
# Synthesis and degradation of ketone bodies: 1 (10% of 10)
# Steroid biosynthesis: 2 (11% of 19)
# Primary bile acid biosynthesis: 3 (18% of 17)
# Steroid hormone biosynthesis: 23 (39% of 59)
# Glycerolipid metabolism: 6 (10% of 61)
# Glycerophospholipid metabolism: 8 (8% of 97)
# Ether lipid metabolism: 7 (15% of 47)
# Sphingolipid metabolism: 2 (4% of 47)
# Arachidonic acid metabolism: 16 (26% of 62)
# Linoleic acid metabolism: 11 (38% of 29)
# alpha-Linolenic acid metabolism: 7 (28% of 25)
# Biosynthesis of unsaturated fatty acids: 1 (4% of 23)
# Purine metabolism: 12 (7% of 174)
# Pyrimidine metabolism: 7 (7% of 101)
# Alanine, aspartate and glutamate metabolism: 2 (6% of 35)
# Glycine, serine and threonine metabolism: 3 (8% of 40)
# Cysteine and methionine metabolism: 4 (9% of 45)
# Valine, leucine and isoleucine degradation: 2 (4% of 48)
# Lysine degradation: 1 (2% of 59)
# Arginine biosynthesis: 3 (14% of 21)
# Arginine and proline metabolism: 3 (6% of 50)
# Histidine metabolism: 1 (4% of 23)
# Tyrosine metabolism: 6 (17% of 36)
# Phenylalanine metabolism: 2 (12% of 17)
# Tryptophan metabolism: 4 (10% of 40)
# Phenylalanine, tyrosine and tryptophan biosynthesis: 1 (20% of 5)
# Taurine and hypotaurine metabolism: 1 (9% of 11)
# Selenocompound metabolism: 1 (6% of 17)
# D-Glutamine and D-glutamate metabolism: 1 (20% of 5)
# Glutathione metabolism: 10 (18% of 56)
# N-Glycan biosynthesis: 3 (6% of 49)
# Mucin type O-glycan biosynthesis: 4 (13% of 31)
# Mannose type O-glycan biosynthesis: 2 (9% of 23)
# Glycosaminoglycan biosynthesis - keratan sulfate: 1 (7% of 14)
# Glycosphingolipid biosynthesis - lacto and neolacto series: 5 (19% of 27)
# Other glycan degradation: 1 (6% of 18)
# Thiamine metabolism: 4 (25% of 16)
# Nicotinate and nicotinamide metabolism: 1 (3% of 30)
# Pantothenate and CoA biosynthesis: 1 (5% of 19)
# Folate biosynthesis: 5 (19% of 26)
# One carbon pool by folate: 2 (10% of 20)
# Retinol metabolism: 25 (38% of 66)
# Porphyrin and chlorophyll metabolism: 11 (26% of 42)
# Ubiquinone and other terpenoid-quinone biosynthesis: 1 (9% of 11)
# Caffeine metabolism: 2 (40% of 5)
# Metabolism of xenobiotics by cytochrome P450: 30 (40% of 75)
# Drug metabolism - cytochrome P450: 27 (38% of 71)
# Drug metabolism - other enzymes: 21 (27% of 79)
# Warning messages:
# 1: In .AUCell_buildRankings(exprMat = exprMat, featureType = featureType, :
# nCores is no longer used. It will be deprecated in the next AUCell version.
# 2: useNames = NA is deprecated. Instead, specify either useNames = TRUE or useNames = TRUE.
input.pathway<-c("Glycolysis / Gluconeogenesis", "Oxidative phosphorylation", "Citrate cycle (TCA cycle)")
DotPlot.metabolism(obj = V4_metabolism, pathway = input.pathway, phenotype = "celltype", norm = "y")
image.png
最后感谢github上各路大神的分享,以及讨论探究,觉得分享有用的点个赞再走呗!
网友评论