单细胞seurat V5转V4格式及seurat转为h5ad格式

作者: KS科研分享与服务 | 来源:发表于2024-09-18 14:18 被阅读0次

seurat 格式
SpatialExperiment || 空间数据管理
从Scanpy的Anndata对象提取信息（适用于单细胞转录组且
【单细胞测序数据分析-1】认识Seurat对象数据结构/数据格式
单细胞实战2: Seurat+SingleR--从矩阵到细胞类型
Seurat包之导入单细胞数据方式汇总
2022-06-17 单细胞数据转换
Harmony整合单细胞数
sc-RAN-seq 数据分析||Seurat新版教程:Guid
Converting from Seurat to AnnDat

最近在做一个分析的时候，需要使用到python，而我的单细胞数据是seurat对象，所以需要转化一下，变成h5ad。关于seurat转h5ad的方法，我们在PAGA那一部分进行过总结，还没有看的可以在打包代码PAGA里面找找。那么这次，我依然时使用之前的代码，选择用sceasy进行转化，这个我认为是最简洁的一个方法。可是，这次我用的是V5，报错了！首先我们安装下sceasy，它的使用需要借助python环境，首先我们在终端创建环境，R里面调用，安装sceasy，需要用到的有loompy。


#创建环境（终端）
conda create -y -n sceasy python=3.7
conda activate sceasy
conda install anndata -c bioconda
#R中操作
#安装R包，sceasy可以实现多种数据形式的转化，https://github.com/cellgeni/sceasy
devtools::install_github("cellgeni/sceasy")
library(sceasy)
library(reticulate)
use_condaenv('sceasy')
loompy <- reticulate::import('loompy')

sceasy转化只需一句代码，可是出现了错误，很明显是V5的事！


objV5 <- sce_cca
sceasy::convertFormat(objV5, from="seurat", to="anndata", outFile='objV5.h5ad')
#运行日志报错
Error in ncol(df) : 
  no slot of name "meta.features" for this object of class "Assay5"

那么一种方法就是等待相关的包更新以适用于V5，这其实很坐以待毙，那么可行的方式就是转化降低，还比如别人的V5对象，你想操作，但是不熟悉，那么就需要转化为V4可能更好用一点。V4、V5区别没有达到“天差地别”，不像monocle2和monocle3那样的转折。操作很简单，as一句代码。转化后，sceay的转阿虎操作也没有问题了！


objV5[["RNA"]] <- as(objV5[["RNA"]], "Assay")
sceasy::convertFormat(objV5, from="seurat", to="anndata", outFile='objV5.h5ad')

#运行日志
# ... storing 'orig.ident' as categorical
# ... storing 'celltype' as categorical
# ... storing 'metacell_group' as categorical
# ... storing 'var.features' as categorical
# AnnData object with n_obs × n_vars = 7490 × 23700 
# obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'percent.mt', 'percent.hb', 'percent.rb', 'RNA_snn_res.0.1', 'RNA_snn_res.0.2', 'RNA_snn_res.0.3', 'RNA_snn_res.0.4', 'RNA_snn_res.0.5', 'RNA_snn_res.0.6', 'RNA_snn_res.0.7', 'RNA_snn_res.0.8', 'RNA_snn_res.0.9', 'RNA_snn_res.1', 'seurat_clusters', 'celltype', 'metacell_group'
# var: 'vf_vst_counts.WT_mean', 'vf_vst_counts.WT_variance', 'vf_vst_counts.WT_variance.expected', 'vf_vst_counts.WT_variance.standardized', 'vf_vst_counts.WT_variable', 'vf_vst_counts.WT_rank', 'vf_vst_counts.GO_mean', 'vf_vst_counts.GO_variance', 'vf_vst_counts.GO_variance.expected', 'vf_vst_counts.GO_variance.standardized', 'vf_vst_counts.GO_variable', 'vf_vst_counts.GO_rank', 'var.features', 'var.features.rank'
# obsm: 'X_pca', 'X_integrated.cca', 'X_umap'

我们将转化后的数据读入python，看看有没有问题！

sce = anndata.read_h5ad('objV5.h5ad')

#plot
sc.pl.umap(sce, color="celltype")
sc.pl.DotPlot(sce, ["Pparg", "Myh11", "Mrc1", "Flt1", "Col11a1", "Mymk", "Pax7", "Pdgfra","Ttn","Sox2"], 
              log = True, groupby='celltype').style(cmap='PRGn',dot_edge_color='black', dot_edge_lw=1).swap_axes(False).show(True)

image.png

做一下差异基因分析，也是没有毛病：


sc.pp.log1p(sce)
sc.pp.scale(sce)
sc.tl.rank_genes_groups(sce, 'orig.ident', method='wilcoxon')

那么最后，检验一下seurat V5转4之后做分析有没有问题，我选择的是用scMetabolism进行测试，我们知道，scMetabolism是不适用于V5的。但是转化之后，分析这个scMetabolism流程就没有问题了！


library(scMetabolism)
V5_metabolism<-sc.metabolism.Seurat(obj = sce_cca, 
                                      method = "AUCell", 
                                      imputation = F, 
                                      ncores = 2, 
                                      metabolism.type = "KEGG")


#运行日志
Error in sc.metabolism.Seurat(obj = sce_cca, method = "AUCell", imputation = F,  : 
  no slot of name "counts" for this object of class "Assay5"
V4_metabolism<-sc.metabolism.Seurat(obj = objV5_trans, 
                                    method = "AUCell", 
                                    imputation = F, 
                                    ncores = 2, 
                                    metabolism.type = "KEGG")
#运行日志

# Your choice is: KEGG
# Start quantify the metabolism activity...
# Genes in the gene sets NOT available in the dataset: 
#   Glycolysis / Gluconeogenesis:   8 (12% of 68)
# Citrate cycle (TCA cycle):   1 (3% of 30)
# Pentose phosphate pathway:   4 (13% of 30)
# Pentose and glucuronate interconversions:   13 (38% of 34)
# Fructose and mannose metabolism:   2 (6% of 33)
# Galactose metabolism:   3 (10% of 31)
# Ascorbate and aldarate metabolism:   13 (48% of 27)
# Starch and sucrose metabolism:   8 (22% of 36)
# Amino sugar and nucleotide sugar metabolism:   2 (4% of 48)
# Pyruvate metabolism:   2 (5% of 39)
# Glyoxylate and dicarboxylate metabolism:   3 (11% of 28)
# Propanoate metabolism:   2 (6% of 32)
# Butanoate metabolism:   5 (18% of 28)
# Inositol phosphate metabolism:   2 (3% of 73)
# Oxidative phosphorylation:   35 (26% of 133)
# Nitrogen metabolism:   1 (6% of 17)
# Fatty acid elongation:   1 (3% of 30)
# Fatty acid degradation:   4 (9% of 44)
# Synthesis and degradation of ketone bodies:   1 (10% of 10)
# Steroid biosynthesis:   2 (11% of 19)
# Primary bile acid biosynthesis:   3 (18% of 17)
# Steroid hormone biosynthesis:   23 (39% of 59)
# Glycerolipid metabolism:   6 (10% of 61)
# Glycerophospholipid metabolism:   8 (8% of 97)
# Ether lipid metabolism:   7 (15% of 47)
# Sphingolipid metabolism:   2 (4% of 47)
# Arachidonic acid metabolism:   16 (26% of 62)
# Linoleic acid metabolism:   11 (38% of 29)
# alpha-Linolenic acid metabolism:   7 (28% of 25)
# Biosynthesis of unsaturated fatty acids:   1 (4% of 23)
# Purine metabolism:   12 (7% of 174)
# Pyrimidine metabolism:   7 (7% of 101)
# Alanine, aspartate and glutamate metabolism:   2 (6% of 35)
# Glycine, serine and threonine metabolism:   3 (8% of 40)
# Cysteine and methionine metabolism:   4 (9% of 45)
# Valine, leucine and isoleucine degradation:   2 (4% of 48)
# Lysine degradation:   1 (2% of 59)
# Arginine biosynthesis:   3 (14% of 21)
# Arginine and proline metabolism:   3 (6% of 50)
# Histidine metabolism:   1 (4% of 23)
# Tyrosine metabolism:   6 (17% of 36)
# Phenylalanine metabolism:   2 (12% of 17)
# Tryptophan metabolism:   4 (10% of 40)
# Phenylalanine, tyrosine and tryptophan biosynthesis:   1 (20% of 5)
# Taurine and hypotaurine metabolism:   1 (9% of 11)
# Selenocompound metabolism:   1 (6% of 17)
# D-Glutamine and D-glutamate metabolism:   1 (20% of 5)
# Glutathione metabolism:   10 (18% of 56)
# N-Glycan biosynthesis:   3 (6% of 49)
# Mucin type O-glycan biosynthesis:   4 (13% of 31)
# Mannose type O-glycan biosynthesis:   2 (9% of 23)
# Glycosaminoglycan biosynthesis - keratan sulfate:   1 (7% of 14)
# Glycosphingolipid biosynthesis - lacto and neolacto series:   5 (19% of 27)
# Other glycan degradation:   1 (6% of 18)
# Thiamine metabolism:   4 (25% of 16)
# Nicotinate and nicotinamide metabolism:   1 (3% of 30)
# Pantothenate and CoA biosynthesis:   1 (5% of 19)
# Folate biosynthesis:   5 (19% of 26)
# One carbon pool by folate:   2 (10% of 20)
# Retinol metabolism:   25 (38% of 66)
# Porphyrin and chlorophyll metabolism:   11 (26% of 42)
# Ubiquinone and other terpenoid-quinone biosynthesis:   1 (9% of 11)
# Caffeine metabolism:   2 (40% of 5)
# Metabolism of xenobiotics by cytochrome P450:   30 (40% of 75)
# Drug metabolism - cytochrome P450:   27 (38% of 71)
# Drug metabolism - other enzymes:   21 (27% of 79)
# Warning messages:
#   1: In .AUCell_buildRankings(exprMat = exprMat, featureType = featureType,  :
#                                 nCores is no longer used. It will be deprecated in the next AUCell version.
#                               2: useNames = NA is deprecated. Instead, specify either useNames = TRUE or useNames = TRUE. 


input.pathway<-c("Glycolysis / Gluconeogenesis", "Oxidative phosphorylation", "Citrate cycle (TCA cycle)")
DotPlot.metabolism(obj = V4_metabolism, pathway = input.pathway, phenotype = "celltype", norm = "y")

image.png

最后感谢github上各路大神的分享，以及讨论探究，觉得分享有用的点个赞再走呗！

seurat 格式
SpatialExperiment || 空间数据管理
我们知道单细胞数据的激增和它的高维特征催生了针对单细胞数据的数据格式：Seurat对象、singlecellexp...
从Scanpy的Anndata对象提取信息（适用于单细胞转录组且
为什么要转格式目前使用最为广泛的2个单细胞分析的软件包是基于R语言的Seurat和基于Python的Scanpy。...
【单细胞测序数据分析-1】认识Seurat对象数据结构/数据格式
一文了解单细胞对象数据结构/数据格式，单细胞数据操作不迷茫。本文内容包括单细胞seurat对象数据结构，内容构...
单细胞实战2: Seurat+SingleR--从矩阵到细胞类型
单细胞数据标准分析Seurat流程： Seurat官网：https://satijalab.org/seurat/...
Seurat包之导入单细胞数据方式汇总
挖掘公共单细胞数据集时，会遇到常见各种单细胞测序数据格式。现总结如下，方便自己日后调用，以创建Seurat对象（1...
2022-06-17 单细胞数据转换
Seurat 格式转换成SingleCellExperimentlibrary(sceasy)sceasy::co...
Harmony整合单细胞数
Seurat结合Harmony的整合流程参考：单细胞测序分析: Seurat V3联合harmony进行单细胞数...
sc-RAN-seq 数据分析||Seurat新版教程:Guid
Seurat - Guided Clustering Tutorial Seurat，一个单细胞数据分析工具箱十...
Converting from Seurat to AnnDat
从rds格式到h5格式https://mojaveazure.github.io/seurat-disk/arti...