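I found a dataset I was interested in. I don't know much about single-cell sequencing, and the raw files provided didn't look like the ones in the tutorials, so after a lot of head-scratching I emailed the author to ask whether they could share the output of cellranger.
Yay! It worked. Thank you, Teacher Ma!
They sent me the data as an h5ad file.
From what I've found so far, there are two main ways to process single-cell data: the R package Seurat and the Python package scanpy. The paper used scanpy, but I see Seurat used more often, probably because working in R is more convenient. I need to look into the differences between the two later.
This time I'll use Seurat.
First, convert the data into a format the R package can read, then load it.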
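# Install SeuratDisk from GitHub (only needed once)
remotes::install_github("mojaveazure/seurat-disk")
library(SeuratDisk)
library(patchwork)
# Convert the h5ad file to h5Seurat; the output keeps the input name with the extension replaced
Convert('GCdata.adata.h5ad', "h5seurat",
overwrite = TRUE, assay = "RNA")
scRNA <- LoadH5Seurat("./GCdata.adata.h5seurat")
scRNA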
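scRNA data comes in several formats, including standard 10x, h5 and h5ad. For reference, see https://www.jianshu.com/p/97de1f9b7cca
Whatever format the raw data is in, what you end up with is a sparse matrix with gene names as rows and barcodes as columns.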
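To make the "genes as rows, barcodes as columns" layout concrete, here is a minimal sketch using the Matrix package. The gene names, barcodes and counts are all made up for illustration and have nothing to do with the real dataset.

```r
library(Matrix)  # ships with standard R installations

# Made-up toy counts: 4 genes x 3 cells (all names and values are illustrative)
counts <- sparseMatrix(
  i = c(1, 3, 2, 4, 1),   # row (gene) index of each non-zero entry
  j = c(1, 1, 2, 3, 3),   # column (barcode) index of each non-zero entry
  x = c(5, 2, 1, 7, 3),   # the counts themselves
  dims = c(4, 3),
  dimnames = list(
    c("GeneA", "GeneB", "GeneC", "GeneD"),
    c("AAAC-1", "AAAG-1", "AAAT-1")
  )
)

dim(counts)      # genes x cells
nnzero(counts)   # number of non-zero entries actually stored
counts           # zeros are printed as "."
```

Only the non-zero entries are stored, which is why this format suits single-cell data, where most gene/cell combinations have a count of zero.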
> scRNA
An object of class Seurat
31053 features across 15402 samples within 1 assay
Active assay: RNA (31053 features, 0 variable features)
This dataset has 31053 genes and 15402 cells.
## Normalizing the data
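library(Seurat)
# "LogNormalize" and scale.factor = 10000 are the defaults, written out explicitly here;
# a second NormalizeData(scRNA) call would just repeat the same step, so one call is enough
scRNA <- NormalizeData(scRNA, normalization.method = "LogNormalize",
scale.factor = 10000)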
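As I understand it, "LogNormalize" divides each gene's count by the total counts of its cell, multiplies by scale.factor, and then takes the natural log of one plus that value. Below is a minimal base-R sketch of that arithmetic on a made-up toy matrix. The helper name log_normalize is hypothetical and is not part of Seurat; it is only there to show the calculation.

```r
# Made-up toy counts: rows are genes, columns are cells
counts <- matrix(
  c(1, 0, 3,
    4, 2, 0,
    5, 8, 7),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB", "GeneC"), c("cell1", "cell2", "cell3"))
)

# Hypothetical helper illustrating the arithmetic (not a Seurat function)
log_normalize <- function(counts, scale_factor = 10000) {
  totals <- colSums(counts)                 # total counts per cell
  scaled <- sweep(counts, 2, totals, "/")   # each count as a fraction of its cell's total
  log1p(scaled * scale_factor)              # natural log of (1 + scaled value)
}

norm <- log_normalize(counts)
round(norm, 2)
```

Zero counts stay at zero after this transformation, so the matrix remains sparse.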
## Identify the 2000 most highly variable genes
scRNA <- FindVariableFeatures(scRNA, selection.method = "vst", nfeatures = 2000)
## In addition we scale the data
all.genes <- rownames(scRNA)
scRNA <- ScaleData(scRNA, features = all.genes)
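Scaling here means that each gene is shifted and rescaled so that, across all cells, its mean is 0 and its standard deviation is 1. The sketch below shows that idea with base R on made-up values. It is only an illustration of per-gene z-scoring, not what ScaleData does internally (as far as I know, ScaleData also caps extreme values through its scale.max argument, which is left out here).

```r
# Made-up normalized values: rows are genes, columns are cells
norm <- matrix(
  c(0.0, 1.0, 2.0,
    3.0, 3.5, 4.0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB"), c("cell1", "cell2", "cell3"))
)

# scale() works column by column, so transpose to scale each gene, then transpose back
scaled <- t(scale(t(norm)))

rowMeans(scaled)       # each gene is now centered at 0
apply(scaled, 1, sd)   # each gene now has standard deviation 1
```

After this step, highly expressed genes no longer dominate simply because their values are larger, which matters for the PCA that follows.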
scRNA <- RunPCA(scRNA, features = VariableFeatures(object = scRNA),
verbose = FALSE)
scRNA <- FindNeighbors(scRNA, dims = 1:10, verbose = FALSE)
scRNA <- FindClusters(scRNA, resolution = 0.5, verbose = FALSE)
scRNA <- RunUMAP(scRNA, dims = 1:10, umap.method = "uwot", metric = "cosine")
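# Number of cells in each cluster
table(scRNA$seurat_clusters)
# Save the per-cell metadata, which includes the cluster assignments
phe <- scRNA@meta.data
save(phe, file = 'phe-by-basic-seurat.Rdata')
DimPlot(scRNA, reduction = "umap", pt.size = 1, label = TRUE, repel = TRUE)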
I don't know what it shows yet, but this is my first single-cell plot.
[Figure: UMAP plot from DimPlot, colored by Seurat cluster (image.png)]