Cell Type Annotation: scCATCH

Author: jjjscuedu | Published: 2021-11-26 07:53

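scCATCH stands for single cell Cluster-based Annotation Toolkit for Cellular Heterogeneity. It is a tool for annotating the clusters obtained from single-cell transcriptome data. The core function is scCATCH; findmarkergenes is a helper that finds the marker genes of each cluster. scCATCH belongs to the family of marker gene based cell type annotation tools. Its drawback is that it currently supports only human and mouse; the built-in database has no entries for other species.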

====Installation====

devtools::install_github("ZJUFanLab/scCATCH")
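Before installing, it can help to check which prerequisites are already present. The sketch below is only an illustration: `missing_packages` is a hypothetical helper, and the package list is an assumption (devtools for `install_github`, Seurat for the workflow that follows, and hdf5r, which `Read10X_h5` needs in order to read .h5 files).

```r
# Hypothetical helper: return the requested packages that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Assumed prerequisites for this walkthrough.
needed <- c("devtools", "Seurat", "hdf5r")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Install first: ", paste(to_install, collapse = ", "))
} else {
  message("All prerequisites are available.")
}
```

Anything it reports can be installed with `install.packages()` before running the `install_github` call above.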

====Running: a first try with PBMC data====

Test data download: http://cf.10xgenomics.com/samples/cell-exp/3.0.2/5k_pbmc_v3/5k_pbmc_v3_filtered_feature_bc_matrix.h5

library(Seurat)

library(scCATCH)

h5_file <- "5k_pbmc_v3_filtered_feature_bc_matrix.h5"

# Load the PBMC dataset

#pbmc.data <- Read10X(data.dir = "../data/pbmc3k/filtered_gene_bc_matrices/hg19/")

pbmc.data <- Read10X_h5(h5_file)  # this part is the same as in a regular Seurat run

# Initialize the Seurat object with the raw (non-normalized data).

pbmc <- CreateSeuratObject(counts = pbmc.data, project = "pbmc3k", min.cells = 3, min.features = 200)

pbmc[["percent.mt"]] <- PercentageFeatureSet(pbmc, pattern = "^MT-")
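To make the `percent.mt` column concrete, here is a minimal base R sketch of the same idea: the share of each cell's counts that comes from genes whose names start with "MT-". The count matrix and its values are made up for illustration, and this is not Seurat's internal code.

```r
# Toy count matrix: rows are genes, columns are cells (values are invented).
counts <- matrix(
  c(10,  0,  5,
    30, 20,  5,
    40, 60, 70,
    20, 20, 20),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("MT-CO1", "MT-ND1", "CD3D", "LYZ"),
                  c("cell1", "cell2", "cell3"))
)

# Mitochondrial genes are picked by the same pattern used above.
is_mt <- grepl("^MT-", rownames(counts))

# Percentage of each cell's total counts that is mitochondrial.
percent_mt <- 100 * colSums(counts[is_mt, , drop = FALSE]) / colSums(counts)

# Cells that would pass a percent.mt < 20 cutoff.
keep <- percent_mt < 20
print(percent_mt)
print(names(keep)[keep])
```

Cells with a high mitochondrial fraction are often damaged or dying, which is why this value is used as a filter in the `subset` call that follows.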

VlnPlot(pbmc, features = c("nFeature_RNA", "nCount_RNA", "percent.mt"), ncol = 3)

pbmc <- subset(pbmc, subset = nFeature_RNA > 500 & nFeature_RNA < 5000 & percent.mt < 20)

pbmc <- NormalizeData(pbmc, normalization.method = "LogNormalize", scale.factor = 10000)
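As a rough illustration of what "LogNormalize" with `scale.factor = 10000` does, the base R sketch below divides each cell's counts by that cell's total, multiplies by the scale factor, and applies `log1p`. The matrix is a toy example and `log_normalize` is a hypothetical helper name, not a Seurat function.

```r
# Toy count matrix: rows are genes, columns are cells (values are invented).
counts <- matrix(
  c(1, 3, 0,
    4, 9, 1,
    0, 0, 10),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2", "cell3"))
)

# Hypothetical helper mirroring the LogNormalize idea:
# per-cell proportion * scale factor, then natural log of (1 + x).
log_normalize <- function(counts, scale_factor = 10000) {
  scaled <- sweep(counts, 2, colSums(counts), "/") * scale_factor
  log1p(scaled)
}

norm <- log_normalize(counts)
print(round(norm, 3))
```

After this step every cell has the same total on the un-logged scale, so differences in sequencing depth no longer dominate the comparison between cells.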

pbmc <- FindVariableFeatures(pbmc, selection.method = "vst", nfeatures = 2000)

all.genes <- rownames(pbmc)

pbmc <- ScaleData(pbmc, features = all.genes)

pbmc <- RunPCA(pbmc, features = VariableFeatures(object = pbmc))

ElbowPlot(pbmc)

pbmc <- FindNeighbors(pbmc, dims = 1:10)

pbmc <- FindClusters(pbmc, resolution = 0.2)

pbmc <- RunUMAP(pbmc, dims = 1:10)

DimPlot(pbmc, label = TRUE)

# Everything above is simply the standard Seurat workflow

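Next, use findmarkergenes to find the differentially expressed genes of each cluster. This step takes a long time, because every cluster has to be compared with each of the other clusters one by one before the genes specific to the current cluster can be determined. (In my view this is much the same as how Seurat or MAST identify cluster-specific differential genes.)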
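The one-by-one comparison can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not scCATCH's actual implementation: `is_marker` is a hypothetical helper, the expression values are simulated and treated as already log-scaled, and a Wilcoxon rank-sum test stands in for whatever test the package applies. The `logfc` and `pvalue` defaults simply echo the thresholds used in this post.

```r
set.seed(1)

# Simulated log-scale expression of one gene in 60 cells from 3 clusters.
# The gene is high only in cluster "1".
clusters <- rep(c("0", "1", "2"), each = 20)
expr <- c(rnorm(20, mean = 0.5, sd = 0.2),
          rnorm(20, mean = 3.0, sd = 0.2),
          rnorm(20, mean = 0.5, sd = 0.2))

# Hypothetical helper: a gene counts as a marker of `target` only if it is
# higher in `target` than in EVERY other cluster, compared pair by pair.
is_marker <- function(expr, clusters, target, logfc = 0.25, pvalue = 0.05) {
  others <- setdiff(unique(clusters), target)
  passes <- vapply(others, function(other) {
    x <- expr[clusters == target]
    y <- expr[clusters == other]
    fold <- mean(x) - mean(y)            # difference of log-scale means
    p <- wilcox.test(x, y)$p.value       # two-sided rank-sum test
    fold >= logfc && p < pvalue
  }, logical(1))
  all(passes)
}

print(is_marker(expr, clusters, target = "1"))
print(is_marker(expr, clusters, target = "0"))
```

Because each cluster is tested against every other cluster and every gene is tested separately, the number of comparisons grows quickly, which is consistent with the long running time noted above.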

# Note: these calls follow the scCATCH interface used in this post; later releases may use different function names.

clu_markers <- findmarkergenes(pbmc, species = "Human", cluster = "All", match_CellMatch = FALSE, cancer = NULL, tissue = NULL, cell_min_pct = 0.25, logfc = 0.25, pvalue = 0.05)

clu_ann <- scCATCH(clu_markers$clu_markers, species = "Human", cancer = NULL, tissue = "Blood")

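As you can see, the method simply picks out the marker genes of each cluster, compares them against the cell type annotations in the database, and reports a score.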
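The matching idea can be sketched with a toy example. Everything here is an assumption made for illustration: the three-entry `reference` list stands in for the real database, the `cluster_markers` lists are invented, and the overlap fraction is a deliberately simple score that does not reproduce scCATCH's own scoring.

```r
# Toy reference: cell type -> known marker genes (stand-in for the database).
reference <- list(
  "T cell"   = c("CD3D", "CD3E", "IL7R"),
  "B cell"   = c("MS4A1", "CD79A", "CD79B"),
  "Monocyte" = c("CD14", "LYZ", "S100A8")
)

# Invented marker genes per cluster.
cluster_markers <- list(
  "0" = c("CD3D", "IL7R", "LTB"),
  "1" = c("CD14", "LYZ", "S100A8", "FCN1"),
  "2" = c("MS4A1", "CD79A")
)

# Score = fraction of a cell type's reference markers found in the cluster.
score_cluster <- function(markers, reference) {
  vapply(reference,
         function(ref) length(intersect(markers, ref)) / length(ref),
         numeric(1))
}

# Pick the best-scoring cell type for each cluster.
annotate_clusters <- function(cluster_markers, reference) {
  rows <- lapply(names(cluster_markers), function(cl) {
    s <- score_cluster(cluster_markers[[cl]], reference)
    data.frame(cluster = cl,
               cell_type = names(s)[which.max(s)],
               score = unname(max(s)),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

toy_ann <- annotate_clusters(cluster_markers, reference)
print(toy_ann)
```

The resulting table has one row per cluster with a label and a score, which is the shape the following steps rely on when they read the `cluster` and `cell_type` columns.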

Then simply attach the corresponding cell type labels to the clusters.

new.cluster.ids <- clu_ann$cell_type

names(new.cluster.ids) <- clu_ann$cluster

pbmc <- RenameIdents(pbmc, new.cluster.ids)

DimPlot(pbmc, reduction = "umap", label = TRUE, pt.size = 0.5) + NoLegend()
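The relabeling above relies on a named vector whose names are cluster ids and whose values are cell types. A minimal base R sketch of that lookup, using an invented annotation table and invented per-cell cluster assignments, looks like this:

```r
# Invented annotation table with the same two columns used above.
toy_ann <- data.frame(
  cluster   = c("0", "1", "2"),
  cell_type = c("T Cell", "Monocyte", "B Cell"),
  stringsAsFactors = FALSE
)

# Named vector: names are the old cluster ids, values are the new labels.
new_ids <- toy_ann$cell_type
names(new_ids) <- toy_ann$cluster

# Invented cluster assignment for four cells.
cell_clusters <- c("2", "0", "0", "1")

# Indexing by name translates each cluster id into its label.
cell_labels <- unname(new_ids[cell_clusters])
print(cell_labels)
```

If two clusters receive the same cell type, they end up sharing one label, so it is worth checking the annotation table for duplicates before renaming.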
