Preface

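In an earlier post, "Sincast: using bulk RNA-seq to define single-cell populations", Immugent gave an overall introduction to what Sincast can do. Starting with this post, Immugent walks through hands-on code to show how to use Sincast to analyze your own data.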

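All data used in this post are the example data from the paper. To reproduce the results exactly, download the corresponding files from the authors' GitHub; you can also run the analysis on your own data. In Immugent's experience the two behave much the same, and the imputation results are indeed good.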

Enough talk, let's go straight to the code~~

Hands-on code

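Installing the package is fairly easy. If you hit an error, the network is the likely cause, so try again at a time when your connection is more stable.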

devtools::install_github('meiosis97/Sincast@main',subdir = 'pkg')
library(Sincast)
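
Since failed installs are usually a network issue, one option is to wrap the install in a small retry helper. The sketch below is only an illustration: `install_if_missing` is a hypothetical helper written for this post, not part of Sincast or devtools, and the number of attempts is an arbitrary choice.

```r
# Hypothetical helper (not part of Sincast or devtools): run `installer` only
# when `pkg` is not yet available, retrying a few times if it fails.
install_if_missing <- function(pkg, installer, attempts = 3) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already installed, nothing to do
  }
  for (i in seq_len(attempts)) {
    ok <- tryCatch({
      installer()
      TRUE
    }, error = function(e) {
      message("Attempt ", i, " failed: ", conditionMessage(e))
      FALSE
    })
    if (ok && requireNamespace(pkg, quietly = TRUE)) {
      return(invisible(TRUE))  # installed during this call
    }
  }
  stop("Could not install ", pkg, " after ", attempts, " attempts")
}

# Usage (commented out because it needs network access):
# install_if_missing("Sincast", function() {
#   devtools::install_github('meiosis97/Sincast@main', subdir = 'pkg')
# })
```

The helper returns `FALSE` when the package was already present and `TRUE` when it had to install it, so repeated runs of the script skip the download.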

#load the query data
query.annotation <- read.table('GSE133345_Annotations_of_all_1231_embryonic_cells_updated_0620.txt')
query.data <- read.table('GSE133345_Quality_controled_UMI_data_of_all_1231_embryonic_cells.txt')

#load the reference data
reference.data <- read.table('RajabRankedExpressionMatrix.txt', check.names = F)
reference.annotation <- read.table('RajabSampleAnnotation.txt')

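#Reference: sample names in the data must match the annotation rows, in the same order
all(colnames(reference.data) == rownames(reference.annotation))

#Query: same check for the query cells
all(colnames(query.data) == rownames(query.annotation))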
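
If either check returns FALSE on your own data, the annotation rows are often just in a different order from the data columns. A minimal base-R sketch of how one might realign them, using made-up toy objects (`toy.data` and `toy.annotation` are illustrative names, not files from the paper):

```r
# Toy stand-ins for an expression matrix (genes x cells) and its annotation.
toy.data <- data.frame(cellA = c(1, 0), cellB = c(3, 2), cellC = c(0, 5),
                       row.names = c("gene1", "gene2"))
toy.annotation <- data.frame(cluster = c("c3", "c1", "c2"),
                             row.names = c("cellC", "cellA", "cellB"))

# Same cells, different order, so the strict check fails.
all(colnames(toy.data) == rownames(toy.annotation))  # FALSE

# Confirm both tables describe the same set of cells before reordering.
stopifnot(setequal(colnames(toy.data), rownames(toy.annotation)))

# Reorder the annotation rows to follow the column order of the data.
toy.annotation <- toy.annotation[colnames(toy.data), , drop = FALSE]

all(colnames(toy.data) == rownames(toy.annotation))  # TRUE
```

The `setequal` guard matters: indexing by name with cells that are missing from the annotation would silently produce rows of `NA` instead of an error.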

reference <- createSce(data = reference.data, colData = reference.annotation, as.sparse = FALSE) 
query <- createSce(counts = query.data, colData = query.annotation)

Preprocess your data

query <- rcTransform(query)
reference <- featureWeighting(reference, clusterid = 'celltype')
c(reference, query) %<-% filterData(reference, query)

Build the atlas

referenceColors <- c("#081d58","#225ea8","#1d91c0","#253494","#7fcdbb","#c7e9b4","#edf8b1","#41b6c4","#7a0177","#ae017e","#49006a","#dd3497","#5e2f0d","#5e2f0d","#8b4513","#8b4513","#8b4513","#fcc5c0","#fa9fb5","#f768a1","#fdbe85","#fd8d3c","#d94701")
names(referenceColors) <-  c("kupffer cell","microglia","macrophage","monocyte","CD141+ dendritic cell","CD1c+ dendritic cell","plasmacytoid dendritic cell","dendritic cell","common myeloid progenitor","common lymphoid progenitor","granulocyte monocyte progenitor","hematopoietic multipotent progenitor","neutrophil","granulocyte","myelocyte","metamyelocyte","promyelocyte","erythrocyte","erythroblast","proerythroblast","endothelial progenitor","hemogenic endothelium","hemangioblast")
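
When you swap in your own reference, it can be worth checking that the named color vector covers every cell type before plotting. The sketch below uses toy values (`toy.celltypes` and `toy.colors` are made up for illustration); how `makeAtlas` itself reacts to a missing color is not something this post covers.

```r
# Toy cell-type labels and a named color vector that misses one of them.
toy.celltypes <- c("monocyte", "macrophage", "microglia", "monocyte")
toy.colors <- c("macrophage" = "#1d91c0", "monocyte" = "#253494")

# Cell types that have no color assigned.
missing.types <- setdiff(unique(toy.celltypes), names(toy.colors))

# Colors whose names match no cell type (often a sign of a typo).
unused.colors <- setdiff(names(toy.colors), unique(toy.celltypes))

print(missing.types)
print(unused.colors)

# On the real objects the analogous check would be (assuming the annotation
# has a 'celltype' column, as the col.by argument in this post suggests):
# setdiff(unique(reference.annotation$celltype), names(referenceColors))
```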

reference <- makeAtlas(reference = reference, col.by = 'celltype', colors = referenceColors,vis.atlas = T)


query <- sincastImp(query, col.by = 'cluster')
query <- postScale(query)


Projection

query <- project(reference, query)
visProjection(reference, query, colReference.by = 'celltype', referenceColors = referenceColors, colQuery.by = 'cluster')


Capybara prediction

query <- SincastCapybara(reference, query, clusterid = 'celltype', w = 'HD_mean')

visProjection(reference, query, colReference.by = 'celltype', referenceColors = referenceColors, colQuery.by = 'Cb_macrophage')

CapybaraHeatmap(query)


Closing remarks

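The package works well, but its shortcomings deserve a mention too. First, the plots it produces are not pretty: the figure proportions and display scaling are poorly balanced, so the most significant result does not stand out at a glance. This also shows that building a high-quality bioinformatics tool takes not only strong statistics and data knowledge but also solid visualization skills. In addition, Sincast is easy to use because many of its features are each implemented by a single function call; the downside is that you cannot easily tune parameters or control the individual steps.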

That's it for this post. In the next one, Immugent will cover some of the other features of the Sincast package, so stay tuned~~