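Single-Cell Data Mining in Practice: Reproducing a Paper (2) Batch-Creating Seurat Objects and Quality Control
In the previous step we annotated the cells and found many Mo/MΦ cells in the tumor samples. A figure shows this more directly than a table, so here we try to reproduce Fig. 1d from the paper.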
1. Load the R packages
# Install any package that is missing, then attach it
pkgs <- c("BiocManager", "Seurat", "Matrix", "ggplot2", "cowplot",
          "magrittr", "dplyr", "purrr", "ggrepel", "ggpubr")
for (pkg in pkgs) {
  if (!requireNamespace(pkg, quietly = TRUE)) install.packages(pkg)
  library(pkg, character.only = TRUE)
}
2. Read in the data
# List of Seurat objects, one per sample, saved in the previous step
sex_condition_objects <- readRDS("sex_condition_objects.RDS")
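3. Collect the cell annotations into a spreadsheet and read it in
The previous step produced annotation results for four samples. Collect them into a single spreadsheet and save it as a CSV file, since the code below reads it with read.csv. Part of the table is shown in the screenshot.
[Screenshot: 1.png]
Note that the values in the cluster and cell_type columns must follow the naming rule shown in the screenshot, otherwise the code below will throw errors. Alternatively, adapt the code below to your own naming.
cell_types <- read.csv("./anno_cell/cell_type_index.csv", header = TRUE)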
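Because the later steps depend on the column names and on the cluster naming rule, a quick sanity check right after reading the file may save a confusing error downstream. The sketch below is only an illustration: the helper name `check_cell_types` and the toy rows are made up, and the `<sex>_<condition>_<cluster>` pattern (with conditions `ctrl` and `tumor`) is an assumption inferred from how `full_cluster_id` is built in the next section.

```r
# Hypothetical helper: verify that the annotation table has the columns
# and the ID format that the later steps rely on.
check_cell_types <- function(tbl) {
  stopifnot(all(c("cluster", "cell_type") %in% colnames(tbl)))
  # Assumed pattern: sex letter, condition, numeric cluster, joined by "_".
  ok <- grepl("^[FM]_(ctrl|tumor)_[0-9]+$", tbl$cluster)
  if (!all(ok)) {
    warning("Unexpected cluster IDs: ", paste(tbl$cluster[!ok], collapse = ", "))
  }
  # Duplicated IDs would make match() silently use the first matching row.
  dup <- anyDuplicated(tbl$cluster) > 0
  if (dup) warning("Duplicated cluster IDs found")
  invisible(all(ok) && !dup)
}

# Toy table standing in for cell_type_index.csv (made-up rows).
toy <- data.frame(
  cluster   = c("F_ctrl_0", "F_ctrl_1", "M_tumor_3"),
  cell_type = c("micro", "BAM", "macro"),
  stringsAsFactors = FALSE
)
check_cell_types(toy)
```

With the real data, the call would be `check_cell_types(cell_types)`; adjust the regular expression if your IDs follow a different rule.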
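4. Add the cell types to sex_condition_objects
sex_condition_objects <- lapply(sex_condition_objects, function(x) {
  # Build an ID of the form <sex>_<condition>_<cluster>. Character 12 of shortID
  # is taken as the sex letter (e.g. the F in GSM4039241-F-ctrl, assuming shortID has that form).
  x$full_cluster_id <- paste(substring(x$shortID, 12, 12), x$condition, Idents(x), sep = "_")
  # Look up each cell's type in the annotation table; unmatched IDs become NA
  x$cell_type <- cell_types[match(x$full_cluster_id, cell_types$cluster), "cell_type"]
  x$cell_type <- factor(x$cell_type, levels = c("micro", "pre-micro", "macro", "BAM", "NKT", "NK", "B-cells", "T-cells", "Ncam1+", "DC", "other"))
  # Coarser grouping of the myeloid populations; all other cells keep ""
  x$cell_type_selection <- ""
  x$cell_type_selection[x$cell_type %in% c("micro", "pre-micro")] <- "Microglia"
  x$cell_type_selection[x$cell_type == "macro"] <- "Macrophages"
  x$cell_type_selection[x$cell_type == "BAM"] <- "BAM"
  x
})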
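To see what the ID construction and the `match()` lookup do, here is a minimal sketch on plain vectors instead of a Seurat object. All values are made up for illustration, and the sample ID format is assumed to match the sample names used later in the plots (for example `GSM4039243-F-tumor`).

```r
# Toy per-cell metadata standing in for one Seurat object (made-up values).
short_id  <- rep("GSM4039243-F-tumor", 4)
condition <- rep("tumor", 4)
cluster   <- c("0", "0", "2", "5")   # what Idents(x) would hold

# Character 12 of the sample ID is the sex letter in this assumed format.
full_id <- paste(substring(short_id, 12, 12), condition, cluster, sep = "_")

# Made-up lookup table with the same columns as cell_type_index.csv.
lookup <- data.frame(
  cluster   = c("F_tumor_0", "F_tumor_2"),
  cell_type = c("micro", "macro"),
  stringsAsFactors = FALSE
)

# match() gives the row of each ID in the lookup table, or NA when absent.
cell_type <- lookup[match(full_id, lookup$cluster), "cell_type"]
print(data.frame(full_id, cell_type))

# Clusters missing from the table leave their cells as NA; list them here.
unmatched <- unique(full_id[is.na(cell_type)])
print(unmatched)
```

Checking for unmatched IDs like this before plotting is a cheap way to catch a cluster that was forgotten in the annotation table.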
5. Draw the plots
# Figure 1d (pie charts)
# Define the colour for each cell type
micro<-"#53AFE6"
pre_micro<-"#2DA7C8"
BAM<- "#0DD1AD"
UN<-"grey"
Mo<-"#FCE80C"
Mo_Mg<-"#FABF00"
Mg<-"#E98934"
NK<-"#8c42a3"
ncam<-"#C2B4FC"
NKT<-"#DFA5F2"
DC<-"#bf7a58"
Tcells<-"#94112f"
Bcells<-"#EC5CA5"
# For each sample, count the cells of each type and convert the counts to proportions
freq_list <- lapply(sex_condition_objects, function(x) {
  freq <- data.frame(cell_type = x$cell_type)
  freq <- freq %>%
    group_by(cell_type) %>%
    count() %>%
    ungroup() %>%
    mutate(per = n / sum(n))
  freq$cell_type <- factor(freq$cell_type, levels = c("micro", "pre-micro", "macro", "BAM", "NKT", "NK", "B-cells", "T-cells", "Ncam1+", "DC", "other"))
  # Percentage labels for the pie slices
  freq$label <- scales::percent(freq$per)
  freq
})
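As a cross-check of the proportions, the same table can be computed with base R only. This is a sketch on made-up labels, not the code used for the figure; the label format from `sprintf` may also differ slightly from what `scales::percent` prints.

```r
# Made-up cell-type labels for one sample.
cell_type <- factor(
  c("micro", "micro", "micro", "macro", "BAM"),
  levels = c("micro", "pre-micro", "macro", "BAM")
)

counts <- table(cell_type)       # includes zero counts for unused levels
counts <- counts[counts > 0]     # drop empty levels, as group_by() does by default
per    <- as.numeric(prop.table(counts))

freq <- data.frame(
  cell_type = factor(names(counts), levels = levels(cell_type)),
  n         = as.integer(counts),
  per       = per,
  label     = sprintf("%.0f%%", 100 * per),
  stringsAsFactors = FALSE
)
print(freq)

# The proportions of one sample should always sum to 1.
stopifnot(isTRUE(all.equal(sum(freq$per), 1)))
```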
# One pie chart per sample. The colours in scale_fill_manual() are matched by position
# to the cell types present in that sample, so their order must follow the factor levels.
# Female control
cf <- ggplot(freq_list$`GSM4039241-F-ctrl`,
             aes(x = "", y = per, fill = cell_type)) +
  geom_bar(stat = "identity", width = 1, color = "white") +
  coord_polar("y", start = 0) +
  scale_fill_manual(values = c(micro, pre_micro, BAM)) +
  theme_light() +
  geom_label_repel(aes(label = label), size = 3, show.legend = FALSE, nudge_x = 1)
# Male control
cm <- ggplot(freq_list$`GSM4039245-M-ctrl`,
             aes(x = "", y = per, fill = cell_type)) +
  geom_bar(stat = "identity", width = 1, color = "white") +
  coord_polar("y", start = 0) +
  scale_fill_manual(values = c(micro, pre_micro, BAM, NK, DC, UN)) +
  theme_light() +
  geom_label_repel(aes(label = label), size = 3, show.legend = FALSE, nudge_x = 1)
# Female tumor
tf <- ggplot(freq_list$`GSM4039243-F-tumor`,
             aes(x = "", y = per, fill = cell_type)) +
  geom_bar(stat = "identity", width = 1, color = "white") +
  coord_polar("y", start = 0) +
  scale_fill_manual(values = c(micro, Mo_Mg, BAM, NKT, NK, Bcells, Tcells, ncam, DC, UN)) +
  theme_light() +
  geom_label_repel(aes(label = label), size = 3, show.legend = FALSE, nudge_x = 1)
# Male tumor
tm <- ggplot(freq_list$`GSM4039247-M-tumor`,
             aes(x = "", y = per, fill = cell_type)) +
  geom_bar(stat = "identity", width = 1, color = "white") +
  coord_polar("y", start = 0) +
  scale_fill_manual(values = c(micro, Mo_Mg, BAM, NKT, Bcells, Tcells, DC, UN)) +
  theme_light() +
  geom_label_repel(aes(label = label), size = 3, show.legend = FALSE, nudge_x = 1)
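# Arrange the four pies in one row and save them to a PDF
pdf(file = "pie.pdf", width = 20, height = 10)
# print() makes sure the figure is drawn even when the script is run via source()
print(ggarrange(cf, cm, tf, tm, ncol = 4))
dev.off()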
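The four plots above pass colours by position, which works only as long as the colour order matches the cell types present in each sample. One possible alternative, sketched here under assumptions, is a named palette, so each colour stays tied to its cell type regardless of which types a sample contains. The names `cell_colors` and `make_pie` are hypothetical, the colour codes are copied from the definitions above (with `macro` given the `Mo_Mg` colour, as in the tumor plots), and the repelled labels are left out so that the sketch needs only ggplot2.

```r
library(ggplot2)

# Named palette: the names are the factor levels of cell_type.
cell_colors <- c(
  "micro" = "#53AFE6", "pre-micro" = "#2DA7C8", "macro" = "#FABF00",
  "BAM" = "#0DD1AD", "NKT" = "#DFA5F2", "NK" = "#8c42a3",
  "B-cells" = "#EC5CA5", "T-cells" = "#94112f", "Ncam1+" = "#C2B4FC",
  "DC" = "#bf7a58", "other" = "grey"
)

# Hypothetical helper: one pie chart from a frequency table that has
# the columns cell_type and per.
make_pie <- function(freq, title = NULL) {
  ggplot(freq, aes(x = "", y = per, fill = cell_type)) +
    geom_bar(stat = "identity", width = 1, color = "white") +
    coord_polar("y", start = 0) +
    scale_fill_manual(values = cell_colors) +  # matched by name, not by position
    theme_light() +
    ggtitle(title)
}

# Toy frequency table (made-up proportions).
toy_freq <- data.frame(
  cell_type = factor(c("micro", "macro", "BAM"), levels = names(cell_colors)),
  per       = c(0.6, 0.2, 0.2)
)
p <- make_pie(toy_freq, "toy sample")
```

With the real data, something like `Map(make_pie, freq_list, names(freq_list))` would build all four pies in one call, using the sample names as titles.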
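Now compare the result with the figure in the paper.
[Figures: 2.png, 3.png]
The proportion of each cell type is broadly consistent with the paper. In the tumor samples, microglia (MG) are still the most abundant population, but their share drops and many other cell types appear, which reflects the heterogeneity of the tumor.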