Original link
9. DEG (Differentially expressed genes) - 简书
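The earlier DEG analysis started from the annotated cell types and compared conditions within the same cell type to find differentially expressed genes. DA (differential abundance) analysis also starts from the annotated cell types, but asks whether the number of cells of the same cell type differs significantly between conditions.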
- test for significant changes in per-label cell abundance across conditions;
- reveal which cell types are depleted or enriched upon treatment.
1. Preparing the "count matrix"
- quantify the number of cells assigned to each label (or cluster).
- identify labels that change in abundance among the compartment of injected cells compared to the background.
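# merged is a SingleCellExperiment, so load the package that provides its methods
library(SingleCellExperiment)
load("merged.Rdata")
merged
abundances <- table(merged$celltype.mapped, merged$sample)
# number of cells for each of the 34 cell types in each of the six samples
abundances <- unclass(abundances)
class(abundances)
head(abundances)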
[Figure 1-1]
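The cross-tabulation step above can be tried on a toy example. This is only a minimal sketch: the data frame `cells`, its label names, and its sample names are made up for illustration and do not come from the `merged` object.

```r
# Toy per-cell annotations; the labels and sample names are hypothetical.
cells <- data.frame(
  celltype = c("Mesenchyme", "Mesenchyme", "ExE ectoderm",
               "Mesenchyme", "ExE ectoderm", "ExE ectoderm"),
  sample   = c("s1", "s1", "s1", "s2", "s2", "s2")
)

# Cross-tabulate: rows are labels, columns are samples.
toy.abundances <- table(cells$celltype, cells$sample)

# unclass() drops the "table" class and leaves a plain integer matrix,
# which is the form that DGEList() expects for its counts.
toy.abundances <- unclass(toy.abundances)
class(toy.abundances)
toy.abundances
```

Each entry is simply the number of cells carrying that label in that sample, which plays the role that a read count per gene plays in a DEG analysis.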
2. Testing with edgeR
- The rest of the analysis again follows the edgeR workflow.
- The only difference is that the counts are not of reads per gene, but of cells per label.
2.1 Create the DGEList
extra.info <- colData(merged)[match(colnames(abundances), merged$sample),]
library(edgeR)
y.ab <- DGEList(counts=abundances, samples=extra.info)
y.ab
[Figure 2-1]
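The `match()` call in 2.1 picks one row of per-cell metadata for each sample, in the same order as the columns of the count matrix. A minimal sketch of that idea follows, using a made-up `cell.meta` data frame in place of `colData(merged)`; all values are hypothetical.

```r
# Hypothetical per-cell metadata: every cell repeats its sample's annotations.
cell.meta <- data.frame(
  sample = c(5, 5, 6, 6, 6, 7),
  tomato = c(TRUE, TRUE, FALSE, FALSE, FALSE, TRUE),
  pool   = c(3, 3, 3, 3, 3, 4)
)

# Column names of a count matrix are character strings.
sample.ids <- c("5", "6", "7")

# For each sample id, match() returns the index of the first cell from that
# sample (numbers are compared as strings here).
first.cell <- match(sample.ids, cell.meta$sample)

# One metadata row per sample, ordered like the matrix columns.
sample.info <- cell.meta[first.cell, ]
sample.info
```

Because every cell from the same sample shares the same sample-level annotations, taking the first matching cell is enough to recover them.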
2.2 filter out low-abundance labels
keep <- filterByExpr(y.ab, group=y.ab$samples$tomato)
summary(keep)
# removes 10 cell types (10 rows)
y.ab <- y.ab[keep,]
2.3 Set up the design matrix
design <- model.matrix(~factor(pool) + factor(tomato), y.ab$samples)
design
[Figure 2-2]
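To see what the design formula in 2.3 produces, here is a small sketch on a toy sample table. The `pool` and `tomato` values are assumptions chosen for illustration (three pools, each with one positive and one negative sample), not values read from `y.ab$samples`.

```r
# Hypothetical sample annotations: three pools, two samples per pool.
toy.samples <- data.frame(
  pool   = c(3, 3, 4, 4, 5, 5),
  tomato = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)
)

# Additive design: pool as a blocking factor, tomato as the condition of interest.
toy.design <- model.matrix(~factor(pool) + factor(tomato), toy.samples)
colnames(toy.design)
toy.design
```

With this layout the last column is the tomato effect after accounting for pool, which is why a test on `coef=ncol(design)` compares positive against negative samples.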
2.4 DA analysis
# estimateDisp() estimates the NB dispersion for each cluster
y.ab <- estimateDisp(y.ab, design, trend="none")
#turn off the trend as we do not have enough points for its stable estimation.
summary(y.ab$common.dispersion)
plotBCV(y.ab, cex=1)
#QL dispersion
fit.ab <- glmQLFit(y.ab, design, robust=TRUE, abundance.trend=FALSE)
summary(fit.ab$var.prior)
summary(fit.ab$df.prior)
plotQLDisp(fit.ab, cex=1)
#test for differences in abundance between td-Tomato-positive and negative samples using glmQLFTest().
res <- glmQLFTest(fit.ab, coef=ncol(design))
summary(decideTests(res))
topTags(res)
The results are shown in the figure below:
(1) Extra-embryonic ectoderm is strongly depleted in the injected cells.
(2) This is consistent with the expectation that cells injected into the blastocyst should not contribute to extra-embryonic tissue.
(3) The injected cells also contribute more to the mesenchyme, which may also be of interest.
[Figure 2-4]
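The steps in section 2 can be strung together on simulated data to check that the workflow runs end to end without `merged.Rdata`. This is a sketch under stated assumptions: the counts are drawn from a negative binomial distribution, the label and sample names are invented, and one label is artificially depleted in the positive samples. It assumes edgeR is installed, and it says nothing about the real chimera results.

```r
library(edgeR)

set.seed(100)
n.labels <- 20

# Hypothetical sample annotations: three pools, one positive and one negative each.
toy.samples <- data.frame(
  pool   = rep(c(3, 4, 5), each = 2),
  tomato = rep(c(TRUE, FALSE), times = 3)
)

# Simulated cells-per-label counts; the matrix fills column-wise, so row i uses mu[i].
mu <- rep(c(50, 200, 500, 1000), length.out = n.labels)
counts <- matrix(
  rnbinom(n.labels * 6, mu = mu, size = 20), nrow = n.labels,
  dimnames = list(paste0("label", seq_len(n.labels)), paste0("s", 1:6))
)

# Artificially deplete the first label in the tomato-positive samples.
counts[1, toy.samples$tomato] <- round(counts[1, toy.samples$tomato] / 10)

# Same sequence of calls as in the notes above.
y <- DGEList(counts = counts, samples = toy.samples)
keep <- filterByExpr(y, group = y$samples$tomato)
y <- y[keep, ]
design <- model.matrix(~factor(pool) + factor(tomato), y$samples)
y <- estimateDisp(y, design, trend = "none")
fit <- glmQLFit(y, design, robust = TRUE, abundance.trend = FALSE)
res <- glmQLFTest(fit, coef = ncol(design))
topTags(res, n = 5)
```

In the `topTags()` table, a negative `logFC` means the label is less abundant in tomato-positive samples and a positive one means it is more abundant.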
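The above is a brief workflow note on the second part of Chapter 15 (differential-expression-between-conditions), which mainly covers single-cell DA analysis; for details, see Chapter 14 Multi-sample comparisons.
This series of notes is a rough walk-through of the full OSCA workflow; for the underlying principles, please refer to the original text. If you spot any mistakes, corrections are welcome!
In addition, 刘小泽 has compiled translation notes for the full text; see the table of contents.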