10. DA (Differential abundance)

Author: 小贝学生信 | Published: 2020-10-06 11:34

Original link
 9、DEG(Differential expressed genes) - 简书

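The earlier DEG analysis built on the annotated cell types and compared conditions within the same cell type to find differentially expressed genes. DA analysis likewise builds on the annotated cell types, but tests whether the number of cells of a given cell type differs significantly between conditions.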

Test for significant changes in per-label cell abundance across conditions;
reveal which cell types are depleted or enriched upon treatment.

1. Preparing the "count matrix"

quantify the number of cells assigned to each label (or cluster).
identify labels that change in abundance among the compartment of injected cells compared to the background.

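# Assumption: merged is a SingleCellExperiment, as in OSCA, so this package must be loaded
library(SingleCellExperiment)
load("merged.Rdata")
merged
# Cross-tabulate cell type labels against samples
abundances <- table(merged$celltype.mapped, merged$sample)
# Number of cells for each of the 34 cell types in each of the six samples
# Drop the "table" class to get a plain matrix
abundances <- unclass(abundances)
class(abundances)
head(abundances)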

1-1
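
As a small illustration of the table() and unclass() step above, here is a minimal base-R sketch on made-up annotations. The data frame cells and its values are hypothetical stand-ins for the per-cell columns of merged, not data from the original analysis.

```r
# Toy per-cell annotations (hypothetical; stands in for the columns of merged)
cells <- data.frame(
  celltype = c("A", "A", "B", "B", "B", "C"),
  sample   = c("s1", "s2", "s1", "s1", "s2", "s2")
)

# Cross-tabulate: rows are labels, columns are samples
toy_abundances <- table(cells$celltype, cells$sample)

# Drop the "table" class so downstream code sees a plain integer matrix
toy_abundances <- unclass(toy_abundances)
print(toy_abundances)
```

Labels that never occur in a sample get a count of 0 rather than being dropped, which is what a count matrix for edgeR needs.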

2. edgeR workflow

The rest of the analysis again follows the edgeR workflow.
The only difference is that the counts are not of reads per gene, but of cells per label.

2.1 Create the DGEList

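# match() returns the first cell of each sample, giving one metadata row per column of abundances
extra.info <- colData(merged)[match(colnames(abundances), merged$sample),]
library(edgeR)
# Store the cells-per-label counts together with the sample-level metadata
y.ab <- DGEList(counts=abundances, samples=extra.info)
y.ab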

2-1
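
The match() call in 2.1 returns, for each column name of abundances, the index of the first cell from that sample, so extra.info ends up with one metadata row per sample. A minimal base-R sketch with made-up vectors (cell_sample and cell_pool are hypothetical names, not columns of merged):

```r
# Hypothetical per-cell metadata: which sample and pool each cell came from
cell_sample <- c("s2", "s1", "s2", "s3", "s1")
cell_pool   <- c(4, 3, 4, 5, 3)

# Column names of the count matrix, in the order the metadata should follow
sample_order <- c("s1", "s2", "s3")

# Index of the first cell belonging to each sample
idx <- match(sample_order, cell_sample)

# One metadata value per sample, aligned with sample_order
sample_pool <- cell_pool[idx]
print(data.frame(sample = sample_order, pool = sample_pool))
```

This only works because sample-level variables such as pool are constant across all cells of a sample, so any one cell is representative.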

2.2 filter out low-abundance labels

keep <- filterByExpr(y.ab, group=y.ab$samples$tomato)
summary(keep)
# 10 cell types (10 rows) are removed
y.ab <- y.ab[keep,]
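
To get a feel for what filterByExpr() keeps, the following sketch runs it on three made-up labels. The counts and the grouping are illustrative assumptions; the exact thresholds are edgeR defaults and are not tuned here.

```r
library(edgeR)

# Hypothetical cells-per-label counts for six samples
toy_counts <- rbind(
  rare     = c(0, 1, 0, 2, 1, 0),
  common   = c(50, 60, 55, 70, 65, 40),
  abundant = c(500, 550, 520, 610, 580, 490)
)
colnames(toy_counts) <- paste0("s", 1:6)

toy <- DGEList(counts = toy_counts)

# Hypothetical grouping: alternating tomato-positive and tomato-negative samples
toy_group <- rep(c(TRUE, FALSE), 3)

# Logical vector with one entry per label: TRUE means the label is retained
toy_keep <- filterByExpr(toy, group = toy_group)
print(toy_keep)
```

Labels with only a handful of cells across all samples carry little information for the test, so removing them also reduces the multiple-testing burden.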

2.3 Define the design matrix

design <- model.matrix(~factor(pool) + factor(tomato), y.ab$samples)
design

2-2
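
To see why the test in 2.4 uses coef=ncol(design), it can help to build the same formula on a small sample sheet. The sketch below assumes three pools with one tomato-positive and one tomato-negative sample each; toy_samples and its values are illustrative assumptions, not taken from merged.

```r
# Hypothetical sample sheet: 3 pools x 2 tomato states = 6 samples
toy_samples <- data.frame(
  pool   = c(3, 3, 4, 4, 5, 5),
  tomato = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)
)

# Additive design: a blocking term for pool plus the tomato effect of interest
toy_design <- model.matrix(~factor(pool) + factor(tomato), toy_samples)
print(colnames(toy_design))

# The tomato term is the last column, so ncol(toy_design) indexes it
tomato_coef <- ncol(toy_design)
```

Putting pool in the formula means the tomato effect is estimated within each pool, which absorbs pool-to-pool differences in abundance.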

2.4 DA analysis

# Use estimateDisp() to estimate the NB dispersion for each cluster
y.ab <- estimateDisp(y.ab, design, trend="none")
# The trend is turned off because there are not enough points to estimate it stably
summary(y.ab$common.dispersion)
plotBCV(y.ab, cex=1)
# Estimate the QL dispersion, again without an abundance trend
fit.ab <- glmQLFit(y.ab, design, robust=TRUE, abundance.trend=FALSE)
summary(fit.ab$var.prior)
summary(fit.ab$df.prior)
plotQLDisp(fit.ab, cex=1)
# Test for differences in abundance between td-Tomato-positive and negative samples using glmQLFTest()
# coef=ncol(design) selects the last column of the design matrix, the tomato term
res <- glmQLFTest(fit.ab, coef=ncol(design))
summary(decideTests(res))
topTags(res)

The results are shown in the figure below:
(1) Extra-embryonic ectoderm is strongly depleted in the injected cells.
(2) This is consistent with the expectation that cells injected into the blastocyst should not contribute to extra-embryonic tissue.
(3) The injected cells also contribute more to the mesenchyme, which may also be of interest.

2-4
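
For readers without merged.Rdata at hand, the steps of section 2 can be tried end to end on simulated counts. This is only a sketch under stated assumptions: the 30 labels, the six samples, and the negative binomial parameters are made up, so the output carries no biological meaning and should show no systematic differences.

```r
library(edgeR)
set.seed(1)

# Simulated cells-per-label counts: 30 hypothetical labels x 6 samples
toy_counts <- matrix(
  rnbinom(30 * 6, mu = 100, size = 10), nrow = 30,
  dimnames = list(paste0("label", 1:30), paste0("s", 1:6))
)

# Hypothetical sample metadata mirroring the pool + tomato layout
toy_info <- data.frame(
  pool   = rep(c(3, 4, 5), each = 2),
  tomato = rep(c(TRUE, FALSE), 3)
)

toy_y <- DGEList(counts = toy_counts, samples = toy_info)
toy_design <- model.matrix(~factor(pool) + factor(tomato), toy_y$samples)

# NB dispersion without a trend, then QL fit without an abundance trend
toy_y   <- estimateDisp(toy_y, toy_design, trend = "none")
toy_fit <- glmQLFit(toy_y, toy_design, robust = TRUE, abundance.trend = FALSE)

# Test the tomato term, which is the last coefficient
toy_res <- glmQLFTest(toy_fit, coef = ncol(toy_design))
print(topTags(toy_res))
```

In the table printed by topTags(), a negative logFC for a label would correspond to depletion in the tomato-positive samples and a positive logFC to enrichment.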

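The above is a brief workflow note on the second part of Chapter 15 (differential-expression-between-conditions), mainly covering single-cell DA analysis; for details see Chapter 14 Multi-sample comparisons.
This series of notes is a rough walkthrough of the full OSCA workflow; for the underlying principles, refer to the original text. If there are errors, corrections are welcome!
There are also full-translation notes compiled by 刘小泽; see the table of contents.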