6、Marker gene detection

作者: 小贝学生信 | 来源:发表于2020-09-26 13:23 被阅读0次

6、Marker gene detection
Seurat V3 学习(二)
Cell Marker
网络安全-Day50-Gene6 FTP安装
网络安全-Day51-Gene6提权
单细胞测序细胞类型鉴定
单细胞类型鉴定总结
标记基因，报告基因傻傻分不清?
单细胞marker基因平均表达量热图
Android 百度地图SDK 清除已经添加的Marker

1、目的

we identify the genes that drive separation between clusters. 鉴别直接导致分群结果的marker gene.
In the most obvious case, the marker genes for each cluster are a priori associated with particular cell types .即 marker gene 与后面的细胞类型注释直接相关
different approaches can be used to consolidate test results into a single ranking of genes for each cluster.
本文主要学习scran包的findmarkers函数其中的t-test方法。

load("fluidigm.clust.RData") 
#加载上一步分析结果
fluidigm.clust
library(scran)

2、`findMarkers`

（1）基础用法

markers.fluidigm <- findMarkers(fluidigm.clust,groups=fluidigm.clust$label)
markers.fluidigm
markers.fluidigm$`1`

如上，findMarkers的基本参数为sce对象，clust信息。
其中比较方法默认为t.test

2-1

关于`Top`列

首先要知道：t.test是基于所有的clust两两之间比较的，而不是一个clust与其它所有clust的均值作比较。
而Top列，以1为例，表示clust1分别与另外三个clust相比差异最显著top1的基因（ranked by p-value）

chosen <- "1"  #查看clust1与其它三个clust的比较
interesting <- markers.fluidigm[[chosen]]
colnames(interesting)
interesting[1:9,1:4]

2-2

热图可视化

best.set <- interesting[interesting$Top <= 3,]
#选取与其它三个clust差异排名前三，共9个基因的矩阵
getMarkerEffects <- function(x, prefix="logFC", strip=TRUE, remove.na.col=FALSE) {
  #prefix="logFC"
  #x=best.set
  regex <- paste0("^", prefix, "\\.")
  i <- grep(regex, colnames(x))
  out <- as.matrix(x[,i])
  
  if (strip) {
    colnames(out) <- sub(regex, "", colnames(out))
  }
  if (remove.na.col) {
    out <- out[,!colAnyNAs(out),drop=FALSE]
  }
  
  out
}#不知为啥下载scran包没有这个函数，找到了函数源代码，同样效果
logFCs <- getMarkerEffects(best.set)
#得到行名为基因，列名为其它三类clust与clust1de1 logFC
library(pheatmap)
pheatmap(logFCs, breaks=seq(-5, 5, length.out=101))

2-3

（2）细节筛选

基于logFC

#仅挑选上调基因
markers.fluidigm.up <- findMarkers(fluidigm.clust, direction="up",
                                   groups=fluidigm.clust$label)
interesting.up <- markers.fluidigm.up[[chosen]]
interesting.up[1:10,1:4]

#logfc>1
markers.fluidigm.up2 <- findMarkers(fluidigm.clust, direction="up", lfc=1,
                                    groups=fluidigm.clust$label)
interesting.up2 <- markers.fluidigm.up2[[chosen]]
interesting.up2[1:10,1:4]

These two settings yield a more focused set of candidate marker genes that are upregulated in cluster 1

best.set <- interesting.up2[interesting.up2$Top <= 5,]
logFCs <- getMarkerEffects(best.set)
library(pheatmap)
pheatmap(logFCs, breaks=seq(-5, 5, length.out=101))

2-4

基于比较方法

如前所述，基础用法是所有clust两两比较；
可以通过设置参数pval.type=修改p.value的比较关系

（1）pval.type="all"

A more stringent approach would only consider genes that are differentially expressed in all pairwise comparisons involving the cluster of interest.
即clust1的某gene必须与其它所有clust表达都有差异才认定为DE

markers.fluidigm.up3 <- findMarkers(fluidigm.clust, pval.type="all", 
                                direction="up",
                                groups=fluidigm.clust$label)
interesting.up3 <- markers.fluidigm.up3[[chosen]]
head(interesting.up3)

这种方法看起来很合理，但是也有缺点

此时的p.value为the maximum of the p-values from all pairwise comparisons.
A gene will only achieve a low combined p-value if it is strongly DE in all comparisons to other clusters.
However, any gene that is expressed at the same level in two or more clusters will simply not be detected.
This is likely to discard many interesting genes, especially if the clusters are finely resolved with weak separation.

（2）pval.type="any"

will be DE between at least one pair of subpopulations.
too generous.

（3）pval.type="some"

at least 50% of the individual pairwise comparisons exhibit no DE
take the middle-most value as the combined p-value.
reasonable

（3）其它参数

`block=`

handle blocking factor

m.out <- findMarkers(fluidigm.clust, 
                     block=fluidigm.clust$Coverage_Type, 
                     direction="up",
                     groups=fluidigm.clust$label) 
demo <- m.out[["1"]] 
demo[demo$Top <= 5,1:4]

`test=`

默认为t.tests；
可以设定、采用其它方法，例如wilcox,binomial等；具体可参考原文，不再记录。

save(markers.fluidigm, file = "markergene.Rdata")

以上是第十一章Clustering部分的简单流程笔记，主要学习了基于t.test的marker gene detection，其它如wilcox方法见原文Chapter 10 Marker gene detection
本系列笔记基于OSCA全流程的大致流程梳理，详细原理可参考原文。如有错误，恳请指正！
此外还有刘小泽老师整理的全文翻译笔记，详见目录。

网友评论

本文标题：6、Marker gene detection

本文链接：https://www.haomeiwen.com/subject/rrscuktx.html

延伸阅读

深度阅读

您也可以注册成为美文阅读网的作者，发表您的原创作品、分享您的心情！

6、Marker gene detection

1、目的

2、`findMarkers`

（1）基础用法

关于`Top`列

（2）细节筛选

基于logFC

基于比较方法

（3）其它参数

`block=`

`test=`

相关文章

6、Marker gene detection

Seurat V3 学习(二)

Cell Marker

网络安全-Day50-Gene6 FTP安装

网络安全-Day51-Gene6提权

单细胞测序细胞类型鉴定

单细胞类型鉴定总结

标记基因，报告基因傻傻分不清?

单细胞marker基因平均表达量热图

Android 百度地图SDK 清除已经添加的Marker

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读

6、Marker gene detection

1、目的

2、findMarkers

（1）基础用法

关于Top列

（2）细节筛选

基于logFC

基于比较方法

（3）其它参数

block=

test=

相关文章

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读

2、`findMarkers`

关于`Top`列

`block=`

`test=`