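MAGeCKFlute visualizes MAGeCK results and runs downstream analyses on them. Before running MAGeCKFlute, you therefore first use MAGeCK for sequencing quality control, read counting for sgRNAs and genes, normalization, and so on. The main result files are the count summary and the sgRNA and gene summary files: countsummary.txt, rra.gene_summary.txt, and rra.sgrna_summary.txt.
1. Installation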
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("MAGeCKFlute")
library(MAGeCKFlute)
2. Visualizing the QC results
Inspecting the QC results
file1 = file.path(system.file("extdata", package = "MAGeCKFlute"),
"testdata/countsummary.txt")
countsummary = read.delim(file1, check.names = FALSE)
# Gini index
BarView(countsummary, x = "Label", y = "GiniIndex",ylab = "Gini index", main = "Evenness of sgRNA reads")
# Missed sgRNAs
countsummary$Missed = log10(countsummary$Zerocounts)
BarView(countsummary, x = "Label", y = "Missed", fill = "#394E80",ylab = "Log10 missed gRNAs", main = "Missed sgRNAs")
#read mapping ratio
MapRatesView(countsummary)
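The QC quantities plotted above can also be computed by hand, which makes it easier to see what each plot shows. The following is a minimal sketch on an invented toy table, not real MAGeCK output. The `Label`, `Zerocounts` and `GiniIndex` columns are the ones used above; the `Reads` and `Mapped` column names are an assumption about the layout of countsummary.txt, and the pseudocount of 1 is an illustrative choice to avoid `log10(0)`.

```r
# Toy stand-in for countsummary.txt; all values are invented for illustration.
# "Reads" and "Mapped" are assumed column names.
countsummary <- data.frame(
  Label      = c("day0", "day23"),
  Reads      = c(1000000, 1200000),
  Mapped     = c(800000, 900000),
  Zerocounts = c(20, 0),
  GiniIndex  = c(0.08, 0.12)
)

# Fraction of reads that mapped to the sgRNA library.
countsummary$MapRatio <- countsummary$Mapped / countsummary$Reads

# log10(0) is -Inf, so add a pseudocount of 1 before taking the log.
countsummary$Missed <- log10(countsummary$Zerocounts + 1)

print(countsummary[, c("Label", "MapRatio", "Missed", "GiniIndex")])
```

With the pseudocount, a sample with no missed sgRNAs gets a bar of height 0 instead of an undefined value.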
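3. Downstream analysis of MAGeCK RRA results
When there are only two experimental conditions, MAGeCK RRA can be applied to CRISPR/Cas9 screens to identify essential genes. Its output, the sgRNA summary and gene summary files, reports the significance of positive and negative selection at the sgRNA level and the gene level, respectively.
3.1 Visualizing positive and negative selection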
#read gene summary data(required)
file2 = file.path(system.file("extdata", package = "MAGeCKFlute"), "testdata/rra.gene_summary.txt")
gdata = ReadRRA(file2)
#read sgRNA summary data(optional)
file3 = file.path(system.file("extdata", package = "MAGeCKFlute"), "testdata/rra.sgrna_summary.txt")
sdata = ReadsgRRA(file3)
# Volcano plot
gdata$LogFDR = -log10(gdata$FDR)
p1 = ScatterView(gdata, x = "Score", y = "LogFDR", label = "id", model = "volcano", top = 5)
print(p1)
# Or
p2 = VolcanoView(gdata, x = "Score", y = "FDR", Label = "id")
print(p2)
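ScatterView draws scatter plots. Its model parameter accepts four options: "none", "ninesquare", "volcano", and "rank". "ninesquare" produces a nine-square plot and is intended for results from the MLE model; "rank" plots genes ordered by their scores.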
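The volcano plot in this section uses -log10(FDR) as its y-axis, which becomes infinite for any gene whose reported FDR is exactly 0. The sketch below shows one way to keep the values finite and to tag genes as up- or down-selected before plotting. It runs on an invented toy table with the same `id`, `Score` and `FDR` columns as `gdata`; the cutoffs `score_cut` and `fdr_cut` are hypothetical and should be chosen for the screen at hand.

```r
# Toy gene-level table; gene names and values are invented for illustration.
gdata <- data.frame(
  id    = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  Score = c(-2.1, 0.1, 1.8, -0.3),
  FDR   = c(0, 0.60, 0.01, 0.20),
  stringsAsFactors = FALSE
)

# Replace FDR == 0 with the smallest non-zero FDR so -log10() stays finite.
min_nonzero <- min(gdata$FDR[gdata$FDR > 0])
gdata$LogFDR <- -log10(pmax(gdata$FDR, min_nonzero))

# Hypothetical cutoffs, used only to illustrate the grouping.
score_cut <- 1
fdr_cut   <- 0.05

gdata$group <- "none"
gdata$group[gdata$Score >  score_cut & gdata$FDR < fdr_cut] <- "up"
gdata$group[gdata$Score < -score_cut & gdata$FDR < fdr_cut] <- "down"

print(gdata)
```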
3.2 Gene rank plot
Plot genes by their scores and label a subset of them.
#Rank plot
gdata$Rank = rank(gdata$Score)
p3 = ScatterView(gdata, x = "Rank", y = "Score", label = "id",
top = 5, auto_cut_y = TRUE, ylab = "Log2FC",
groups = c("top", "bottom"))
print(p3)
Use the toplabels parameter (in ScatterView) or the genelist parameter (in RankView) to choose which gene labels to display.
ScatterView(gdata, x = "Score", y = "Rank", label = "id",
auto_cut_x = TRUE, groups = c("left", "right"),
xlab = "Log2FC", top = 3)
#or
geneList= gdata$Score
names(geneList) = gdata$id
p4 = RankView(geneList, top = 5, bottom = 10) + xlab("Log2FC")
print(p4)
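The calls above label genes through `top` and `bottom`. To label specific genes instead, a character vector of gene names is needed. The following base-R sketch builds such a vector from a named score vector; the gene names, scores and the value of `n_label` are invented for illustration.

```r
# Toy gene-level scores; names and values are invented for illustration.
gdata <- data.frame(
  id    = c("GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E", "GENE_F"),
  Score = c(-2.1, 0.1, 1.8, -0.3, 0.9, -1.2),
  stringsAsFactors = FALSE
)

# Named numeric vector: names are gene ids, values are scores.
geneList <- gdata$Score
names(geneList) <- gdata$id

# Sort from the highest to the lowest score.
ranked <- sort(geneList, decreasing = TRUE)

# Pick the n_label highest- and lowest-scoring genes (n_label is hypothetical).
n_label <- 2
genes_to_label <- c(names(head(ranked, n_label)), names(tail(ranked, n_label)))

print(genes_to_label)
```

A vector like `genes_to_label` is the kind of value that would be passed as `toplabels = genes_to_label` to ScatterView or `genelist = genes_to_label` to RankView.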
3.3 sgRankView
Besides showing candidate genes, you can also view the sgRNA ranks of the top genes.
p5 = sgRankView(sdata, top = 4, bottom = 4)
print(p5)
3.4 Enrichment analysis
geneList= gdata$Score
names(geneList) = gdata$id
enrich = EnrichAnalyzer(geneList = geneList[geneList>0.5],
method = "HGT", type = "KEGG")
EnrichedView(enrich, mode = 1, top = 5)
#or
EnrichedView(enrich, mode = 2, top = 5)
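The `method = "HGT"` option selects a hypergeometric test. The idea behind such a test can be sketched with base R alone: given a gene universe, a pathway, and a list of selected genes, it asks how surprising the overlap between the list and the pathway is. The counts below are invented, and this is an illustration of the general test under those assumptions, not a reproduction of what EnrichAnalyzer computes internally.

```r
# Invented counts for illustration.
N <- 1000  # genes in the universe
K <- 50    # genes annotated to the pathway
n <- 100   # genes in the selected list (e.g. Score > 0.5)
k <- 12    # selected genes that fall in the pathway

# Expected overlap if the selected genes were drawn at random.
expected <- n * K / N

# P(overlap >= k) under the hypergeometric distribution.
# phyper() gives P(X <= q), so use q = k - 1 with the upper tail.
p_value <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)

cat("expected overlap:", expected, "\n")
cat("p-value:", p_value, "\n")
```

A small p-value means the pathway holds more of the selected genes than random sampling would explain.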