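The previous post used sciClone to reproduce the relevant parts of the article. This post introduces ClonEvol: how to use it, the issues that come up along the way, and how to reproduce the article's figures with it.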
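Background: Reconstructing clonal evolution is critical for understanding tumor progression and implementing personalized therapies. This is usually done by clustering somatic variants based on their cellular prevalence, estimated from bulk tumor sequencing of multiple samples. The clusters, which consist of clonal marker variants, are then ordered by their estimated cellular prevalence to reconstruct clonal evolution trees, a process called clonal ordering. However, cellular prevalence estimates are confounded by statistical variability and by errors in sequencing and data analysis, which hinders accurate reconstruction of clonal evolution. The problem is further complicated by intra- and inter-tumor heterogeneity. In addition, the field lacks a comprehensive visualization tool to help interpret complex clonal relationships. ClonEvol, a unified software tool for clonal ordering, visualization, and interpretation, was developed to address these challenges.
Materials and methods: ClonEvol uses a bootstrap resampling technique to estimate the cellular fraction of the clones, and models the clonal ordering constraints probabilistically to account for statistical variability. Bootstrapping allows the founding clones and subclones of each sample to be identified, which in turn makes it possible to interpret clonal seeding. ClonEvol automatically generates several widely used visualizations for reconstructing and interpreting clonal evolution.
Results: For clonal evolution inference, ClonEvol outperforms three state-of-the-art tools (LICHeE, Canopy, and PhyloWGS), showing more robust error tolerance and producing more accurate trees in a simulation. Building on several recent publications that used ClonEvol to study metastasis and drug resistance in solid cancers, the authors show that ClonEvol rediscovered the relapse subclones in two published acute myeloid leukemia patients. They also demonstrate that, through noninvasive monitoring, ClonEvol recapitulated the subclones that emerged during metastatic progression in the tumors of a published breast cancer patient.
Conclusions: ClonEvol is broadly applicable to longitudinal monitoring of clonal populations, either in tumor biopsies or noninvasively, to guide precision medicine.
Availability: ClonEvol is written in R and is available at https://github.com/ChrisMaherLab/ClonEvol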
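ClonEvol picks up where clustering tools such as sciClone and PyClone leave off and handles the downstream visualization, which makes the subclonal structure of a tumor much easier to see. The code below reproduces the clonal evolution plot in Figure 1 of the article.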
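There are three main steps:
Step 1: Prepare the variant data for clonal evolution inference
Sequencing depth, the number and quality of samples, and the number and quality of somatic variants can all have a profound effect on the clonal evolution model. Ideally, we would like:
A large number of samples;
A large number of variants (exome sequencing works, but whole-genome sequencing covers genome-wide somatic mutations better);
Multiple time points;
Multi-region samples (because of intra-tumor heterogeneity);
Deep sequencing.
Step 2: Cluster the variants
Step 3: Evaluate the variant clustering results
Details and code follow: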
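Step 2 is normally done with a dedicated tool such as sciClone or PyClone. To show what the clustering step produces, here is a minimal sketch that uses base R's kmeans as a stand-in; it is not what those tools do internally. The toy VAF values and the data frame name toy are made up for illustration. The renumbering at the end reflects the convention used later in this post, where the founding cluster is cluster 1.

```r
# Toy VAFs (percent) for two samples, P and R. All values are invented.
set.seed(1)
toy <- data.frame(
  P = c(rnorm(20, mean = 45, sd = 2),
        rnorm(20, mean = 25, sd = 2),
        rnorm(20, mean = 2,  sd = 0.5)),
  R = c(rnorm(20, mean = 45, sd = 2),
        rnorm(20, mean = 2,  sd = 0.5),
        rnorm(20, mean = 30, sd = 2))
)

# k-means is only a stand-in here for sciClone/PyClone.
fit <- kmeans(toy[, c('P', 'R')], centers = 3, nstart = 25)

# k-means labels are arbitrary, so renumber the clusters by decreasing
# total VAF: the cluster with the highest VAFs becomes cluster 1.
rank_by_vaf <- order(-rowSums(fit$centers))
toy$cluster <- match(fit$cluster, rank_by_vaf)

# Step 3 in miniature: inspect cluster sizes and per-cluster mean VAFs.
print(table(toy$cluster))
print(aggregate(cbind(P, R) ~ cluster, data = toy, FUN = mean))
```

A clustering with very small clusters, or with clusters whose VAFs overlap heavily, is usually a sign that the result should be revisited before moving on to clonal ordering.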
Installing the packages
# ClonEvol requires R 2.15 or later. Install from GitHub
install.packages('devtools')
library(devtools)
install_github('hdng/clonevol')
install.packages('gridBase')
install.packages('gridExtra')
install.packages('ggplot2')
install.packages('igraph')
install.packages('packcircles')
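Before moving on, it can help to confirm that everything actually installed. The following is a small sketch in base R; missing_packages is a hypothetical helper name introduced here, not part of ClonEvol.

```r
# Packages the installation step above is expected to provide.
required <- c('clonevol', 'gridBase', 'gridExtra',
              'ggplot2', 'igraph', 'packcircles')

# Hypothetical helper: return the names of packages that cannot be loaded.
missing_packages <- function(pkgs) {
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!ok]
}

missing <- missing_packages(required)
if (length(missing) > 0) {
  message('Still missing: ', paste(missing, collapse = ', '))
} else {
  message('All required packages are installed.')
}
```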
Data preparation
######### Worked example
library(clonevol)
data(aml1)
x <- aml1$variant
head(x)
cluster gene is.driver P.vaf R.vaf P.ccf R.ccf P.ref.count
226 1 SMC3 TRUE 50.28 49.13 100.56 98.26 3027
349 1 PTPRT TRUE 45.24 45.31 90.48 90.62 2159
1 1 - FALSE 43.83 41.37 87.66 82.74 2931
2 1 - FALSE 41.23 46.47 82.46 92.94 2203
3 1 - FALSE 48.91 43.11 97.82 86.22 2534
4 1 - FALSE 47.54 44.45 95.08 88.90 2539
6 1 - FALSE 48.12 45.47 96.24 90.94 2210
7 1 - FALSE 47.03 41.13 94.06 82.26 2387
8 1 - FALSE 51.04 45.38 102.08 90.76 2225
## Data preparation
# derive short sample names (P, R) from the VAF column names (P.vaf, R.vaf)
vaf.col.names <- grep('\\.vaf$', colnames(x), value = TRUE)
sample.names <- sub('\\.vaf$', '', vaf.col.names)
x[, sample.names] <- x[, vaf.col.names]
vaf.col.names <- sample.names
# prepare sample grouping
sample.groups <- c('P', 'R')
names(sample.groups) <- vaf.col.names
# set up the order of clusters to display in various plots (later)
x <- x[order(x$cluster),]
head(x)
cluster gene is.driver P.vaf R.vaf P.ccf R.ccf P.ref.count P.var.count P.depth R.ref.count R.var.count R.depth P R
226 1 SMC3 TRUE 50.28 49.13 100.56 98.26 3027 2611 5639 2035 1990 4024 50.28 49.13
349 1 PTPRT TRUE 45.24 45.31 90.48 90.62 2159 1799 3958 1715 1579 3294 45.24 45.31
1 1 - FALSE 43.83 41.37 87.66 82.74 2931 2364 5296 2224 1501 3725 43.83 41.37
2 1 - FALSE 41.23 46.47 82.46 92.94 2203 1658 3862 1733 1587 3320 41.23 46.47
3 1 - FALSE 48.91 43.11 97.82 86.22 2534 2358 4893 2605 1873 4479 48.91 43.11
4 1 - FALSE 47.54 44.45 95.08 88.90 2539 2271 4810 2325 2041 4366 47.54 44.45
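In the table above, each CCF value is exactly twice the corresponding VAF (for example, 50.28 and 100.56 for SMC3). That matches the common assumption that the variants are heterozygous and sit in copy-neutral diploid regions, although the post does not say how the CCF columns were computed. A short check using two rows copied from the output:

```r
# VAFs (percent) in the primary sample, copied from the head(x) output.
vaf <- c(SMC3 = 50.28, PTPRT = 45.24)

# Under the heterozygous, copy-neutral assumption, CCF = 2 * VAF.
ccf <- 2 * vaf
print(ccf)
```

Values slightly above 100, such as the one for SMC3, come from sampling noise in the VAF estimate.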
######## Set the clone colors
clone.colors <- c('#999793', '#8d4891', '#f8e356', '#fe9536', '#d7352e')
Visualizing the variant clusters
## Plot the per-sample VAFs of each cluster, highlighting driver genes
plot.variant.clusters(x,
cluster.col.name = 'cluster',
show.cluster.size = FALSE,
cluster.size.text.color = 'blue',
vaf.col.names = vaf.col.names,
vaf.limits = 70,
sample.title.size = 20,
violin = FALSE,
box = FALSE,
jitter = TRUE,
jitter.shape = 1,
jitter.color = clone.colors,
jitter.size = 3,
jitter.alpha = 1,
jitter.center.method = 'median',
jitter.center.size = 1,
jitter.center.color = 'darkgray',
jitter.center.display.value = 'none',
highlight = 'is.driver',
highlight.shape = 21,
highlight.color = 'blue',
highlight.fill.color = 'green',
highlight.note.col.name = 'gene',
highlight.note.size = 2,
order.by.total.vaf = FALSE)
Plotting pairwise VAFs or CCFs (cancer cell fractions) across samples
plot.pairwise(x, col.names = vaf.col.names,
out.prefix = 'variants.pairwise.plot',
colors = clone.colors)
Plotting the mean/median of each cluster across samples (cluster flow)
plot.cluster.flow(x, vaf.col.names = vaf.col.names,
sample.names = c('Primary', 'Relapse'),
colors = clone.colors)
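The quantity behind a cluster flow plot is one summary value per cluster per sample. As a rough illustration in base R (not ClonEvol's own code), the medians can be computed with aggregate. The cluster 1 rows are taken from the table shown earlier; the cluster 2 rows and the name toy are invented.

```r
# Cluster 1 rows come from the aml1 output above; cluster 2 rows are made up.
toy <- data.frame(
  cluster = c(1, 1, 1, 2, 2, 2),
  P = c(50.28, 45.24, 43.83, 20.0, 22.0, 30.0),
  R = c(49.13, 45.31, 41.37,  0.0,  1.0,  0.5)
)

# One median VAF per cluster and sample; these are the points a
# cluster flow plot connects from Primary to Relapse.
centers <- aggregate(cbind(P, R) ~ cluster, data = toy, FUN = median)
print(centers)
```

In this toy data, cluster 2 drops to nearly zero at relapse, which is the kind of pattern the flow plot makes visible at a glance.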
Clonal ordering with ClonEvol
## Infer the clonal evolution trees
y = infer.clonal.models(variants = x,
cluster.col.name = 'cluster',
vaf.col.names = vaf.col.names,
sample.groups = sample.groups,
cancer.initiation.model='monoclonal',
subclonal.test = 'bootstrap',
subclonal.test.model = 'non-parametric',
num.boots = 1000,
founding.cluster = 1,
cluster.center = 'mean',
ignore.clusters = NULL,
clone.colors = clone.colors,
min.cluster.vaf = 0.01,
# min probability that CCF(clone) is non-negative
sum.p = 0.05,
# alpha level in confidence interval estimate for CCF(clone)
alpha = 0.05)
## Map driver gene events onto the trees
y <- transfer.events.to.consensus.trees(y,
x[x$is.driver,],
cluster.col.name = 'cluster',
event.col.name = 'gene')
## Convert the node-based trees to branch-based trees
y <- convert.consensus.tree.clone.to.branch(y, branch.scale = 'sqrt')
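The call to infer.clonal.models above selects a bootstrap test with num.boots = 1000. The idea can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not ClonEvol's actual implementation: it checks whether a parent cluster's mean VAF is at least the sum of its two children's mean VAFs in one sample. All VAF values and the helper boot_mean are hypothetical.

```r
set.seed(42)

# Toy VAFs (percent) in one sample. All values are invented.
parent  <- c(48, 45, 47, 44, 46, 49, 43, 47)
child_a <- c(20, 22, 19, 21, 23)
child_b <- c(15, 14, 17, 16, 13)

# Hypothetical helper: bootstrap distribution of a cluster's mean VAF.
boot_mean <- function(v, n) {
  replicate(n, mean(sample(v, replace = TRUE)))
}

n_boot <- 1000
# Resampled estimate of parent VAF minus the sum of the children's VAFs.
diff <- boot_mean(parent, n_boot) -
  boot_mean(child_a, n_boot) -
  boot_mean(child_b, n_boot)

# Fraction of resamples in which the parent can hold both children.
p_ok <- mean(diff >= 0)
# 95% percentile interval for the leftover VAF of the parent clone.
ci <- quantile(diff, c(0.025, 0.975))

print(p_ok)
print(ci)
```

A candidate tree in which this probability falls below a chosen threshold would be rejected; the sum.p and alpha arguments in the call above play roughly that role for the rejection threshold and the confidence interval level.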
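Combining multiple panels with the evolution trees
plot.clonal.models visualizes the clonal evolution models and draws the variant clusters, bell plots, and clonal evolution trees in a single call.
## Plot multiple panels and the trees together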
plot.clonal.models(y,
# box plot parameters
box.plot = TRUE,
fancy.boxplot = TRUE,
fancy.variant.boxplot.highlight = 'is.driver',
fancy.variant.boxplot.highlight.shape = 21,
fancy.variant.boxplot.highlight.fill.color = 'red',
fancy.variant.boxplot.highlight.color = 'black',
fancy.variant.boxplot.highlight.note.col.name = 'gene',
fancy.variant.boxplot.highlight.note.color = 'blue',
fancy.variant.boxplot.highlight.note.size = 2,
fancy.variant.boxplot.jitter.alpha = 1,
fancy.variant.boxplot.jitter.center.color = 'grey50',
fancy.variant.boxplot.base_size = 12,
fancy.variant.boxplot.plot.margin = 1,
fancy.variant.boxplot.vaf.suffix = '.VAF',
# bell plot parameters
clone.shape = 'bell',
bell.event = TRUE,
bell.event.label.color = 'blue',
bell.event.label.angle = 60,
clone.time.step.scale = 1,
bell.curve.step = 2,
# node-based consensus tree parameters
merged.tree.plot = TRUE,
tree.node.label.split.character = NULL,
tree.node.shape = 'circle',
tree.node.size = 30,
tree.node.text.size = 0.5,
merged.tree.node.size.scale = 1.25,
merged.tree.node.text.size.scale = 2.5,
merged.tree.cell.frac.ci = FALSE,
# branch-based consensus tree parameters
merged.tree.clone.as.branch = TRUE,
mtcab.event.sep.char = ',',
mtcab.branch.text.size = 1,
mtcab.branch.width = 0.75,
mtcab.node.size = 3,
mtcab.node.label.size = 1,
mtcab.node.text.size = 1.5,
# cellular population parameters
cell.plot = TRUE,
num.cells = 100,
cell.border.size = 0.25,
cell.border.color = 'black',
clone.grouping = 'horizontal',
#meta-parameters
scale.monoclonal.cell.frac = TRUE,
show.score = FALSE,
cell.frac.ci = TRUE,
disable.cell.frac = FALSE,
# output figure parameters
out.dir = 'output',
out.format = 'pdf',
overwrite.output = TRUE,
width = 8,
height = 4,
# vector of width scales for each panel from left to right
panel.widths = c(3,4,2,4,2))
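That completes the combined figure. The call takes a lot of parameters; unless you understand the underlying algorithm in depth, sticking with the defaults is recommended.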
Reference:
Dang, H. X., White, B. S., Foltz, S. M., Miller, C. A., Luo, J., Fields, R. C., & Maher, C. A. (2017). ClonEvol: clonal ordering and visualization in cancer sequencing. Annals of Oncology, 28(12), 3076-3082.