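Using PyClone and CITUP we obtained three files: cellfreq.txt, tree.txt and sample_id. Below we use TimeScape to visualize them. No specific genes or mutation sites appear in the plot, but all of them can be traced back.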
Installation
Installation is simple: BiocManager installs the package without trouble.
# try http:// if https:// URLs are not supported
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("timescape")
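Preparing the input data
TimeScape takes as input the clonal phylogeny and the clonal prevalence of each clone in each sample. Collecting the main commands from the previous steps gives the three input files cellfreq.txt, tree.txt and sample_id, as follows:
#### Run PyClone to get tables/loci.tsv, then derive three files: freq.txt, cluster.txt and sample_id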
PyClone run_analysis_pipeline --in_files SRR385938.tsv SRR385939.tsv SRR385940.tsv SRR385941.tsv --working_dir pyclone_analysis
cat ./pyclone_analysis/tables/loci.tsv | cut -f 6 | sed '1d' | paste - - - - > ./pyclone_analysis/freq.txt
cat ./pyclone_analysis/tables/loci.tsv | cut -f 3 | sed '1d' | paste - - - - |cut -f 1 > ./pyclone_analysis/cluster.txt
cat ./pyclone_analysis/tables/loci.tsv | cut -f 2 | sed '1d' | head -4 > ./pyclone_analysis/sample_id
#### Run CITUP to get results.h5
run_citup_qip.py ./pyclone_analysis/freq.txt ./pyclone_analysis/cluster.txt ./pyclone_analysis/results.h5
#### Read results.h5 with ReadH5.py to get two files: cellfreq.txt and tree.txt
python ReadH5.py ./pyclone_analysis/results.h5 | sed 's/^ \[//;s/\[//g;s/\]//g' | tr ' ' '\t'| grep '\.' > ./pyclone_analysis/cellfreq.txt
python ReadH5.py ./pyclone_analysis/results.h5 | sed 's/^ \[//;s/\[//g;s/\]//g' | tr ' ' '\t'| grep -v '\.' > ./pyclone_analysis/tree.txt
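The `paste - - - -` steps above only work if there are exactly four samples and loci.tsv lists the four rows of each mutation consecutively, always in the same sample order. As a rough illustration of the same reshaping without that assumption, here is a minimal Python sketch that pivots on the mutation and sample identifiers instead. The column names (`mutation_id`, `sample_id`, `cluster_id`, `variant_allele_frequency`) are assumptions about the loci.tsv header, `pivot_loci` is a hypothetical helper, and the table is made-up data.

```python
import csv
import io


def pivot_loci(tsv_text, value_column="variant_allele_frequency"):
    """Pivot long-format loci rows into one row per mutation.

    Returns (sample_ids, freq_rows, clusters); freq_rows[i] holds the values
    of mutation i in the order given by sample_ids.
    Column names are assumptions about the loci.tsv header.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    sample_ids = []   # samples in order of first appearance
    values = {}       # mutation_id -> {sample_id: value}
    cluster_of = {}   # mutation_id -> cluster_id
    for row in reader:
        mutation, sample = row["mutation_id"], row["sample_id"]
        if sample not in sample_ids:
            sample_ids.append(sample)
        values.setdefault(mutation, {})[sample] = float(row[value_column])
        cluster_of[mutation] = int(row["cluster_id"])

    freq_rows, clusters = [], []
    for mutation, per_sample in values.items():
        missing = [s for s in sample_ids if s not in per_sample]
        if missing:
            raise ValueError(f"{mutation} has no value for samples {missing}")
        freq_rows.append([per_sample[s] for s in sample_ids])
        clusters.append(cluster_of[mutation])
    return sample_ids, freq_rows, clusters


# Made-up table (two mutations, two samples), for illustration only.
EXAMPLE_TSV = (
    "mutation_id\tsample_id\tcluster_id\tvariant_allele_frequency\n"
    "m1\tS1\t0\t0.40\n"
    "m1\tS2\t0\t0.10\n"
    "m2\tS1\t1\t0.05\n"
    "m2\tS2\t1\t0.30\n"
)
sample_ids, freq_rows, clusters = pivot_loci(EXAMPLE_TSV)
print(sample_ids)   # content of sample_id
print(freq_rows)    # content of freq.txt, one row per mutation
print(clusters)     # content of cluster.txt, one value per mutation
```

Keying on the identifiers means a missing or reordered row raises an error instead of silently shifting values into the wrong sample column.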
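Running TimeScape
The package ships with several examples, but they are run internally through example(); running that directly will probably leave most people confused, so it is easier to read the vignette or the help page:
example("timescape") ### several visualizations will open in your browser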
browseVignettes("timescape")
#or:
?timescape
Here is one of the examples:
# EXAMPLE 1 - Acute myeloid leukemia patient, Ding et al., 2012
# genotype tree edges
tree_edges <- read.csv(system.file("extdata", "AML_tree_edges.csv", package = "timescape"))
tree_edges
#   source target
# 1      1      2
# 2      1      3
# 3      3      4
# 4      4      5
# clonal prevalences
clonal_prev <- read.csv(system.file("extdata", "AML_clonal_prev.csv", package = "timescape"))
clonal_prev
#   timepoint clone_id clonal_prev
# 1 Diagnosis        1      0.1274
# 2 Diagnosis        2      0.5312
# 3 Diagnosis        3      0.2904
# 4 Diagnosis        4      0.0510
# 5   Relapse        5      1.0000
# targeted mutations
mutations <- read.csv(system.file("extdata", "AML_mutations.csv", package = "timescape"))
head(mutations)
#   Tier chrom    coord clone_id timepoint    VAF
# 1    3     1  2554021        1 Diagnosis 0.4383
# 2    3     1 11965332        1 Diagnosis 0.4123
# 3    3     1 18952534        1 Diagnosis 0.4891
# 4    3     1 20382629        1 Diagnosis 0.4754
# 5    3     1 28395117        5 Diagnosis 0.0004
# 6    3     1 30729775        1 Diagnosis 0.4812
# perturbations
perturbations <- data.frame(pert_name = c("Chemotherapy"),
                            prev_tp = c("Diagnosis"))
perturbations
#      pert_name   prev_tp
# 1 Chemotherapy Diagnosis
# run timescape
timescape(clonal_prev = clonal_prev, tree_edges = tree_edges, perturbations = perturbations, height=260)
Parameters
TimeScape takes both required and optional parameters.
Required parameters:
clonal_prev is a data frame holding the clonal prevalence of each clone at each time point. Its columns are:
character() timepoint - time point
character() clone_id - clone id
numeric() clonal_prev - clonal prevalence
tree_edges is a data frame describing the edges of the rooted clonal phylogeny. Its columns are:
character() source - source node id
character() target - target node id
Optional parameters include:
mutations is a data frame of the mutations arising in each clone. If it is supplied, a mutation table appears at the bottom of the view. Its columns are:
character() chrom - chromosome number
numeric() coord - coordinate of mutation on chromosome
character() clone_id - clone id
character() timepoint - time point
numeric() VAF - variant allele frequency of the mutation in the corresponding timepoint
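Because tree_edges has to describe a rooted tree and every clone in clonal_prev should correspond to a node of that tree, it can help to sanity-check the two tables before plotting. The following Python sketch shows one way to do so, using the AML example values from above; the helper names are hypothetical and the checks are an illustration, not something TimeScape itself runs.

```python
def find_root(tree_edges):
    """Return the single root of a list of (source, target) edges.

    Raises ValueError if the edges do not form one rooted tree.
    """
    targets = [t for _, t in tree_edges]
    if len(targets) != len(set(targets)):
        raise ValueError("a clone has more than one parent")
    nodes = {n for edge in tree_edges for n in edge}
    roots = nodes - set(targets)
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {sorted(roots)}")
    root = next(iter(roots))

    # Walk down from the root; every node must be reachable from it.
    children = {}
    for source, target in tree_edges:
        children.setdefault(source, []).append(target)
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        seen.add(node)
        stack.extend(children.get(node, []))
    if seen != nodes:
        raise ValueError(f"not reachable from root: {sorted(nodes - seen)}")
    return root


def clones_missing_from_tree(clonal_prev, tree_edges):
    """Clone ids present in clonal_prev but absent from the tree."""
    nodes = {n for edge in tree_edges for n in edge}
    return sorted({clone for _, clone, _ in clonal_prev} - nodes)


def prevalence_totals(clonal_prev):
    """Sum of clonal prevalence per time point, for a quick plausibility look."""
    totals = {}
    for timepoint, _, prev in clonal_prev:
        totals[timepoint] = totals.get(timepoint, 0.0) + prev
    return totals


# Values copied from the AML example (ids written as strings).
tree_edges = [("1", "2"), ("1", "3"), ("3", "4"), ("4", "5")]
clonal_prev = [
    ("Diagnosis", "1", 0.1274),
    ("Diagnosis", "2", 0.5312),
    ("Diagnosis", "3", 0.2904),
    ("Diagnosis", "4", 0.0510),
    ("Relapse", "5", 1.0000),
]
print(find_root(tree_edges))
print(clones_missing_from_tree(clonal_prev, tree_edges))
print(prevalence_totals(clonal_prev))
```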
Testing with real data
We now use the files we produced ourselves, cellfreq.txt, tree.txt and sample_id, as input and generate the result as follows:
library(timescape)
options(stringsAsFactors = FALSE)
#example("timescape")
#browseVignettes("timescape")
# plotly and webshot are loaded here but not called below
library(plotly)
library(htmlwidgets)
library(webshot)
library(tidyr)
# genotype tree edges: one "source target" pair per row
tree_edges = read.table("tree.txt")
colnames(tree_edges) = c("source", "target")
# clonal prevalences: one row per sample, one column per clone (0, 1, 2, ...)
cellfreq = read.table("cellfreq.txt")
colnames(cellfreq) = 0:(length(cellfreq) - 1)
sample_id = read.table("sample_id")
cellfreq$timepoint = sample_id[, 1]
# reshape from wide to long: timepoint, clone_id, clonal_prev
clonal_prev = gather(cellfreq, key = "clone_id", value = "clonal_prev", -timepoint)
clonal_prev = clonal_prev[order(clonal_prev$timepoint), ]
clonal_prev
# targeted mutations
# mutations <- read.csv(system.file("extdata", "AML_mutations.csv", package = "timescape"))
p = timescape(clonal_prev = clonal_prev, tree_edges = tree_edges, height = 260)
# write the widget to an HTML file
saveWidget(p, "test.html")
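The key step in the script above is the wide-to-long reshaping done by gather(): each row of cellfreq.txt is one sample and each column one clone, while TimeScape expects one (timepoint, clone_id, clonal_prev) record per row. A small Python sketch of the same reshaping, on made-up numbers, makes the resulting layout explicit. The function name `to_long` is hypothetical, and the sample names only mirror the ones used earlier in this post.

```python
def to_long(cellfreq, sample_ids):
    """Reshape a samples-by-clones matrix into long records.

    cellfreq[i][j] is the prevalence of clone j in sample i.
    Returns (timepoint, clone_id, clonal_prev) tuples sorted by timepoint,
    with clone ids numbered from 0 as in the R script.
    """
    if len(cellfreq) != len(sample_ids):
        raise ValueError("need exactly one sample id per matrix row")
    records = []
    for timepoint, row in zip(sample_ids, cellfreq):
        for clone_id, prev in enumerate(row):
            records.append((timepoint, str(clone_id), prev))
    # Plain string sort on the timepoint name; the sort is stable, so clones
    # stay in column order within each timepoint.
    records.sort(key=lambda record: record[0])
    return records


# Made-up prevalences for two samples and three clones.
cellfreq = [
    [0.2, 0.5, 0.3],
    [0.1, 0.0, 0.9],
]
sample_ids = ["SRR385939", "SRR385938"]
for record in to_long(cellfreq, sample_ids):
    print(*record, sep="\t")
```

Note that the records end up ordered by the text of the sample name, not by the order of the rows in sample_id, so sample names should be chosen so that their sorted order matches the real chronology.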
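About the output file
The package produces an .html file that opens directly in a browser and can then be saved as SVG or PNG. I usually choose SVG because it is easy to edit afterwards in Adobe Illustrator.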
References:
Ding, Li, et al. “Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing.” Nature 481.7382 (2012): 506-510.
Ha, Gavin, et al. “TITAN: inference of copy number architectures in clonal cell populations from tumor whole-genome sequence data.” Genome research 24.11 (2014): 1881-1893.
Malikic, Salem, et al. “Clonality inference in multiple tumor samples using phylogeny.” Bioinformatics 31.9 (2015): 1349-1356.
McPherson, Andrew, et al. “Divergent modes of clonal spread and intraperitoneal mixing in high-grade serous ovarian cancer.” Nature genetics (2016).