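A fusion gene is a chimeric gene formed by joining the coding regions of two or more genes end to end and placing them under the control of a single set of regulatory sequences (promoter, enhancer, ribosome binding site, terminator, and so on). The expression product of a fusion gene is a fusion protein. See the figure below (image source: www.bloodjournal.org/).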
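Common fusion gene analysis tools
SOAPfusion: developed by BGI; uses a hash-index algorithm.
STAR-Fusion: from the Aviv Regev lab. It works in three steps:
- Align the reads to the reference genome with STAR, then select split reads and discordant reads as candidate fusion evidence;
- Align the candidate fusion sequences against the reference gene sequences and predict fusion genes from the overlaps;
- Filter the predictions to remove false positives.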
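The final filtering step can be pictured as thresholding on read support. Below is a minimal sketch, assuming a STAR-Fusion-style tab-separated table with `#FusionName`, `JunctionReadCount`, `SpanningFragCount` and `FFPM` columns. The function name `filter_fusions`, the cutoffs and the example rows are illustrative assumptions, not STAR-Fusion's actual filtering logic.

```python
import csv
import io


def filter_fusions(tsv_text, min_junction=1, min_total=2, min_ffpm=0.1):
    """Return names of predictions that pass simple support cutoffs.

    The cutoffs are hypothetical defaults chosen for illustration.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        junction = int(row["JunctionReadCount"])  # split reads crossing the breakpoint
        spanning = int(row["SpanningFragCount"])  # discordant pairs flanking the breakpoint
        ffpm = float(row["FFPM"])
        enough_reads = junction >= min_junction and junction + spanning >= min_total
        if enough_reads and ffpm >= min_ffpm:
            kept.append(row["#FusionName"])
    return kept


# Made-up rows, for illustration only.
EXAMPLE = (
    "#FusionName\tJunctionReadCount\tSpanningFragCount\tFFPM\n"
    "GENE_A--GENE_B\t12\t5\t0.85\n"
    "GENE_C--GENE_D\t0\t1\t0.05\n"
)

print(filter_fusions(EXAMPLE))
```

Requiring at least one junction read keeps only candidates whose breakpoint is directly observed; whether that is appropriate depends on the library and read length.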
Visualization (based on STAR-Fusion results)
- IGV
- chimeraviz
- Circos
1. IGV
Input files
Starting from the STAR-Fusion results, use the companion tool FusionInspector to filter and consolidate them:
FusionInspector --fusions Sam_fusionList.txt \
-O Sample --genome_lib genome/genome \
--left_fq clean/Sam.R1.fq.gz --right_fq clean/Sam.R2.fq.gz \
--out_prefix finspector --vis  # --vis writes the visualization files
finspector.fa : the candidate fusion-gene contigs
finspector.bed : the reference gene structure annotations for fusion partners
finspector.junction_reads.bam : alignments of the breakpoint-junction supporting reads.
finspector.spanning_reads.bam : alignments of the breakpoint-spanning paired-end reads.
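To view these four files together, IGV can be driven by a batch script. The sketch below only generates the script text. It assumes the files sit in the current directory under the `finspector` prefix and that the BAM files are indexed. The helper name `igv_batch_script` and the snapshot directory are hypothetical.

```python
def igv_batch_script(prefix="finspector", snapshot_dir="igv_snapshots"):
    """Build IGV batch commands that load the FusionInspector outputs."""
    lines = [
        "new",                                  # start a fresh IGV session
        f"genome {prefix}.fa",                  # fusion contigs act as the reference
        f"load {prefix}.bed",                   # gene structures of the fusion partners
        f"load {prefix}.junction_reads.bam",    # reads crossing the breakpoint
        f"load {prefix}.spanning_reads.bam",    # read pairs flanking the breakpoint
        f"snapshotDirectory {snapshot_dir}",    # where later snapshot commands write
    ]
    return "\n".join(lines) + "\n"


print(igv_batch_script(), end="")
```

Saving the printed text to a file and opening it through IGV's batch-script option should reproduce the four-track layout. A `goto` line with a contig name can be appended to jump to a specific fusion.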
Official example figure: IGV-fusion.png
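2. chimeraviz
Input file
The STAR-Fusion result file star-fusion.fusion_candidates.final.abridged.FFPM, i.e. the abridged candidate list with the FFPM column (fusion fragments per million total reads).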
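FFPM normalizes fusion support by sequencing depth. A minimal sketch of the arithmetic, assuming FFPM is the number of fusion-supporting fragments (junction reads plus spanning fragments) per million total fragments. Whether raw or estimated counts are used may differ between STAR-Fusion versions, and the function name `ffpm` and the numbers are made up for illustration.

```python
def ffpm(junction_reads, spanning_frags, total_fragments):
    """Fusion fragments per million total fragments (assumed definition)."""
    if total_fragments <= 0:
        raise ValueError("total_fragments must be positive")
    return (junction_reads + spanning_frags) / total_fragments * 1_000_000


# 12 junction reads + 5 spanning fragments in a 20-million-fragment library.
print(round(ffpm(12, 5, 20_000_000), 2))
```

Because the value scales with library size, the same FFPM cutoff can be compared across samples sequenced to different depths.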
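library(chimeraviz)
fusions <- import_starfusion("test.result.fusion.txt", "hg38")  # import STAR-Fusion calls
length(fusions)  # number of fusions imported
fusion <- plot_(fusions, which_transcripts = "exonBoundary")
# plot_circle() uses base graphics; "fusions_circle.png" is a placeholder file name
png("fusions_circle.png", height = 8, width = 8, units = "in", res = 400)
plot_circle(fusions)
dev.off()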
### Visualizing results for other species (custom annotation database)
# Example deFuse results, BAM file and bedGraph coverage shipped with chimeraviz
defuse833ke <- system.file("extdata", "defuse_833ke_results.filtered.tsv", package = "chimeraviz")
fusion5267and11759reads <- system.file("extdata", "fusion5267and11759reads.bam", package = "chimeraviz")
fusions <- import_defuse(defuse833ke, "hg19")
fusion <- get_fusion_by_gene_name(fusions, "RCC1")  # all fusions involving RCC1
fusion <- get_fusion_by_id(fusions, 5267)           # a single fusion, selected by id
# Annotation comes from an EnsDb SQLite file (here the GRCh37 example bundled with chimeraviz)
edbSqliteFile <- system.file("extdata", "Homo_sapiens.GRCh37.74.sqlite", package = "chimeraviz")
count <- system.file("extdata", "fusion5267and11759reads.bedGraph", package = "chimeraviz")
edb <- ensembldb::EnsDb(edbSqliteFile)
# Coverage is read from the bedGraph file; pass bamfile = fusion5267and11759reads to use the BAM instead
plot_fusion(fusion, edb = edb, non_ucsc = TRUE,
            reduce_transcripts = TRUE, bedgraphfile = count)
plot_fusion(fusion, edb = edb, non_ucsc = FALSE,
            reduce_transcripts = TRUE, bedgraphfile = count)
Figure: transcript information
Figure: gene information
3. Circos
Input file
The STAR-Fusion result file star-fusion.fusion_candidates.final.abridged.
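Circos draws fusions as links, one line per pair of positions. A minimal sketch of the conversion, assuming the abridged table has `LeftBreakpoint` and `RightBreakpoint` columns formatted as `chr:position:strand`, and that the Circos karyotype uses the `hs` prefix of the bundled human karyotype files. The function names and the example rows are illustrative assumptions.

```python
import csv
import io


def breakpoint_to_circos(breakpoint):
    """Convert 'chr7:12345:+' to ('hs7', 12345)."""
    chrom, pos, _strand = breakpoint.split(":")
    name = chrom[3:] if chrom.startswith("chr") else chrom
    return "hs" + name, int(pos)


def fusion_links(tsv_text):
    """Return Circos link lines: chrA startA endA chrB startB endB."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    lines = []
    for row in reader:
        chr1, pos1 = breakpoint_to_circos(row["LeftBreakpoint"])
        chr2, pos2 = breakpoint_to_circos(row["RightBreakpoint"])
        # A breakpoint is a single base, so start and end are the same.
        lines.append(f"{chr1} {pos1} {pos1} {chr2} {pos2} {pos2}")
    return lines


# Made-up rows, for illustration only.
EXAMPLE = (
    "#FusionName\tLeftBreakpoint\tRightBreakpoint\n"
    "GENE_A--GENE_B\tchr1:1000:+\tchrX:2000:-\n"
)

print("\n".join(fusion_links(EXAMPLE)))
```

The resulting lines can be written to a file and referenced from a `<link>` block in the Circos conf. Contigs that are absent from the karyotype file would need to be dropped first.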
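Summary:
1. IGV + FusionInspector:
The workflow has many steps and produces many redundant files, but the resulting view is clear and easy to read.
2. chimeraviz:
Building a custom annotation database file is subject to limitations, but the package accepts results from many fusion callers and offers a rich set of plot types.
3. Circos:
Preparing the data files and the conf file is tedious, but the plots are highly customizable and visually appealing.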
path /public/cluster2/works/lipeng/houxu/HT2020-0110/test/circos
Other related tools and example figures
- karyoploteR
- 3D Genome Browser
- TBtools
- Sushi.R