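1. Background
Overlap/intersect operations on BED files are usually done with the BEDtools intersect command, which finds the overlaps and reports statistics:
bedtools intersect: Find overlapping intervals in various ways
With two or three files this is fairly simple, but with many BED files it becomes tedious. This post introduces a tool built on top of BEDtools: Intervene.
Intervene is a tool for intersecting multiple genomic region sets (BED) and gene sets. In short, it does what a Venn diagram does, which makes it convenient for finding the intersections and unions across multiple BED/GTF/GFF/SNP files (peaks, SVs, and so on).
Official introduction: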
https://intervene.readthedocs.io/en/latest/introduction.html
2. Features
Intervene has three main modules:
(1) venn to compute Venn diagrams of up to 6 sets
(2) upset to compute UpSet plots of multiple sets
(3) pairwise to compute and visualize intersections of genomic sets as clustered heatmap
3. Installation
# pip
pip install intervene
# From source
git clone https://github.com/asntech/intervene.git
cd intervene
python setup.py sdist install
# Required dependencies
Python (>= 2.7): https://www.python.org/
BEDTools (Latest version): https://github.com/arq5x/bedtools2
pybedtools (>= 0.7.9): https://daler.github.io/pybedtools/
Pandas (>= 0.16.0): http://pandas.pydata.org/
Seaborn (>= 0.7.1): http://seaborn.pydata.org/
R (>= 3.0): https://www.r-project.org/
R packages including UpSetR, corrplot
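Before running Intervene it can help to confirm that the external programs it depends on are reachable from the PATH. The following is a minimal sketch, not part of Intervene: `missing_tools` is a hypothetical helper, and the executable names `bedtools` and `Rscript` are assumptions that may differ on your system.

```python
import shutil

def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [t for t in tools if which(t) is None]

if __name__ == "__main__":
    # Executable names are assumptions; adjust them to your installation.
    missing = missing_tools(["bedtools", "Rscript"])
    print("missing:", ", ".join(missing) if missing else "none")
```

The `which` parameter exists only so the lookup can be swapped out, for example when testing on a machine without these programs.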
4. Usage
Test data:
H3K27ac.bed
H3K27me3.bed
H3K4me1.bed
H3K4me2.bed
H3K4me3.bed
Data format:
An example of the BED data format:
chr1 713131 713668 . 560 . 12.759191 15.7 -1
chr1 713872 714208 . 850 . 23.636091 15.7 -1
chr1 714437 714760 . 850 . 23.623177 100.0 -1
chr1 761967 763114 . 701 . 18.056475 15.7 -1
chr1 820974 826939 . 275 . 2.062345 2.3 -1
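In each line above, the first three columns are the chromosome, the start and the end of the region (BED uses a 0-based start and an exclusive end). As a rough sketch of reading such a line, assuming whitespace- or tab-separated fields, the hypothetical helper `parse_bed_line` below keeps only those three columns:

```python
from typing import NamedTuple

class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive

def parse_bed_line(line: str) -> Interval:
    """Parse the first three columns of a BED line; extra columns are ignored."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 columns, got {len(fields)}")
    return Interval(fields[0], int(fields[1]), int(fields[2]))

# First line of the example data shown above.
example = "chr1 713131 713668 . 560 . 12.759191 15.7 -1"
iv = parse_bed_line(example)
print(iv.chrom, iv.start, iv.end, iv.end - iv.start)  # region length is 537 bp
```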
Intervene subcommands
intervene <subcommand> [options]
Intervene: a tool for intersection and visualization of multiple genomic region and gene sets.
For more details check documentation: http://intervene.readthedocs.io
positional arguments:
{venn,upset,pairwise}
List of subcommands
venn Venn diagram of intersection of genomic regions or list sets (upto 6-way).
upset UpSet diagram of intersection of genomic regions or list sets.
pairwise Pairwise intersection and heatmap of N genomic region sets in <BED/GTF/GFF> format.
optional arguments:
-h, --help show this help message and exit
-v, --version show program's version number and exit
-c, --cite show citation information and exit
(1) venn:
intervene venn -i example_data/ENCODE_hESC/*bed --output test/venn --save-overlaps
Generating a 5-way "venn" diagram. Please wait...
Done! Please check your results @ test/venn.
Thank you for using Intervene!
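# Venn diagrams of up to 6 sets (6-way) are supported
# Built on bedtools, so bedtools intersect parameters such as -f can be adjusted (via --bedtools-options)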
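In bedtools intersect, -f sets the minimum overlap required as a fraction of the A interval. The sketch below illustrates that criterion for two intervals on the same chromosome. It is an illustration only, not the bedtools implementation, and the function names are hypothetical.

```python
def overlap_bp(a_start, a_end, b_start, b_end):
    """Number of overlapping bases between two half-open intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))

def passes_min_fraction(a, b, min_frac):
    """True if a and b overlap and the overlap covers at least `min_frac` of a."""
    a_len = a[1] - a[0]
    if a_len <= 0:
        return False
    ov = overlap_bp(a[0], a[1], b[0], b[1])
    return ov > 0 and ov / a_len >= min_frac

a = (100, 200)   # length 100
b = (150, 400)   # overlaps a on [150, 200)
print(overlap_bp(*a, *b))              # 50
print(passes_min_fraction(a, b, 0.5))  # True
print(passes_min_fraction(a, b, 0.8))  # False
```

A stricter threshold therefore shrinks the intersections that end up in the diagram, since fewer pairs of regions count as overlapping.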
Test figure: venn
(2) upset:
intervene upset -i example_data/ENCODE_hESC/*bed --output test/upset
The computed data can also be uploaded to the web app for online plotting.
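Venn and UpSet plots both display how many elements fall into each combination of sets. For list sets (for example gene names) this can be sketched with plain Python sets. The gene names below are made up for illustration, `exclusive_intersections` is a hypothetical helper, and the sketch does not reproduce how Intervene handles genomic regions, which goes through bedtools.

```python
from collections import Counter

def exclusive_intersections(named_sets):
    """Count elements by the exact combination of sets they belong to."""
    counts = Counter()
    all_items = set().union(*named_sets.values())
    for item in all_items:
        # Names of every set that contains this item, in a stable order.
        members = tuple(sorted(n for n, s in named_sets.items() if item in s))
        counts[members] += 1
    return dict(counts)

# Made-up gene lists for illustration.
sets = {
    "A": {"g1", "g2", "g3"},
    "B": {"g2", "g3", "g4"},
    "C": {"g3", "g5"},
}
for combo, n in sorted(exclusive_intersections(sets).items()):
    print("&".join(combo), n)
```

Each element is counted exactly once, under the combination of sets that contains it, so the counts sum to the size of the union.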
(3) pairwise:
intervene pairwise -i example_data/ENCODE_hESC/*bed --type genomic --compute jaccard --htype tribar --output test/pairwise
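Test result: pairwise.png
The orientation of the triangle in the plot can be controlled with --triangle, for example upper. The tribar heatmap type is only recommended when the compute method is set to jaccard or fisher.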
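With --compute jaccard, each pair of files is summarized by a Jaccard statistic. The sketch below assumes the usual definition used by bedtools jaccard, intersection length divided by union length in base pairs, and works on plain (start, end) tuples from a single chromosome. It is not Intervene's code, and the helper names are hypothetical.

```python
def merge(intervals):
    """Merge overlapping or touching half-open intervals on one chromosome."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

def total_bp(intervals):
    """Total number of bases covered by non-overlapping intervals."""
    return sum(end - start for start, end in intervals)

def jaccard(a, b):
    """Intersection bp / union bp for two interval sets on one chromosome."""
    a, b = merge(a), merge(b)
    union = total_bp(merge(a + b))
    # Inclusion-exclusion is valid because a and b are each merged already.
    inter = total_bp(a) + total_bp(b) - union
    return inter / union if union else 0.0

set_a = [(0, 100), (200, 300)]
set_b = [(50, 150), (250, 400)]
print(jaccard(set_a, set_b))  # 100 bp shared / 350 bp covered
```

A value of 0 means the two sets share no bases and 1 means they cover exactly the same bases, which is what the heatmap colours encode under this assumption.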
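In addition, each of the analyses above also writes out the data used for plotting, which can be uploaded to the online Shiny app (https://asntech.shinyapps.io/intervene/) for display, although the site is a bit slow.
Reference: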
Khan A, Mathelier A. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets. BMC Bioinformatics. 2017;18:287. doi: 10.1186/s12859-017-1708-7