Use intersectBed to take intersections - mob604756f4ef89's tech blog - 51CTO Blog
Bioinformatics tools: using bedtools (2) - Jianshu (jianshu.com)
https://www.jianshu.com/p/3f9f231072c6
bedtools merge
# Sort first: bedtools merge expects input sorted by chromosome, then by start position
sort -k1,1 -k2,2n foo.bed > foo.sort.bed
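# merge produces a new set of intervals, each representing a group of merged input intervals. In other words, if a base pair in the genome is covered by 10 features, it now appears only once in the output file (the input, here exons.bed, must already be sorted)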
bedtools merge -i exons.bed
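As a small illustration of the sort-then-merge steps, here is a minimal sketch on a toy file. The file name demo.bed and its coordinates are made up for this example; the merge step only runs if bedtools is on the PATH.

```shell
# Hypothetical toy input: demo.bed and its coordinates are made up for illustration.
printf 'chr1\t100\t200\nchr1\t300\t400\nchr1\t150\t250\nchr2\t50\t80\n' > demo.bed

# Sort by chromosome, then numerically by start position.
sort -k1,1 -k2,2n demo.bed > demo.sort.bed

# Merge overlapping intervals (skipped if bedtools is not installed).
if command -v bedtools >/dev/null 2>&1; then
    bedtools merge -i demo.sort.bed
else
    echo "bedtools not found; skipping merge" >&2
fi
```

After sorting, the two overlapping chr1 intervals (100-200 and 150-250) sit next to each other, so merge should report them as a single interval chr1 100 250, while chr1 300-400 and chr2 50-80 pass through unchanged.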
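Counting the number of overlapping intervals
A more sophisticated approach is not only to merge overlapping intervals, but also to report how many input intervals were merged into each new interval. This is done with the -c and -o options together.
The -c option specifies one or more input columns to summarize. The -o option defines the operation applied to each column listed in -c; available operations include count, mean, sum, min, max, and others.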
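To make the count concrete, the sketch below uses awk as a stand-in for the merge-and-count logic on a toy file. This is not how bedtools is implemented; it only mimics what `bedtools merge -i <file> -c 1 -o count` reports. The file name count_demo.sort.bed, the variable merged_counts, and the coordinates are made up for this example.

```shell
# Hypothetical toy input, already sorted by chromosome and start (made-up coordinates).
printf 'chr1\t100\t200\nchr1\t150\t250\nchr1\t300\t400\nchr2\t50\t80\n' > count_demo.sort.bed

# awk stand-in for merge + count (NOT the bedtools implementation):
# extend the current merged interval while the next input interval overlaps
# or touches it, and count how many input intervals went into it.
merged_counts=$(awk 'BEGIN { OFS = "\t" }
    NR > 1 && $1 == chrom && $2 + 0 <= end {
        if ($3 + 0 > end) end = $3 + 0     # grow the merged interval
        n++
        next
    }
    NR > 1 { print chrom, start, end, n }  # emit the finished interval
    { chrom = $1; start = $2 + 0; end = $3 + 0; n = 1 }
    END { if (NR > 0) print chrom, start, end, n }
' count_demo.sort.bed)
printf '%s\n' "$merged_counts"

# The bedtools call this mimics (run only if bedtools is installed).
if command -v bedtools >/dev/null 2>&1; then
    bedtools merge -i count_demo.sort.bed -c 1 -o count
fi
```

The fourth output column is the number of input intervals behind each merged interval: 2 for chr1 100-250, and 1 for each of the other two.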
mergeBed --help
-c Specify columns from the B file to map onto intervals in A.
Default: 5.
Multiple columns can be specified in a comma-delimited list.
-o Specify the operation that should be applied to -c.
Valid operations:
sum, min, max, absmin, absmax,
mean, median, mode, antimode
stdev, sstdev
collapse (i.e., print a delimited list (duplicates allowed)),
distinct (i.e., print a delimited list (NO duplicates allowed)),
distinct_sort_num (as distinct, sorted numerically, ascending),
distinct_sort_num_desc (as distinct, sorted numerically, desscending),
distinct_only (delimited list of only unique values),
count
count_distinct (i.e., a count of the unique values in the column),
first (i.e., just the first value in the column),
last (i.e., just the last value in the column),
Default: sum
Multiple operations can be specified in a comma-delimited list.
If there is only column, but multiple operations, all operations will be
applied on that column. Likewise, if there is only one operation, but
multiple columns, that operation will be applied to all columns.
Otherwise, the number of columns must match the the number of operations,
and will be applied in respective order.
E.g., "-c 5,4,6 -o sum,mean,count" will give the sum of column 5,
the mean of column 4, and the count of column 6.
The order of output columns will match the ordering given in the command.
-delim Specify a custom delimiter for the collapse operations.
- Example: -delim "|"
- Default: ",".
-prec Sets the decimal precision for output (Default: 5)
-bed If using BAM input, write output as BED.
-header Print the header from the A file prior to results.
-nobuf Disable buffered output. Using this option will cause each line
of output to be printed as it is generated, rather than saved
in a buffer. This will make printing large output files
noticeably slower, but can be useful in conjunction with
other software tools and scripts that need to process one
line of bedtools output at a time.
-iobuf Specify amount of memory to use for input buffer.
Takes an integer argument. Optional suffixes K/M/G supported.
Note: currently has no effect with compressed files.
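The help text above says that several columns and operations can be paired in order, and that -delim changes the separator used by collapse. A minimal sketch of both, assuming a toy BED file with a name in column 4 and a score in column 5 (the file name named_demo.sort.bed and all values are made up):

```shell
# Hypothetical toy input with name (column 4) and score (column 5); all values are made up.
# The file is already sorted by chromosome and start.
printf 'chr1\t100\t200\texonA\t10\nchr1\t150\t250\texonB\t20\nchr2\t50\t80\texonC\t5\n' > named_demo.sort.bed

# Two columns, two operations, applied in order: collapse column 4, average column 5.
# -delim "|" replaces the default comma in the collapsed list.
if command -v bedtools >/dev/null 2>&1; then
    bedtools merge -i named_demo.sort.bed -c 4,5 -o collapse,mean -delim "|"
else
    echo "bedtools not found; skipping merge" >&2
fi
```

For the merged chr1 interval, the name column should list both exon names separated by `|`, and the score column should hold the mean of the two scores.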