bedtools makewindows: generating a window-region file

Author: Amy_Cui | Published: 2019-07-09 13:08

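    The goal here was simply to produce this file. The command is: bedtools makewindows -g genome.chr.ln -w 200000 >200K.genome.3col. With -w alone the windows are adjacent and non-overlapping; adding -s <step_size> makes them slide. You could also write a one-liner to do the same job; here I use bedtools.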

    [Figure: example of genome.chr.ln] Result file: a BED file of consecutive 200 kb windows.
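The one-liner alternative mentioned above could look roughly like the awk sketch below. It only covers the adjacent-window case (-w without -s), and the file names demo.genome and demo.windows.bed as well as the sequence names and sizes are made up for the demonstration.

```shell
# Hypothetical two-column genome file (name<TAB>size), invented for this demo.
printf 'chrA\t450000\nchrB\t200000\n' > demo.genome

# Emit adjacent windows of width w for every sequence; the last window of a
# sequence is clipped to the sequence length.
awk -v w=200000 'BEGIN { FS = OFS = "\t" }
{
    for (start = 0; start < $2; start += w) {
        end = (start + w < $2) ? start + w : $2
        print $1, start, end
    }
}' demo.genome > demo.windows.bed

cat demo.windows.bed
```

For real work bedtools remains the more convenient choice, since it also handles BED input, step sizes and window naming.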

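    Installation

    The official site lists several installation methods.

    Installing with conda (bedtools is distributed through the bioconda channel, so add -c bioconda if that channel is not already configured)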

    $ conda install bedtools
    $ bedtools --help
    bedtools is a powerful toolset for genome arithmetic.
    
    Version:   v2.28.0
    About:     developed in the quinlanlab.org and by many contributors worldwide.
    Docs:      http://bedtools.readthedocs.io/
    Code:      https://github.com/arq5x/bedtools2
    Mail:      https://groups.google.com/forum/#!forum/bedtools-discuss
    
    Usage:     bedtools <subcommand> [options]
    ...
    

    Installing from source

    $ wget https://github.com/arq5x/bedtools2/releases/download/v2.28.0/bedtools-2.28.0.tar.gz
    $ tar -zxvf bedtools-2.28.0.tar.gz
    $ cd bedtools2
    $ make
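Once make finishes, the compiled programs sit in the bin/ subdirectory of the source tree. A minimal sketch for making them reachable in the current shell session, assuming it is run from inside the bedtools2 directory:

```shell
# Assumes the current directory is the bedtools2 source tree after `make`.
export PATH="$PWD/bin:$PATH"

# Report which bedtools (if any) the shell now resolves.
command -v bedtools || echo "bedtools not found on PATH"
```

To make this permanent, the export line would go into a shell start-up file with the absolute path written out.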
    

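    Installing with a system package manager

    Administrator privileges are required.

    Fedora / CentOS. Adam Huffman has created a Red Hat package for bedtools so that the latest version can be installed easily with the Fedora package manager "yum". It should work on Fedora 13, 14 and EPEL5 / 6 (for CentOS, Scientific Linux, etc.).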

    yum install BEDTools
    

    Debian / Ubuntu. Charles Plessy also maintains a Debian package, which is available in derivatives such as Ubuntu. Many thanks to Charles for doing this.

    apt-get install bedtools
    

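    Homebrew. Carlos Borroto has made BEDTools available through the Homebrew package manager for OS X. (The homebrew/science tap appears to have been retired since this was written, so on a current Homebrew the tap step may be unnecessary.)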

    brew tap homebrew/science
    brew install bedtools
    

    MacPorts. Alternatively, the MacPorts ports system can be used to install BEDTools on OS X.

    port install bedtools
    

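    Using makewindows

    Read the help text of bedtools makewindows. The examples at the bottom are very practical, and working through them is enough to get started.

    Preparing the -g file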

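    # Build a sequence dictionary; each @SQ line carries SN:<name> and LN:<length>
    samtools dict /public/reference/genome/hg38/hg38.fa >hg38.fa.dict
    # Alternative from the bedtools help: extract chromosome sizes from the UCSC Genome Browser's MySQL database. For example, H. sapiens:
    #  mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e \
    #        "select chrom, size from hg19.chromInfo" > hg19.genome
    # Drop the @HD header, split the TAG:value pairs on ":", keep name (field 3) and length (field 5)
    grep -v '^@HD' hg38.fa.dict|sed 's/:/\t/g'|cut -f 3,5 >genome.chr.ln
    # Cut the genome into adjacent 200 kb windows
    bedtools makewindows -g genome.chr.ln -w 200000 >200K.genome.3col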
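The sed/cut pipeline above splits on every colon, so it relies on sequence names that contain none. Some reference builds do carry colons in names (certain HLA contigs, for example). A slightly more defensive sketch looks up the SN: and LN: tags by name instead of by position. The dictionary lines below are fabricated in the layout that samtools dict writes, and the demo.* file names are placeholders.

```shell
# Fabricated sequence-dictionary lines; names, lengths, checksums and the UR
# path are invented for this demo.
printf '@HD\tVN:1.0\tSO:unsorted\n' > demo.dict
printf '@SQ\tSN:chrA\tLN:450000\tM5:0123abcd\tUR:file:/tmp/demo.fa\n' >> demo.dict
printf '@SQ\tSN:HLA-demo*01:01\tLN:3500\tM5:4567ef01\tUR:file:/tmp/demo.fa\n' >> demo.dict

# For each @SQ line, find the SN: and LN: tags wherever they appear and print
# name<TAB>length. Everything after the "SN:" prefix is kept, colons included.
awk 'BEGIN { FS = OFS = "\t" }
$1 == "@SQ" {
    name = ""; len = ""
    for (i = 2; i <= NF; i++) {
        if ($i ~ /^SN:/) name = substr($i, 4)
        if ($i ~ /^LN:/) len  = substr($i, 4)
    }
    if (name != "" && len != "") print name, len
}' demo.dict > demo.genome.chr.ln

cat demo.genome.chr.ln
```

Another common route, if a FASTA index is acceptable, is to run samtools faidx on the reference and take the first two columns of the resulting .fai file (sequence name and length) with cut -f1,2.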
    

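    That covers everything I needed!


    The examples in the help text are very clear (running bedtools makewindows without arguments prints the two errors below, followed by the full help)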

    
    *****
    *****ERROR: Need -g (genome file) or -b (BED file) for interval source. 
    *****
    
    *****
    *****ERROR: Need -w (window size) or -n (number of windows). 
    *****
    
    Tool: bedtools makewindows
    Version: v2.28.0
    Summary: Makes adjacent or sliding windows across a genome or BED file.
    
    Usage: bedtools makewindows [OPTIONS] [-g <genome> OR -b <bed>]
     [ -w <window_size> OR -n <number of windows> ]
    
    Input Options: 
        -g <genome>
            Genome file size (see notes below).
            Windows will be created for each chromosome in the file.
    
        -b <bed>
            BED file (with chrom,start,end fields).
            Windows will be created for each interval in the file.
    
    Windows Output Options: 
        -w <window_size>
            Divide each input interval (either a chromosome or a BED interval)
            to fixed-sized windows (i.e. same number of nucleotide in each window).
            Can be combined with -s <step_size>
    
        -s <step_size>
            Step size: i.e., how many base pairs to step before
            creating a new window. Used to create "sliding" windows.
            - Defaults to window size (non-sliding windows).
    
        -n <number_of_windows>
            Divide each input interval (either a chromosome or a BED interval)
            to fixed number of windows (i.e. same number of windows, with
            varying window sizes).
    
        -reverse
             Reverse numbering of windows in the output, i.e. report 
             windows in decreasing order
    
    ID Naming Options: 
        -i src|winnum|srcwinnum
            The default output is 3 columns: chrom, start, end .
            With this option, a name column will be added.
             "-i src" - use the source interval's name.
             "-i winnum" - use the window number as the ID (e.g. 1,2,3,4...).
             "-i srcwinnum" - use the source interval's name with the window number.
            See below for usage examples.
    
    Notes: 
        (1) The genome file should tab delimited and structured as follows:
         <chromName><TAB><chromSize>
    
        For example, Human (hg19):
        chr1    249250621
        chr2    243199373
        ...
        chr18_gl000207_random   4262
    
    Tips: 
        One can use the UCSC Genome Browser's MySQL database to extract
        chromosome sizes. For example, H. sapiens:
    
        mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e \
        "select chrom, size from hg19.chromInfo" > hg19.genome
    
    Examples: 
     # Divide the human genome into windows of 1MB:
     $ bedtools makewindows -g hg19.txt -w 1000000
     chr1 0 1000000
     chr1 1000000 2000000
     chr1 2000000 3000000
     chr1 3000000 4000000
     chr1 4000000 5000000
     ...
    
     # Divide the human genome into sliding (=overlapping) windows of 1MB, with 500KB overlap:
     $ bedtools makewindows -g hg19.txt -w 1000000 -s 500000
     chr1 0 1000000
     chr1 500000 1500000
     chr1 1000000 2000000
     chr1 1500000 2500000
     chr1 2000000 3000000
     ...
    
     # Divide each chromosome in human genome to 1000 windows of equal size:
     $ bedtools makewindows -g hg19.txt -n 1000
     chr1 0 249251
     chr1 249251 498502
     chr1 498502 747753
     chr1 747753 997004
     chr1 997004 1246255
     ...
    
     # Divide each interval in the given BED file into 10 equal-sized windows:
     $ cat input.bed
     chr5 60000 70000
     chr5 73000 90000
     chr5 100000 101000
     $ bedtools makewindows -b input.bed -n 10
     chr5 60000 61000
     chr5 61000 62000
     chr5 62000 63000
     chr5 63000 64000
     chr5 64000 65000
     ...
    
     # Add a name column, based on the window number: 
     $ cat input.bed
     chr5  60000  70000 AAA
     chr5  73000  90000 BBB
     chr5 100000 101000 CCC
     $ bedtools makewindows -b input.bed -n 3 -i winnum
     chr5        60000   63334   1
     chr5        63334   66668   2
     chr5        66668   70000   3
     chr5        73000   78667   1
     chr5        78667   84334   2
     chr5        84334   90000   3
     chr5        100000  100334  1
     chr5        100334  100668  2
     chr5        100668  101000  3
     ...
    
     # Reverse window numbers: 
     $ cat input.bed
     chr5  60000  70000 AAA
     chr5  73000  90000 BBB
     chr5 100000 101000 CCC
     $ bedtools makewindows -b input.bed -n 3 -i winnum -reverse
     chr5        60000   63334   3
     chr5        63334   66668   2
     chr5        66668   70000   1
     chr5        73000   78667   3
     chr5        78667   84334   2
     chr5        84334   90000   1
     chr5        100000  100334  3
     chr5        100334  100668  2
     chr5        100668  101000  1
     ...
    
     # Add a name column, based on the source ID + window number: 
     $ cat input.bed
     chr5  60000  70000 AAA
     chr5  73000  90000 BBB
     chr5 100000 101000 CCC
     $ bedtools makewindows -b input.bed -n 3 -i srcwinnum
     chr5        60000   63334   AAA_1
     chr5        63334   66668   AAA_2
     chr5        66668   70000   AAA_3
     chr5        73000   78667   BBB_1
     chr5        78667   84334   BBB_2
     chr5        84334   90000   BBB_3
     chr5        100000  100334  CCC_1
     chr5        100334  100668  CCC_2
     chr5        100668  101000  CCC_3
     ...
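The window boundaries in the -n examples above are consistent with a simple rule: take the ceiling of interval length divided by n as the window size, then clip the last window to the interval end. The awk sketch below applies that rule to the same three intervals and names each window like -i srcwinnum does. The rule is inferred from the example output, not taken from the bedtools source, so edge cases (very short intervals, for instance) may behave differently in the real tool. The demo.* file names are placeholders.

```shell
# The three intervals from the help text examples, tab-delimited.
printf 'chr5\t60000\t70000\tAAA\n'   >  demo.input.bed
printf 'chr5\t73000\t90000\tBBB\n'   >> demo.input.bed
printf 'chr5\t100000\t101000\tCCC\n' >> demo.input.bed

# Split every interval into n windows: size = ceil(length / n), with the final
# window clipped to the interval end. The name column is <source name>_<window number>.
awk -v n=3 'BEGIN { FS = OFS = "\t" }
{
    len  = $3 - $2
    size = int((len + n - 1) / n)
    num  = 0
    for (start = $2; start < $3; start += size) {
        end = (start + size < $3) ? start + size : $3
        num++
        print $1, start, end, $4 "_" num
    }
}' demo.input.bed > demo.nwin.bed

cat demo.nwin.bed
```

For the first interval the length is 10000, so the size is 3334 and the windows are 60000-63334, 63334-66668 and 66668-70000, which matches the AAA_1 to AAA_3 lines in the help text.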
    
