Re-annotating Rattus norvegicus Gene Expression Probes for lncRNA

Author: dming1024 | Published: 2019-05-02 18:50

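    In the previous post on re-annotating the rat gene expression probes, a small problem came up: the last step of the re-annotation printed the following warning: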

    intersectBed -a Rat230_2_probe.bed -b Rattus_norvegicus.Rnor_6.0.96.gtf -wa -wb > x.txt
    ***** WARNING: File Rat230_2_probe.bed has inconsistent naming convention for record:
    chr6    108169080   108169105   Rat230_2:1367452_at;    1   -
    
    ***** WARNING: File Rat230_2_probe.bed has inconsistent naming convention for record:
    chr6    108169080   108169105   Rat230_2:1367452_at;    1   -
    

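    The warning says that the record at this chr6 position uses an inconsistent naming convention, so I looked at that position to see what was going on: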

    cat Rat230_2_probe.bed|sed -n '/108169080/p'
    chr6    108169080   108169105   Rat230_2:1367452_at;    1   -
    

    Looking at the whole .bed file shows that the mismatch actually starts at the very first line:

    cat Rat230_2_probe.bed|less -SN
          1 chr6    108169080       108169105       Rat230_2:1367452_at;    1       -
          2 chr5    15325895        15325920        Rat230_2:1367452_at;    1       +
          3 chr10   105608591       105608616       Rat230_2:1367452_at;    1       -
          4 chr5    15325937        15325962        Rat230_2:1367452_at;    1       +
          5 chr5    15325986        15326011        Rat230_2:1367452_at;    1       +
          6 chr5    15326001        15326026        Rat230_2:1367452_at;    1       +
          7 chr6    108168877       108168902       Rat230_2:1367452_at;    1       -
          8 chr6    108168798       108168823       Rat230_2:1367452_at;    1       -
          9 chr6    108168774       108168799       Rat230_2:1367452_at;    1       -
         10 chr10   105608337       105608362       Rat230_2:1367452_at;    1       -
         11 chr11   81380587        81380612        Rat230_2:1367452_at;    1       +
    

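    Time to Google it. Sure enough, the cause is a difference in chromosome naming: the .bed file names every chromosome with a chr prefix, while the .gtf file uses bare names such as 1, 2, 3... That makes the fix easy: strip the chr prefix from every record in the .bed file: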

    cat Rat230_2_probe.bed|sed 's/^chr//' > x_chr.bed
    cat x_chr.bed |less -SN
          1 6       108169080       108169105       Rat230_2:1367452_at;    1       -
          2 5       15325895        15325920        Rat230_2:1367452_at;    1       +
          3 10      105608591       105608616       Rat230_2:1367452_at;    1       -
          4 5       15325937        15325962        Rat230_2:1367452_at;    1       +
          5 5       15325986        15326011        Rat230_2:1367452_at;    1       +
          6 5       15326001        15326026        Rat230_2:1367452_at;    1       +
          7 6       108168877       108168902       Rat230_2:1367452_at;    1       -
          8 6       108168798       108168823       Rat230_2:1367452_at;    1       -
          9 6       108168774       108168799       Rat230_2:1367452_at;    1       -
         10 10      105608337       105608362       Rat230_2:1367452_at;    1       -
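
Before rerunning the intersection, it may be worth checking that the two files now agree on sequence names. The sketch below is only an illustration under a few assumptions: both files are tab-separated, GTF header lines start with `#`, and `chroms_only_in_bed` is a hypothetical helper name (it is not part of bedtools). The GTF rows are made-up toy data, not taken from the Ensembl file.

```shell
# Hypothetical helper (not part of bedtools): print sequence names that occur
# in column 1 of a BED file but never in column 1 of a GTF file.
chroms_only_in_bed() {
    _dir=$(mktemp -d)
    cut -f1 "$1" | LC_ALL=C sort -u > "$_dir/bed_names"
    # GTF header lines start with '#', so skip them before taking column 1
    grep -v '^#' "$2" | cut -f1 | LC_ALL=C sort -u > "$_dir/gtf_names"
    # comm -23 keeps only the lines unique to the first (BED) list
    LC_ALL=C comm -23 "$_dir/bed_names" "$_dir/gtf_names"
    rm -rf "$_dir"
}

# Toy inputs: three probe records from the post plus made-up GTF rows
work=$(mktemp -d)
printf 'chr6\t108169080\t108169105\tRat230_2:1367452_at;\t1\t-\n'   > "$work/probe.bed"
printf 'chr5\t15325895\t15325920\tRat230_2:1367452_at;\t1\t+\n'    >> "$work/probe.bed"
printf 'chr10\t105608591\t105608616\tRat230_2:1367452_at;\t1\t-\n' >> "$work/probe.bed"
printf '#!toy header line\n'                                              > "$work/toy.gtf"
printf '5\ttoy\tgene\t15325000\t15327000\t.\t+\t.\tgene_id "toy_gene_B";\n'    >> "$work/toy.gtf"
printf '6\ttoy\tgene\t108168000\t108170000\t.\t-\t.\tgene_id "toy_gene_A";\n'  >> "$work/toy.gtf"
printf '10\ttoy\tgene\t105608000\t105609000\t.\t-\t.\tgene_id "toy_gene_C";\n' >> "$work/toy.gtf"

# Names that fail to match before and after stripping the chr prefix
before=$(chroms_only_in_bed "$work/probe.bed" "$work/toy.gtf")
sed 's/^chr//' "$work/probe.bed" > "$work/x_chr.bed"
after=$(chroms_only_in_bed "$work/x_chr.bed" "$work/toy.gtf")

echo "only in BED before the fix: $(echo $before)"
echo "only in BED after the fix:  $(echo $after)"
```

If the second list is empty, every sequence name in the .bed file also exists in the .gtf file. Any name still listed (for example an unplaced scaffold) would need its own mapping.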
    

    Running intersectBed again then works:

    intersectBed -a x_chr.bed -b Rattus_norvegicus_lincRNA.gtf -wa -wb > Rattus_probe.txt
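
As a possible follow-up, the `-wa -wb` output can be reduced to a probe-to-gene table. This is a rough sketch, not the method used in the post. It assumes the output has 15 tab-separated columns (6 from the .bed record followed by 9 from the .gtf record) and that the GTF attribute column contains a `gene_id "..."` entry. `probe_gene_pairs` is a hypothetical helper name, and the gene IDs and the second probe ID in the toy rows are invented.

```shell
# Hypothetical helper: print unique "probe_set<TAB>gene_id" pairs from
# intersectBed -wa -wb output (6 BED columns followed by 9 GTF columns).
probe_gene_pairs() {
    awk -F'\t' '
        {
            probe = $4
            sub(/^[^:]*:/, "", probe)   # drop the array prefix, e.g. "Rat230_2:"
            sub(/;$/, "", probe)        # drop the trailing semicolon
            if (match($15, /gene_id "[^"]+"/)) {
                # skip the 9 leading characters (gene_id, space, quote) and the closing quote
                gene = substr($15, RSTART + 9, RLENGTH - 10)
                print probe "\t" gene
            }
        }' "$1" | LC_ALL=C sort -u
}

# Toy rows in the assumed 15-column layout (gene IDs are made up)
work=$(mktemp -d)
printf '6\t108169080\t108169105\tRat230_2:1367452_at;\t1\t-\t6\ttoy\tgene\t108168000\t108170000\t.\t-\t.\tgene_id "toy_gene_A";\n'  > "$work/toy_probe.txt"
printf '6\t108168877\t108168902\tRat230_2:1367452_at;\t1\t-\t6\ttoy\tgene\t108168000\t108170000\t.\t-\t.\tgene_id "toy_gene_A";\n' >> "$work/toy_probe.txt"
printf '5\t15325895\t15325920\tRat230_2:1367453_at;\t1\t+\t5\ttoy\tgene\t15325000\t15327000\t.\t+\t.\tgene_id "toy_gene_B";\n'     >> "$work/toy_probe.txt"

pairs=$(probe_gene_pairs "$work/toy_probe.txt")
echo "$pairs"
```

Several probes from the same probe set can hit the same gene, so the helper collapses duplicates with `sort -u`.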
    


    Article link: https://www.haomeiwen.com/subject/bktonqtx.html