SyRI: software for genome structural variation (SV) detection and visualization


Author: 花生学生信 | Published: 2022-05-13 06:56

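    SyRI (Synteny and Rearrangement Identifier) is a tool for detecting and visualizing structural variation between genomes.
    SyRI first identifies rearranged regions, then searches for local sequence differences and classifies each one by whether it lies in a syntenic or a rearranged region.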

    Required software and environment

    minimap2
    nucmer (part of MUMmer)
    Python 3.5
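Before starting, it can help to confirm that the required tools are actually on `PATH`. The following is a minimal sketch, not part of SyRI itself: the helper names `missing_tools` and `python_ok` are made up for illustration, and the tool list simply mirrors the commands used later in this post.

```python
import shutil
import sys


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_ok(minimum=(3, 5)):
    """True if the running interpreter is at least `minimum` (major, minor)."""
    return sys.version_info[:2] >= minimum


# Commands used in this post; adjust the list to your own setup.
needed = ["minimap2", "nucmer", "delta-filter", "show-coords"]
print("missing tools:", missing_tools(needed))
print("Python version OK:", python_ok())
```

An empty list after `missing tools:` means every command was found.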

    After some trial and error, I found that directly running pipline.sh in the /syri/syri/example/ folder is a quick way to set up the environment.

    Run script

    syri=/public/home/lianglunping/liangtmp/syri/syri/syri/bin/syri
    
    
    ln -sf /public/home/fengting/task/5.12ragtag/dataR498/H7L1.fa refgenome
    ln -sf /public/home/fengting/task/5.12ragtag/dataR498/H7L26.fa qrygenome
    
    
    # Route 1: whole-genome alignment with minimap2, then SyRI on the SAM file
    minimap2 -ax asm5 --eqx refgenome qrygenome > out.sam
    
    python3 $syri -c out.sam -r refgenome -q qrygenome -k -F S
    
    # Route 2 (alternative to route 1): alignment with nucmer.
    # Both routes write syri.out to the current directory, so a second run overwrites the first.
    nucmer --maxmatch -c 100 -b 500 -l 50 refgenome qrygenome       # Whole genome alignment. Any other alignment can also be used.
    delta-filter -m -i 90 -l 100 out.delta > out.filtered.delta     # Remove small and lower quality alignments
    show-coords -THrd out.filtered.delta > out.filtered.coords      # Convert alignment information to a .TSV format as required by SyRI
    python3 $syri -c out.filtered.coords -d out.filtered.delta -r refgenome -q qrygenome
    
    /public/home/lianglunping/liangtmp/syri/syri/syri/bin/plotsr syri.out refgenome qrygenome  # Draw the plot
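Once the script above has produced `syri.out`, a quick tally of annotation types is a handy sanity check. The sketch below assumes the tab-separated layout described in the SyRI documentation, where column 11 holds the annotation type (SYN, INV, SNP, ...). The function name `count_annotations` and the three demo rows are invented for illustration and are not real SyRI output.

```python
from collections import Counter


def count_annotations(lines):
    """Tally the annotation type (assumed to be column 11) of syri.out lines."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 11:          # skip blank or truncated lines
            counts[fields[10]] += 1
    return counts


# Made-up rows in the assumed 12-column layout, for demonstration only.
demo = [
    "Chr1\t1\t1000\t-\t-\tChr1\t1\t1000\tSYN1\t-\tSYN\t-\n",
    "Chr1\t500\t500\tA\tG\tChr1\t500\t500\tSNP1\tSYN1\tSNP\t-\n",
    "Chr1\t2000\t3000\t-\t-\tChr1\t5000\t4000\tINV1\t-\tINV\t-\n",
]
print(count_annotations(demo))
```

On real data the same function can be fed an open file handle, for example `with open("syri.out") as fh: print(count_annotations(fh))`.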
    
    

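    The software runs rather slowly. If you want to split the genome into individual chromosomes and process them separately, you can run the following code: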

    perl seq.perl chrlist Input > Output

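    #!/usr/bin/perl -w
    ### seq.perl: extract the sequences named in a list from a FASTA file
    ### usage: perl seq.perl chrlist Input > Output
    use strict;
    die "perl $0 <lst><fa>\n" unless  @ARGV==2;
    my ($lst,$fa)=@ARGV;
    open(IN,$lst) or die "cannot open $lst: $!";
    my %ha;
    map{chomp;$ha{(split)[0]}=1}<IN>;    # first column of each line is a sequence ID to keep
    close IN;
    
    # read gzip-compressed input through gzip, plain input directly
    if($fa=~/gz$/){open(IN,"gzip -cd $fa|") or die "cannot open $fa: $!";}
    else{open(IN,$fa) or die "cannot open $fa: $!";}
    $/=">";<IN>;$/="\n";                 # skip everything up to the first '>'
    my %out;
    while(<IN>){
        my ($info)=/^(\S+)/;             # sequence ID = first word of the header line
        $/=">";
        my $seq=<IN>;                    # everything up to the next '>' is the sequence
        $/="\n";
        $seq=~s/>|\r|\*//g;
        print ">$info\n$seq" if(exists $ha{$info} && ! exists $out{$info});
        $out{$info}=1;                   # print each ID only once
    }
    close IN;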
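For readers more comfortable with Python, the same idea can be sketched as below. This is an illustrative equivalent rather than a line-by-line port of the Perl script: the names `read_ids` and `filter_fasta` are hypothetical, and unlike the Perl version it does not strip `\r` or `*` characters from the sequence.

```python
def read_ids(lines):
    """Collect the first whitespace-separated field of each non-blank line."""
    return {line.split()[0] for line in lines if line.strip()}


def filter_fasta(lines, wanted):
    """Yield the lines of FASTA records whose ID is in `wanted`.

    Headers are reduced to the bare ID, and only the first record
    with a given ID is kept.
    """
    seen = set()
    keep = False
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            keep = seq_id in wanted and seq_id not in seen
            seen.add(seq_id)
            if keep:
                yield ">" + seq_id + "\n"
        elif keep:
            yield line


# Tiny in-memory example; real input would come from open file handles.
fasta = [">chr1 some description\n", "ACGT\n", ">chr2\n", "GGCC\n", ">chr1\n", "TTTT\n"]
print("".join(filter_fasta(fasta, read_ids(["chr1\n"]))), end="")
```

Because `filter_fasta` is a generator over lines, it never holds a whole chromosome in memory, which matters for large genomes.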
    
    Alignment result for chr1 after splitting
    ### plot.R
    ### Plot the collinearity (synteny) between chromosomes
    ### Run directly with: Rscript plot.R
    df <- read.table('out.filtered.coords',sep='\t')
    
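    # show-coords -THrd writes 11 columns: the -d option adds the two direction columns
    colnames(df) <- c("ref_start", "ref_end", "qry_start", "qry_end", "ref_len", "qry_len", 
                      "identity", "ref_dir", "qry_dir", "ref_tag", "qry_tag")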
    
    x_range <- range(c(df$ref_start, df$ref_end))
    y_range <- range(c(df$qry_start, df$qry_end))
    
    
    pdf('1.pdf')
    plot.new()
    
    plot.window(xlim = x_range,
                ylim = y_range)
    
    for( i in 1:nrow(df)){
      
      if (df[i,3] < df[i,4]){
        lines(x = df[i,1:2], y = df[i,3:4], col = "red")
      } else{
        lines(x = df[i,1:2], y = df[i,3:4], col = "blue")
      }
      
    }
    
    box()
    axis(1, at = seq(0, x_range[2], 10000), labels = seq(0, x_range[2], 10000) / 10000)
    axis(2, at = seq(0, y_range[2], 10000), labels = seq(0, y_range[2], 10000) / 10000)
    
    dev.off()
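The red/blue rule in the plot (query start smaller than query end means a forward alignment, otherwise a reverse one) can also be used to summarise the coords file numerically. The sketch below is only an illustration of that rule: `summarise_coords` is a hypothetical helper, it assumes columns 3 and 4 are the query start and end as in the R script, and the two demo rows are invented.

```python
def summarise_coords(lines):
    """Sum aligned query length (bp) separately for forward and reverse alignments."""
    totals = {"forward": 0, "reverse": 0}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:            # skip blank or truncated lines
            continue
        qry_start, qry_end = int(fields[2]), int(fields[3])
        # Same test as the R loop: start < end is drawn red (forward), else blue (reverse).
        key = "forward" if qry_start < qry_end else "reverse"
        totals[key] += abs(qry_end - qry_start) + 1
    return totals


# Made-up rows in the 11-column show-coords -THrd layout, for demonstration only.
demo_coords = [
    "1\t100\t1\t100\t100\t100\t99.0\t1\t1\tchr1\tchr1\n",
    "200\t300\t500\t400\t101\t101\t98.0\t1\t-1\tchr1\tchr1\n",
]
print(summarise_coords(demo_coords))
```

For a real run, pass an open handle on `out.filtered.coords` instead of the demo list.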
    
    MUMmer result

    Reference links:

    SyRI: a practical tool for detecting structural variation from assembled genomes - 简书 (jianshu.com)

    SyRI: finding genomic rearrangements and local sequence differences from whole-genome assemblies (biomedcentral.com)

    Pre-requisite for installing SyRI | syri (schneebergerlab.github.io)
