Bioinformatics Data Skills - Pip

Author: Pingouin | Published: 2020-10-29 21:48

    Pipes in Action: Creating Simple Programs with Grep and Pipes

    • The golden rule of bioinformatics is to not trust your tools or your data. This scepticism requires constantly sanity-checking intermediate results, which ensures that your methods aren't biasing your data and that problems in your data aren't being exacerbated by your methods.
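    # Look for non-nucleotide characters in the sequence lines of tb1.fasta.
    # First grep: remove the FASTA header lines (those starting with ">").
    #   -v     invert the match, i.e. print only the lines that do not match
    #   ^      caret: anchors the pattern to the start of the line
    #   |      pipe character: sends the standard output to the next command
    #   \      continues the command on the next line
    # Second grep: find any character that is not A, T, C, or G.
    #   [^ATCG]  inside brackets, a caret matches any character that is
    #            not one of the characters in the brackets
    #   -i       ignore case, so a, t, c, g are accepted as well
    #   --color  highlight the matching non-nucleotide characters
    $ grep -v "^>" tb1.fasta | \
    grep --color -i "[^ATCG]"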
    
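    grep -v > tb1.fasta
    # Caution: written like this, with the > unquoted, the shell treats > as a redirection operator and tb1.fasta is overwritten.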
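
The same sanity check can be tried on a throwaway file before running it on real data. The following is a minimal sketch, assuming a made-up three-record FASTA file written to a temporary directory; the file name `example.fasta` and the variables `bad_lines` and `header_count` are hypothetical and not part of the original notes. It uses `grep -c` to count matching lines instead of coloring them.

```shell
# Build a tiny FASTA file in a temporary directory (made-up example data).
tmpdir=$(mktemp -d)
fasta="$tmpdir/example.fasta"
printf '>seq1\nATCGATCG\n>seq2\nATCGNYCG\n>seq3\natcgatcg\n' > "$fasta"

# Drop the header lines, then count the sequence lines that contain
# at least one character other than A, T, C, or G (case-insensitive).
bad_lines=$(grep -v "^>" "$fasta" | grep -c -i "[^ATCG]")
echo "sequence lines with non-nucleotide characters: $bad_lines"

# Because ">" is quoted, the input file is only read, never overwritten:
# all three header lines are still present afterwards.
header_count=$(grep -c "^>" "$fasta")
echo "header lines still in the file: $header_count"

# Remove the temporary directory.
rm -rf "$tmpdir"
```

Counting rather than printing makes the check easy to use in a script: a non-zero count signals that the file needs a closer look.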
    

    Combining Pipes and Redirection

    • Why do this step?
      Because aligners, assemblers, and SNP callers often use multiple streams simultaneously. Results (e.g. aligned reads, assembled contigs, or SNP calls) are written to the standard output stream, while diagnostic messages, warnings, and errors are written to the standard error stream.
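    $ program1 input.txt 2> program1.stderr | \
    program2 2> program2.stderr > results.txt
    # program1's standard error stream is redirected to the program1.stderr logfile, while its standard output is piped to program2
    # program2's standard error goes to program2.stderr and its standard output to results.txt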
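
To see the two streams being separated, the pattern can be run with stand-in commands. This is a minimal sketch, assuming two hypothetical shell functions named `program1` and `program2` that imitate real tools by writing results to standard output and a diagnostic message to standard error; the messages and the temporary directory are illustrative only.

```shell
# Hypothetical stand-ins for real tools: each writes its results to
# standard output and one diagnostic message to standard error.
program1() { echo "read1"; echo "read2"; echo "program1: warning" >&2; }
program2() { tr 'a-z' 'A-Z'; echo "program2: done" >&2; }

workdir=$(mktemp -d)

# Same shape as the command above: each program gets its own stderr log,
# and only program2's standard output ends up in results.txt.
program1 2> "$workdir/program1.stderr" | \
    program2 2> "$workdir/program2.stderr" > "$workdir/results.txt"

# Read the three files back to check what went where.
results=$(cat "$workdir/results.txt")
log1=$(cat "$workdir/program1.stderr")
log2=$(cat "$workdir/program2.stderr")
echo "results: $results"
echo "program1 log: $log1"
echo "program2 log: $log2"

# Remove the temporary directory.
rm -rf "$workdir"
```

Keeping one log file per program makes it clear which tool produced a given warning, and keeps diagnostics out of the results file.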
    
