Pipes in Action: Creating Simple Programs with Grep and Pipes
- The golden rule of bioinformatics is to not trust your tools or your data. This skepticism requires constantly sanity-checking intermediate results, which ensures that your methods aren't biasing your data and that problems in your data aren't being exacerbated by your methods.
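$ grep -v "^>" tb1.fasta | \
# remove the FASTA header lines (the ones that start with >)
# -v invert the match: print only the lines that do not match the pattern
# ^ caret symbol: outside brackets, anchors the pattern to the start of the line
# | pipe character: pipe the standard output to the next command
# \ continue the command on the next line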
grep --color -i "[^ATCG]"
# [^ATCG] caret symbol: when used inside brackets, a caret matches any character that is **not** one of the characters in the brackets
# -i ignore case, so lowercase a, t, c, g also count as nucleotides
# --color highlight the matching non-nucleotide characters
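grep -v > tb1.fasta
# Written this way, without quotes around the pattern, the shell treats > as a redirection operator rather than part of the pattern, so tb1.fasta would be overwritten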
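To see the pipeline behave as described, here is a minimal sketch that runs it on a tiny made-up FASTA file. The file name demo.fasta, the header, and the sequences are all invented for illustration; they are not from tb1.fasta. It assumes a grep that supports -o (GNU and BSD grep both do).

```shell
# Hypothetical demo data: one header line and three sequence lines.
tmpdir=$(mktemp -d)
printf '>demo_seq made-up header\nATCGGCTA\nATYGCCNA\natcgatcg\n' > "$tmpdir/demo.fasta"

# Drop the header line, then count the lines containing a non-ATCG character.
# The quotes keep the shell from treating > as a redirection operator.
bad_lines=$(grep -v "^>" "$tmpdir/demo.fasta" | grep -c -i "[^ATCG]")
echo "lines with non-nucleotide characters: $bad_lines"

# -o prints each offending character on its own line; sort | uniq -c tallies them.
bad_chars=$(grep -v "^>" "$tmpdir/demo.fasta" | grep -o -i "[^ATCG]" | sort | tr -d '\n')
echo "offending characters: $bad_chars"

# Clean up the temporary demo directory.
rm -rf "$tmpdir"
```

Only the second sequence line contains non-nucleotide characters (Y and N); the lowercase line passes because of -i.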
Combining Pipes and Redirection
- Why do this?
Aligners, assemblers, and SNP callers often use multiple streams simultaneously. Results (e.g., aligned reads, assembled contigs, or SNP calls) are written to the standard output stream, while diagnostic messages, warnings, and errors are written to the standard error stream.
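$ program1 input.txt 2> program1.stderr | \
program2 2> program2.stderr > results.txt
# program1's standard error stream is redirected to the program1.stderr logfile, while its standard output is piped to program2
# program2's standard error stream is redirected to program2.stderr, and its standard output to results.txt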
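The stream separation can be tried out without any real bioinformatics tools. In the sketch below, program1 and program2 are hypothetical shell functions standing in for real programs, and the "reads" they emit are made up; the point is only to show which text lands in which file.

```shell
# program1 and program2 are hypothetical stand-ins, defined as shell functions.
workdir=$(mktemp -d)

program1() {
    echo "program1: reading $1" >&2   # diagnostic message goes to standard error
    printf 'read1\nread2\nread3\n'    # made-up results go to standard output
}

program2() {
    echo "program2: filtering" >&2    # diagnostic message goes to standard error
    grep -v "read2"                   # passes along every stdin line except read2
}

# Same shape as the command above: each stderr gets its own logfile,
# and only program2's standard output ends up in results.txt.
program1 input.txt 2> "$workdir/program1.stderr" | \
    program2 2> "$workdir/program2.stderr" > "$workdir/results.txt"

cat "$workdir/results.txt"
cat "$workdir/program1.stderr"
cat "$workdir/program2.stderr"
```

The diagnostic messages never reach results.txt, because the pipe and the final > only carry standard output. The temporary directory in $workdir can be removed with rm -rf once you have inspected the files.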