Chapter: grep
grep [options] pattern file
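# Common options:
# -w  match the pattern only as a whole word
# -c  print the number of matching lines instead of the lines themselves
# -v  invert the match: print the lines that do not match
# -n  prefix each matching line with its line number
# -r  search for the pattern recursively in a directory
# -e  specify a pattern; repeat it for several patterns, which are alternatives (a line matching any one is printed), not successive filters
# -f  read the patterns from the given file
# -i  ignore case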
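As a quick way to see these options side by side, here is a minimal sketch. The file names `sample.txt` and `patterns.txt` and their contents are made up for illustration and live in a throwaway directory.

```shell
# Build a small throwaway sample (hypothetical content, for illustration only).
tmpdir=$(mktemp -d)
printf '%s\n' 'Linux world' 'linux words' 'Unix' 'the world map' > "$tmpdir/sample.txt"
printf '%s\n' 'Unix' 'map' > "$tmpdir/patterns.txt"

# -w: "world" must be a whole word
grep -w 'world' "$tmpdir/sample.txt"
# -c: count the matching lines instead of printing them
count_world=$(grep -c 'world' "$tmpdir/sample.txt")
# -v: print the lines that do NOT match
grep -v 'world' "$tmpdir/sample.txt"
# -n with -i: show line numbers, ignoring case
grep -n -i 'linux' "$tmpdir/sample.txt"
# -e twice: a line matches if it contains either pattern
count_either=$(grep -c -e 'Unix' -e 'map' "$tmpdir/sample.txt")
# -f: the same two patterns, read from a file (one per line)
count_from_file=$(grep -c -f "$tmpdir/patterns.txt" "$tmpdir/sample.txt")
# -r: search every file under the directory
lines_with_unix=$(grep -r 'Unix' "$tmpdir" | wc -l | tr -d ' ')

rm -r "$tmpdir"
```

Note that `-e` and `-f` give the same count here, since both express "Unix or map"; neither narrows the result of the other.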
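Regular expressions
A regular expression is a logical formula for operating on strings: predefined special characters, and combinations of them, form a "rule string" that expresses a filtering logic to apply to strings.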
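Symbol | Meaning |
---|---|
^ | start of line |
$ | end of line |
. | any single character except a newline |
? | the preceding item zero or one time |
+ | the preceding item one or more times |
* | the preceding item zero or more times |
{n} | the preceding item exactly n times |
{n,} | the preceding item at least n times |
{m,n} | the preceding item at least m and at most n times |
[] | any one of the characters listed in the brackets |
[^] | any one character not listed in the brackets |
\| | alternation: the pattern before it or the pattern after it |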
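One detail the table does not show: in grep's default (basic) mode, `?`, `+`, `{}` and `|` are taken literally unless they are backslash-escaped, while `-E` (extended mode) treats them as operators as written. The sketch below assumes GNU grep, where the escaped forms `\?`, `\+` and `\|` are available; the sample words are made up.

```shell
# Hypothetical sample words, one per line.
sample=$(printf '%s\n' 'amazing' 'Linux' 'wood' 'wd')

# Basic mode: the operator needs a backslash.
bre_plus=$(printf '%s\n' "$sample" | grep -c 'ing\+')
# Extended mode (-E): the same pattern without the backslash.
ere_plus=$(printf '%s\n' "$sample" | grep -c -E 'ing+')
# "in" followed by an optional "g": matches both "amazing" and "Linux".
ere_optional=$(printf '%s\n' "$sample" | grep -c -E 'ing?')
# Alternation: either "wood" or "wd".
ere_alt=$(printf '%s\n' "$sample" | grep -c -E 'wood|wd')
```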
cat readme.txt
###this is the practice text
#Welcome to this world
#The Linux world is fascinating and amazing!
#You will get the regular expression practice
#And these are the necessary path to the Linux world!
cat readme.txt |grep 'd$'
#Welcome to this world
cat readme.txt |grep '^Y'
#You will get the regular expression practice
cat readme.txt |grep 'ing\?' # \? applies only to the item right before it: "in" must match, the "g" is optional
#The Linux world is fascinating and amazing!
#And these are the necessary path to the Linux world!
cat readme.txt |grep 'ing\+'
#The Linux world is fascinating and amazing!

cat readme.txt
###this is the practice text
Welcome to this world
The Linux world is fascinating and amazing!
You will get the regular expression practice
word
wolf
would
wood
woooo
wo
woooooooooooo
And these are the necessary path to the Linux world!
Repetition counts
cat readme.txt |grep 'wo\{2,4\}'
wood
woooo
woooooooooooo
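The twelve-o line is matched even though the upper bound is 4, because grep only needs the pattern to match somewhere in the line, and "w" plus the first few "o" characters is enough. A minimal sketch of the difference, using `-x` (match the whole line) on a made-up word list; in basic mode the braces are written `\{` and `\}`:

```shell
# Hypothetical word list, one word per line.
words=$(printf '%s\n' word wood woooo wo woooooooooooo)

# Unanchored: any line containing "w" followed by at least two "o" matches.
unanchored=$(printf '%s\n' "$words" | grep -c 'wo\{2,4\}')
# -x: the whole line must match, so the upper bound of 4 now matters.
whole_line=$(printf '%s\n' "$words" | grep -x 'wo\{2,4\}')
```

With `-x`, "wood" drops out as well, since the trailing "d" is not covered by the pattern.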
Matching any one character from a set
cat readme.txt |grep 'wo[ourl][df]'
word
wolf
wood
Excluding characters
cat readme.txt |grep -n '^[^wor]'
1:###this is the practice text
2:Welcome to this world
3:The Linux world is fascinating and amazing!
4:You will get the regular expression practice
12:And these are the necessary path to the Linux world!
#### Ignoring case
cat readme.txt |grep -n -i '^[^wor]'
1:###this is the practice text
3:The Linux world is fascinating and amazing!
4:You will get the regular expression practice
12:And these are the necessary path to the Linux world!
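The `^` plays two roles in this pattern: outside the brackets it anchors the match to the start of the line, and as the first character inside the brackets it negates the set. A small sketch on made-up input, which also shows why `-i` removes the line starting with an uppercase "W":

```shell
# Hypothetical input lines.
lines=$(printf '%s\n' 'Welcome' 'word' 'And' 'road')

# First character must not be w, o or r (case-sensitive, so "W" is allowed).
case_sensitive=$(printf '%s\n' "$lines" | grep '^[^wor]')
# With -i the set also covers W, O and R, so "Welcome" is excluded too.
case_insensitive=$(printf '%s\n' "$lines" | grep -i '^[^wor]')
```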
Exercises
Match the lines that contain exon or CDS
less /data/server/reference/gtf/gencode/gencode.v25.annotation.gtf|grep -w 'chrY'|grep -w 'exon\|CDS'
Which feature types appear in the third column of the chromosome Y annotation?
less /data/server/reference/gtf/gencode/gencode.v25.annotation.gtf|grep -w 'chrY'|cut -f 3|sort|uniq -c
# 1997 CDS
# 4641 exon
#  568 gene
#  217 start_codon
#  196 stop_codon
#  898 transcript
#  627 UTR
less /data/server/reference/gtf/gencode/gencode.v25.annotation.gtf|grep -w 'chrY'|cut -f 3|sort|uniq
#CDS
#exon
#gene
#start_codon
#stop_codon
#transcript
#UTR
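The commands above match `chrY`, `exon` and `CDS` as whole words anywhere in the line. A stricter variant, sketched here on a made-up three-line sample in GTF column order (this is not the gencode file), anchors the pattern on the tab-separated columns themselves. The variable names are illustrative.

```shell
# A literal tab character, since grep patterns do not understand "\t".
tab=$(printf '\t')
# Hypothetical sample: seqname, source, feature, start, end.
gtf=$(printf 'chrY\tHAVANA\texon\t100\t200\nchrY\tHAVANA\tgene\t100\t900\nchr1\tHAVANA\tCDS\t5\t50\n')

# Column 1 must be chrY and column 3 must be exon or CDS.
chrY_exon_cds=$(printf '%s\n' "$gtf" | grep -c -E "^chrY${tab}[^${tab}]*${tab}(exon|CDS)${tab}")
# Tally the feature types on chrY, printed as type:count.
chrY_types=$(printf '%s\n' "$gtf" | grep "^chrY${tab}" | cut -f 3 | sort | uniq -c | awk '{print $2 ":" $1}')
```

On a real file, grep can also read it directly (`grep PATTERN file`), which avoids the extra `less` or `cat` process.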