1. Matching the contents of two files with awk
### Joining two files with awk in the Linux shell - lxw的大数据田地
$ cat file_1.txt
rna16036
rna16036
rna20975
rna20975
rna20555
rna20555
rna20555
rna20555
rna6952
rna6952
rna30045
rna30045
rna17485
rna16034
$ cat file_2.txt
rna4 model_evidence=Supporting evidence includes similarity to: 88%25 coverage of the annotated genomic feature by RNAseq alignments product=myosin-9-like
rna5 model_evidence=Supporting evidence includes similarity to: 100%25 coverage of the annotated genomic feature by RNAseq alignments product=multiple epidermal growth factor-like domains protein 6
rna8 model_evidence=Supporting evidence includes similarity to: 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 9 samples with support for all annotated introns product=multiple epidermal growth factor-like domains protein 9%2C transcript variant X3
rna9 model_evidence=Supporting evidence includes similarity to: 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 8 samples with support for all annotated introns product=multiple epidermal growth factor-like domains protein 9%2C transcript variant X2
rna10 model_evidence=Supporting evidence includes similarity to: 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 22 samples with support for all annotated introns product=multiple epidermal growth factor-like domains protein 9%2C transcript variant X1
rna11 model_evidence=Supporting evidence includes similarity to: 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 4 samples with support for all annotated introns product=laminin-like protein epi-1%2C transcript variant X2
rna12 model_evidence=Supporting evidence includes similarity to: 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 4 samples with support for all annotated introns product=laminin-like protein epi-1%2C transcript variant X1
rna13 model_evidence=Supporting evidence includes similarity to: 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 5 samples with support for all annotated introns product=laminin-like protein epi-1%2C transcript variant X4
rna14 model_evidence=Supporting evidence includes similarity to: 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 21 samples with support for all annotated introns product=laminin-like protein epi-1%2C transcript variant X3
rna15 model_evidence=Supporting evidence includes similarity to: 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 12 samples with support for all annotated introns product=laminin-like protein epi-1%2C transcript variant X5
rna16 model_evidence=Supporting evidence includes similarity to: 3 Proteins%2C and 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 10 samples with support for all annotated introns product=shematrin-like protein 2%2C transcript variant X1
rna17 model_evidence=Supporting evidence includes similarity to: 3 Proteins%2C and 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 8 samples with support for all annotated introns product=shematrin-like protein 2%2C transcript variant X2
rna18 model_evidence=Supporting evidence includes similarity to: 1 Protein%2C and 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 4 samples with support for all annotated introns product=adult-specific rigid cuticular protein 15.5-like
rna19 model_evidence=Supporting evidence includes similarity to: 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 14 samples with support for all annotated introns product=laminin subunit beta-4-like
rna20 model_evidence=Supporting evidence includes similarity to: 3 Proteins%2C and 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 9 samples with support for all annotated introns product=probable peroxisomal membrane protein PEX13%2C transcript variant X2
rna21 model_evidence=Supporting evidence includes similarity to: 3 Proteins%2C and 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 12 samples with support for all annotated introns product=probable peroxisomal membrane protein PEX13%2C transcript variant X1
rna22 model_evidence=Supporting evidence includes similarity to: 1 Protein%2C and 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 5 samples with support for all annotated introns product=adult-specific rigid cuticular protein 15.7-like
rna23 model_evidence=Supporting evidence includes similarity to: 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 19 samples with support for all annotated introns product=delta-like protein 1
rna24 model_evidence=Supporting evidence includes similarity to: 4 Proteins%2C and 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 22 samples with support for all annotated introns product=glycine-rich cell wall structural protein 1.8%2C transcript variant X1
rna25 model_evidence=Supporting evidence includes similarity to: 4 Proteins%2C and 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 4 samples with support for all annotated introns product=glycine-rich cell wall structural protein 1.8%2C transcript variant X2
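... (many more lines)
For small files you can do this lookup with VLOOKUP in Excel, but when the files are too large you need a command to match the two files. The command below first reads file_2.txt into an array keyed by its first column, then prints every line of file_1.txt followed by the matching annotation. Because of -F'\t', file_2.txt is expected to be tab-separated: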
awk -F'\t' 'NR==FNR{a[$1]=$2;}NR!=FNR{print $0,a[$1]}' file_2.txt file_1.txt >file_3.txt
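To see the same idea end to end, here is a minimal sketch on two tiny files. The file names (ids.txt, annot.tsv, joined.tsv), the scratch directory, and the ID rna99 are made up for illustration; the two annotations are shortened from the listing above. Unlike the one-liner, this variant writes tab-separated output and prints NA for IDs that have no annotation, which is an assumption about what you would want, not part of the original command.

```shell
# Scratch directory so no real files are touched (hypothetical names throughout).
workdir=$(mktemp -d)

# IDs to look up; rna99 is deliberately absent from the annotation table.
printf 'rna23\nrna4\nrna23\nrna99\n' > "$workdir/ids.txt"

# Tab-separated annotation table: ID, then annotation.
printf 'rna4\tproduct=myosin-9-like\nrna23\tproduct=delta-like protein 1\n' > "$workdir/annot.tsv"

# While reading the first file, NR == FNR holds: store the annotation and move on.
# For the second file, print the ID with its annotation, or NA if the ID is unknown.
awk -F'\t' -v OFS='\t' '
    NR == FNR { a[$1] = $2; next }
    { print $1, ($1 in a ? a[$1] : "NA") }
' "$workdir/annot.tsv" "$workdir/ids.txt" > "$workdir/joined.tsv"

cat "$workdir/joined.tsv"
```

The `next` statement replaces the `NR!=FNR` test: once a line of the first file has been stored, awk skips straight to the next input line, so the second rule only ever sees lines of the second file.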
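For more commands that match file contents, see:
### The Linux grep command - 风生水起 - 博客园
### Linux egrep: searching files for a given string, explained - Linux运维日志
### Matching the contents of two files with grep - shishui07的博客 - CSDN博客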
2. Copying files from one server to another with scp
# Copy a file
scp local_file remote_username@remote_ip:remote_folder
# Copy a folder/directory
scp -r local_folder remote_username@remote_ip:remote_folder
3. Extracting the lines that contain a string with grep
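grep -r 'CYP' file1 >file2 # -r searches directories recursively; it makes no difference for a single file
grep -r -i 'CYP' file1 >file2 # -i makes the match case-insensitive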
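As a quick illustration of the difference `-i` makes, the following sketch runs both variants on a three-line sample. The scratch directory, the output file names, and the sample contents are invented for the demo; `-c` is a standard grep option that counts matching lines.

```shell
# Hypothetical sample file; names and contents are made up for illustration.
workdir=$(mktemp -d)
printf 'rna1\tproduct=CYP450-like\nrna2\tproduct=myosin\nrna3\tproduct=cyp2-like\n' > "$workdir/file1"

# Case-sensitive: keeps only the line containing "CYP".
grep 'CYP' "$workdir/file1" > "$workdir/upper.txt"

# Case-insensitive: also keeps the line containing "cyp".
grep -i 'CYP' "$workdir/file1" > "$workdir/any_case.txt"

# -c prints the number of matching lines instead of the lines themselves.
grep -c -i 'CYP' "$workdir/file1"
```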
4. Checking file sizes with du
du -sh # total size of the current directory, in human-readable units
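With no argument, `du -sh` reports on the current directory; you can also pass paths. The sketch below builds a throwaway directory (all names are hypothetical) and shows a per-directory summary plus a per-entry listing sorted by size.

```shell
# Hypothetical scratch directory with a small and a 1 MiB file, created only for this demo.
workdir=$(mktemp -d)
mkdir "$workdir/data"
printf 'hello\n' > "$workdir/data/small.txt"
dd if=/dev/zero of="$workdir/data/big.bin" bs=1024 count=1024 2>/dev/null

# -s: one summary line per argument; -h: human-readable units.
du -sh "$workdir/data"

# Size of each entry inside the directory in KiB (-k), smallest first.
du -sk "$workdir/data"/* | sort -n
```

The reported numbers are disk usage, so they can differ slightly from the byte counts that `ls -l` shows.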
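5. Moving a foreground job to the background
# Start the job first
# Press Ctrl+Z # this suspends the job and prints its job number; suppose the number is 1
bg 1 # resume job 1 in the background (bg %1 is equivalent)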
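Ctrl+Z needs an interactive terminal, so it cannot be reproduced in a script. As a related sketch rather than the same procedure, the example below starts a job in the background from the outset with `&`, then uses `jobs` and `wait` to track it. The task itself (a short sleep that writes done.txt) and the scratch directory are hypothetical.

```shell
# Hypothetical long-running task: sleep briefly, then write a marker file.
workdir=$(mktemp -d)
( sleep 1; echo finished > "$workdir/done.txt" ) &

# $! holds the process ID of the most recently started background job.
pid=$!
echo "started background job with PID $pid"

# jobs lists the background jobs of the current shell.
jobs

# wait blocks until that job exits; afterwards the marker file exists.
wait "$pid"
cat "$workdir/done.txt"
```

If you already know a command will take a long time, appending `&` saves the Ctrl+Z and `bg` steps.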