22. The Three Musketeers of Text Processing: awk

Author: 随便写写咯 | Published: 2022-09-19 23:03

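    1. AWK Basics

    1.1 How AWK Works and Basic Usage

    AWK is named after its authors: Aho, Weinberger, and Kernighan. It is a report generator that produces formatted text output. The AWK shipped with GNU/Linux is currently developed and maintained by the Free Software Foundation (FSF) and is usually called GNU AWK

    Several versions exist:

    • AWK: the original AWK from AT&T Bell Labs
    • NAWK: New awk, the upgraded version of AWK from AT&T Bell Labs
    • GAWK: GNU AWK. Many GNU/Linux distributions ship GAWK (some default to another implementation such as mawk). It is compatible with AWK and NAWK

    gawk is a pattern scanning and processing language. It can be used for:

    • Text processing
    • Producing formatted text reports
    • Arithmetic
    • String manipulation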

    Syntax:

    awk [options]   'program' var=value   file…
    awk [options]   -f programfile var=value file…
    

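    Notes:

    The program is usually enclosed in single quotes and can consist of three kinds of parts:

    • A BEGIN block
    • General blocks with pattern matching
    • An END block

    Syntax:

    awk [options] 'BEGIN{BEGIN action} pattern{text-processing action} END{END action}' file-path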
    

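    Common options:

    • -F "separator": sets the input field separator. By default, fields are separated by runs of whitespace
    • -v var=value: assigns a variable. This works for both built-in and user-defined variables

    Program format:

    pattern{action statements;..}
    

    pattern: decides when the action statements run, for example BEGIN, END, or a regular expression

    If the pattern is omitted, the action applies to every line

    action statements: process the data and are written inside {}. Common ones: print, printf

    If the action is omitted, the default action is print $0, so the whole matching line is printed

    Example: omitting the pattern and the action. With no action, the expression in the program must evaluate to true (a non-zero number or a non-empty string); otherwise nothing is printed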

    [root@demo-c8 ~]# awk '' /etc/fstab 
    
    [root@demo-c8 ~]# awk '0' /etc/fstab 
    [root@demo-c8 ~]# awk '1' /etc/fstab 
    
    #
    # /etc/fstab
    # Created by anaconda on Mon Aug 15 16:52:19 2022
    #
    # Accessible filesystems, by reference, are maintained under '/dev/disk/'.
    # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
    #
    # After editing this file, run 'systemctl daemon-reload' to update systemd
    # units generated from this file.
    #
    UUID=b1ab1ace-2582-4afd-8693-39bd9855041c /                       xfs     defaults        0 0
    UUID=d5131695-82b3-4a23-bc28-5c8a4bf381a0 /boot                   ext4    defaults        1 2
    UUID=bdd66510-e510-4fe7-ba71-e2a35e6dc492 /data                   xfs     defaults        0 0
    UUID=05c944fb-d6f9-4544-ba10-8b7bf3cc8fed swap                    swap    defaults        0 0
    
    [root@demo-c8 ~]# awk '"hello"' /etc/fstab 
    
    #
    # /etc/fstab
    # Created by anaconda on Mon Aug 15 16:52:19 2022
    #
    # Accessible filesystems, by reference, are maintained under '/dev/disk/'.
    # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
    #
    # After editing this file, run 'systemctl daemon-reload' to update systemd
    # units generated from this file.
    #
    UUID=b1ab1ace-2582-4afd-8693-39bd9855041c /                       xfs     defaults        0 0
    UUID=d5131695-82b3-4a23-bc28-5c8a4bf381a0 /boot                   ext4    defaults        1 2
    UUID=bdd66510-e510-4fe7-ba71-e2a35e6dc492 /data                   xfs     defaults        0 0
    UUID=05c944fb-d6f9-4544-ba10-8b7bf3cc8fed swap                    swap    defaults        0 0
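
    The same behavior can be checked on a few lines of inline text. This is a minimal sketch: the input lines and the shell variable names (input, all, none, odd) are made up for illustration.

```shell
# Sample input is made up for illustration.
input='alpha
beta
gamma'

# A true pattern with no action prints every record.
all=$(printf '%s\n' "$input" | awk '1')

# A false pattern (0) prints nothing.
none=$(printf '%s\n' "$input" | awk '0')

# Any expression can serve as the pattern: NR % 2 is non-zero on odd-numbered records.
odd=$(printf '%s\n' "$input" | awk 'NR % 2')

printf 'all:\n%s\nnone: [%s]\nodd:\n%s\n' "$all" "$none" "$odd"
```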
    

    How awk processes its input:

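    Step 1: run the statements in the BEGIN{action;… } block

    Step 2: read one line from the file or from standard input (stdin) and run the pattern{ action;… } blocks against it. This repeats line by line, from the first line to the last, until all input has been read

    Step 3: at the end of the input stream, run the END{action;…} block

    The BEGIN block runs before awk reads any line from the input stream. It is optional, and it is the usual place for statements such as variable initialization or printing a table header

    The END block runs after awk has read every line from the input stream. It is also optional, and it is the usual place for summaries such as the results computed over all lines

    The general pattern{action} blocks are the most important part, and they are optional too. They run for every line awk reads. If a pattern is given without an action block, awk runs { print } by default, which prints the current line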
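
    The three steps above can be sketched with a tiny report. The three input numbers and the variable names (report, total) are assumptions chosen for the demonstration.

```shell
# Three made-up input lines stand in for a real file.
report=$(printf '10\n20\n30\n' | awk '
BEGIN { print "value"; total = 0 }   # runs once, before any input is read
      { print $1; total += $1 }      # runs once per input record
END   { print "total=" total }       # runs once, after the last record
')
printf '%s\n' "$report"
```

    The header comes from BEGIN, the three values from the main block, and the sum from END.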

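    Separators, fields, and records:

    awk treats the input file or standard input as a table. By default \n separates one line from the next, but the separator can be changed. For example, with ; as the separator, the text before a ; is one line and the text after it is the next

    • The pieces produced by the field separator are called fields (columns), and $1,$2...$n are the field identifiers. $0 is the whole line. Note: this $ does not mean the same thing as $ in Shell variables
    $1: the first column
    $2: the second column
    ...
    $0: all columns
    
    • Each line of the file is called a record
    • If the action is omitted, awk runs print $0 by default, which prints all columns of the line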
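
    As a small illustration of fields versus the whole record, the sketch below splits one made-up line. The variable names (line, fields, whole) are hypothetical.

```shell
# One sample record with uneven spacing between the columns.
line='one   two  three'

# $1, $2, $3 are the individual fields; runs of spaces count as one separator.
fields=$(printf '%s\n' "$line" | awk '{ print $1 "|" $2 "|" $3 }')

# $0 is the record exactly as it was read, original spacing included.
whole=$(printf '%s\n' "$line" | awk '{ print $0 }')

printf '%s\n%s\n' "$fields" "$whole"
```

    Because the awk program sits inside single quotes, the shell does not expand $1 or $0; awk sees them as field references.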

    Common kinds of actions:

    • Output statements: print, printf
    • Expressions: arithmetic, comparison, and so on
    • Compound statements
    • Control statements: if, while, and so on
    • Input statements

    awk control statements:

    • { statements;… } compound statement
    • if(condition) {statements;…}
    • if(condition) {statements;…} else {statements;…}
    • while(condition) {statements;…}
    • do {statements;…} while(condition)
    • for(expr1;expr2;expr3) {statements;…}
    • break
    • continue
    • exit
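
    A few of the control statements listed above can be combined in one short program. This is only a sketch: the input numbers, the threshold of 5, and the names size and stars are invented for the example.

```shell
result=$(printf '3\n8\n' | awk '{
    # if / else picks a label for the number in the first field
    if ($1 > 5) { size = "big" } else { size = "small" }

    # for builds a bar of stars; break caps it at four characters
    stars = ""
    for (i = 1; i <= $1; i++) {
        if (i > 4) break
        stars = stars "*"
    }
    print $1, size, stars
}')
printf '%s\n' "$result"
```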

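    1.2 The print Action

    Syntax:

    print item1, item2, ...
    

    Notes:

    • Items are separated by commas
    • An item can be a string or a number: a field of the current record, a variable, or an awk expression
    • With no items, print is equivalent to print $0
    • Literal text must be quoted with ""; variables and numbers are not quoted
    abc: a variable
    "abc": a literal string
    

    Example: with no items, print outputs each line that awk reads from standard input, that is, the whole line, $0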

    root@u18:~# awk '{print}'
    aa
    aa
    bb
    bb
    cc
    cc
    
    
    root@u18:~# cat /etc/fstab | awk '{print}'
    # /etc/fstab: static file system information.
    #
    # Use 'blkid' to print the universally unique identifier for a
    # device; this may be used with UUID= as a more robust way to name devices
    # that works even if disks are added and removed. See fstab(5).
    #
    # <file system> <mount point>   <type>  <options>       <dump>  <pass>
    # / was on /dev/sda1 during installation
    UUID=f906b5aa-3e5b-4e12-8f11-55f55c41e1b0 /               ext4    errors=remount-ro 0       1
    # /boot was on /dev/sda2 during installation
    UUID=24239793-a342-4d7f-8773-e7381727a5dd /boot           ext4    defaults        0       2
    # /data was on /dev/sda4 during installation
    UUID=a328accf-9575-4343-976b-751c27cdb8ec /data           ext4    defaults        0       2
    # swap was on /dev/sda5 during installation
    UUID=0f94202e-4796-4835-b329-75425a807dcd none            swap    sw              0       0
    
    root@u18:~# awk '{print}' < /etc/fstab 
    # /etc/fstab: static file system information.
    #
    # Use 'blkid' to print the universally unique identifier for a
    # device; this may be used with UUID= as a more robust way to name devices
    # that works even if disks are added and removed. See fstab(5).
    #
    # <file system> <mount point>   <type>  <options>       <dump>  <pass>
    # / was on /dev/sda1 during installation
    UUID=f906b5aa-3e5b-4e12-8f11-55f55c41e1b0 /               ext4    errors=remount-ro 0       1
    # /boot was on /dev/sda2 during installation
    UUID=24239793-a342-4d7f-8773-e7381727a5dd /boot           ext4    defaults        0       2
    # /data was on /dev/sda4 during installation
    UUID=a328accf-9575-4343-976b-751c27cdb8ec /data           ext4    defaults        0       2
    # swap was on /dev/sda5 during installation
    UUID=0f94202e-4796-4835-b329-75425a807dcd none            swap    sw              0       0
    

    Example: awk can print the entire contents of a file directly

    root@u18:~# awk '{print}' /etc/fstab 
    # /etc/fstab: static file system information.
    #
    # Use 'blkid' to print the universally unique identifier for a
    # device; this may be used with UUID= as a more robust way to name devices
    # that works even if disks are added and removed. See fstab(5).
    #
    # <file system> <mount point>   <type>  <options>       <dump>  <pass>
    # / was on /dev/sda1 during installation
    UUID=f906b5aa-3e5b-4e12-8f11-55f55c41e1b0 /               ext4    errors=remount-ro 0       1
    # /boot was on /dev/sda2 during installation
    UUID=24239793-a342-4d7f-8773-e7381727a5dd /boot           ext4    defaults        0       2
    # /data was on /dev/sda4 during installation
    UUID=a328accf-9575-4343-976b-751c27cdb8ec /data           ext4    defaults        0       2
    # swap was on /dev/sda5 during installation
    UUID=0f94202e-4796-4835-b329-75425a807dcd none            swap    sw              0       0
    

    Example: printing a fixed string. When print is given a string literal, it prints that text once for every input line

    root@u18:~# awk '{print "hello awk"}'
    aaa
    hello awk
    vvv
    hello awk
    ccc
    hello awk
    
    
    # seq 10 supplies 10 input lines, so awk prints the fixed string 10 times
    root@u18:~# seq 10 | awk '{print "hello awk"}'
    hello awk
    hello awk
    hello awk
    hello awk
    hello awk
    hello awk
    hello awk
    hello awk
    hello awk
    hello awk
    

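    Example: the field separator. By default, fields are separated by runs of whitespace, so several consecutive spaces count as a single separator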

    root@u18:~# df | awk '{print $5}' 
    Use%
    0%
    1%
    3%
    0%
    0%
    0%
    9%
    1%
    0%
    
    root@u18:~# df | awk '{print $5 }' | awk -F'%' '{print $1}'
    Use
    0
    1
    3
    0
    0
    0
    9
    1
    0
    

    Example: a custom field separator with the -F option

    root@u18:~# awk -F":" '{print $1,$3}' /etc/passwd
    root 0
    daemon 1
    bin 2
    sys 3
    sync 4
    games 5
    ...
    

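    Example: fields listed with commas in print are joined by a space by default. A different separator can be specified

    # In print, the new separator must be enclosed in double quotes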
    root@u18:~# awk -F':' '{print $1":"$3}' /etc/passwd
    root:0
    daemon:1
    bin:2
    sys:3
    sync:4
    games:5
    ...
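
    The difference between a comma, plain juxtaposition, and an explicit separator string in print can be seen side by side. A minimal sketch on one made-up record; the variable names are hypothetical.

```shell
rec='root:x:0:0'

# A comma between items inserts the output field separator (a space by default).
comma=$(printf '%s\n' "$rec" | awk -F: '{ print $1, $3 }')

# Items written next to each other are concatenated with nothing in between.
glued=$(printf '%s\n' "$rec" | awk -F: '{ print $1 $3 }')

# A quoted string between the items acts as a hand-written separator.
custom=$(printf '%s\n' "$rec" | awk -F: '{ print $1 ":" $3 }')

printf '%s\n%s\n%s\n' "$comma" "$glued" "$custom"
```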
    

    Example: using a Tab as the output separator

    root@u18:~# awk -F: '{print $1"\t"$3}' /etc/passwd
    root    0
    daemon  1
    bin 2
    sys 3
    sync    4
    games   5
    man 6
    lp  7
    mail    8
    news    9
    ...
    

    Example: find the 5 IP addresses that access a website most often

    root@u18:~# sed -nr '1,5p' access_log 
    172.18.118.91 - - [20/May/2018:08:09:59 +0800] "GET / HTTP/1.1" 200 912 "-" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 5.1; Trident/5.0)"
    172.18.118.91 - - [20/May/2018:08:09:59 +0800] "POST /webnoauth/model.cgi HTTP/1.1" 404 293 "http://172.18.0.1/webnoauth/model.cgi" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 5.1; Trident/5.0)"
    172.18.118.91 - - [20/May/2018:08:09:59 +0800] "GET /router/get_rand_key.cgi HTTP/1.1" 404 297 "http://172.18.0.1/router/get_rand_key.cgi" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 5.1; Trident/5.0)"
    172.18.118.91 - - [20/May/2018:08:09:59 +0800] "GET /router/get_rand_key.cgi HTTP/1.1" 404 297 "http://172.18.0.1/router/get_rand_key.cgi" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 5.1; Trident/5.0)"
    172.18.118.91 - - [20/May/2018:08:09:59 +0800] "GET /router/get_rand_key.cgi HTTP/1.1" 404 297 "http://172.18.0.1/router/get_rand_key.cgi" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 5.1; Trident/5.0)"
    
    root@u18:~# awk '{print $1}' access_log | sort  | uniq -c | sort -k1 -nr | head -n5 
       4870 172.20.116.228
       3429 172.20.116.208
       2834 172.20.0.222
       2613 172.20.112.14
       2267 172.20.0.227
    

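    Example: several separators can be specified at once. Each of them then splits fields, which avoids processing the text several times just because the separators differ

    # [[:space:]]+|%: an extended regular expression meaning that one or more whitespace characters, or a %, act as the separator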
    root@u18:~# df | awk -F"[[:space:]]+|%" '{print $5}'
    Use
    0
    1
    3
    0
    0
    0
    9
    1
    0
    

    Example: the file host_list.org has the following format. Extract the host name in front of ".xxx.com" and write it back to the same file (appended here)

    1 www.xxx.com
    2 blog.xxx.com
    3 study.xxx.com
    4 linux.xxx.com
    5 python.xxx.com
    
    root@u18:/opt# awk -F"[ .]" '{print $2}' host_list.org  
    www
    blog
    study
    linux
    python
    root@u18:/opt# awk -F"[ .]" '{print $2}' host_list.org  >> host_list.org
    root@u18:/opt# cat host_list.org
    1 www.xxx.com
    2 blog.xxx.com
    3 study.xxx.com
    4 linux.xxx.com
    5 python.xxx.com
    www
    blog
    study
    linux
    python
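
    Reading a file and appending to it in the same command can misbehave on larger inputs, because awk may end up reading lines it has just written. A more cautious variant, sketched below, collects the result in a temporary file first. The scratch directory, the two sample lines, and the variable names are assumptions for the demonstration.

```shell
workdir=$(mktemp -d)                 # scratch directory for the demo
file="$workdir/host_list.org"
printf '1 www.xxx.com\n2 blog.xxx.com\n' > "$file"

# Write the extracted names to a temporary file first, then append them.
tmp=$(mktemp)
awk -F'[ .]' '{ print $2 }' "$file" > "$tmp"
cat "$tmp" >> "$file"
rm -f "$tmp"

result=$(cat "$file")
printf '%s\n' "$result"
rm -rf "$workdir"
```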
    

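    1.3 AWK Variables

    awk variables are either built-in or user-defined

    Defining a variable:

    -v var=value    assigns a variable; works for both built-in and user-defined variables
    

    Referencing an awk variable:

    -v FS=":"
    Inside the program, an awk variable is referenced without $: write FS, not $FS
    

    Common built-in variables:

    • FS (Field Separator): the input field separator. The default is a single space, which stands for runs of whitespace. It does the same job as -F. Because -F also sets FS, setting both in one command is redundant and should be avoided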

    Example: setting FS to ":"

    root@u18:/opt# awk -v FS=":" '{print $1,$2}' /etc/passwd
    root x
    daemon x
    bin x
    sys x
    sync x
    games x
    ...
    

    Example: the input separator variable FS can also be used as the separator in the output

    root@u18:/opt# awk -v FS=":" '{print $1FS$2}' /etc/passwd
    root:x
    daemon:x
    bin:x
    sys:x
    sync:x
    games:x
    ...
    

    Example: awk can also take values from Shell variables (quote the expansion so that the value is passed intact)

    root@u18:/opt# var=":"; awk -v FS="$var" '{print $1FS$2}' /etc/passwd
    root:x
    daemon:x
    bin:x
    sys:x
    sync:x
    games:x
    ...
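
    Quoting matters as soon as the shell value contains a space. The sketch below passes two shell variables with -v; the names sep and label and the sample passwd line are made up for illustration.

```shell
sep=':'
label='home dir'    # contains a space, so the expansion must be quoted

# Each -v assignment is one shell word thanks to the double quotes.
out=$(printf 'root:x:0:0:root:/root:/bin/bash\n' |
      awk -v FS="$sep" -v label="$label" '{ print label "=" $6 }')

printf '%s\n' "$out"
```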
    
    • OFS: the output field separator. The default is a space

    Example: setting OFS to "======"

    root@u18:/opt# awk -F":" -v  OFS="======" '{print $1,$2}' /etc/passwd
    root======x
    daemon======x
    bin======x
    sys======x
    sync======x
    games======x
    ...
    
    root@u18:/opt# awk -v FS=":" -v  OFS="======" '{print $1, $2}' /etc/passwd
    root======x
    daemon======x
    bin======x
    sys======x
    ...
    
    • RS: the input record separator, that is, the character that ends a line of input. The default is \n

    Example: using ";" as the record separator

    root@u18:/opt# vim f1.txt
    a,b,c;11,22
    33,44;xx,yy,zz
    m,n;xxx
    
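    # With the semicolon as the record separator, a,b,c is one record, 11,22 plus 33,44 is one record, xx,yy,zz plus m,n is one record, and xxx is one record
    # The newlines that follow 22, zz, and xxx in the file are kept inside the records, which is why the output breaks there again
    root@u18:/opt# awk -v RS=";" '{print $0}' f1.txt 
    a,b,c
    11,22  # in the file, 22 is followed by a newline, which is kept
    33,44
    xx,yy,zz
    m,n
    xxx # xxx is followed by a newline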
    
    root@u18:/opt# 
    
    • NR: the number of the current record

    Example: NR shows the record number, which helps tell awk records apart from the newlines already present in the text

    root@u18:/opt# awk -v RS=";" '{print NR,$0}' f1.txt 
    1 a,b,c  # record1
    2 11,22 # record 2
    33,44
    3 xx,yy,zz # record3
    m,n
    4 xxx # record4
    
    root@u18:/opt# 
    
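    • ORS: the output record separator, written after each record in place of a newline. The default is a newline

    Example: using "+++" as the output record separator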

    root@u18:/opt# awk -v RS=";" -v ORS="+++" '{print NR,$0}' f1.txt 
    1 a,b,c+++2 11,22
    33,44+++3 xx,yy,zz
    m,n+++4 xxx
    +++root@u18:/opt#
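
    The stray line breaks in the output above come from the newlines stored in f1.txt. As a sketch of the same idea on input without any newlines, the pipeline below feeds awk a single made-up string; the variable name joined is hypothetical.

```shell
# printf writes no trailing newline here, so no record contains one.
joined=$(printf 'a,b,c;11,22;xx,yy' | awk -v RS=';' -v ORS='+' '{ print NR ":" $0 }')

printf '%s\n' "$joined"
```

    The last record is processed even though no ; follows it, and ORS is written after every record, including the last one.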
    
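    • NF: the number of fields

    Example: NF is the number of fields in each line after it is split by the input field separator. One value is printed per line, so the output has as many lines as the input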

    root@u18:/opt# awk -F":" '{print NF}' /etc/passwd
    7
    7
    7
    7
    7
    7
    7
    7
    7
    7
    7
    7
    7
    ...
    

    $NF: the last field

    root@u18:/opt# awk -F":" '{print $NF}' /etc/passwd
    /bin/bash
    /usr/sbin/nologin
    /usr/sbin/nologin
    /usr/sbin/nologin
    ...
    

    $(NF-1): the second-to-last field

    root@u18:/opt# cat /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
    bin:x:2:2:bin:/bin:/usr/sbin/nologin
    sys:x:3:3:sys:/dev:/usr/sbin/nologin
    sync:x:4:65534:sync:/bin:/bin/sync
    games:x:5:60:games:/usr/games:/usr/sbin/nologin
    man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
    ...
    
    root@u18:/opt# awk -F":" '{print $(NF-1)}' /etc/passwd
    /root
    /usr/sbin
    /bin
    /dev
    /bin
    /usr/games
    /var/cache/man
    /var/spool/lpd
    /var/mail
    /var/spool/news
    /var/spool/uucp
    /bin
    ...
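
    NF is recomputed for every record, so $NF and $(NF-1) follow the end of each line even when the lines have different lengths. A minimal sketch on made-up input; the variable names out and pair are hypothetical.

```shell
# Two records with 3 and 5 fields.
out=$(printf 'a:b:c\nx:y:z:w:v\n' | awk -F: '{ print NF, $NF, $(NF-1) }')

# Without parentheses, $NF-1 means "last field minus one", not "second-to-last field".
pair=$(printf '1:2:10\n' | awk -F: '{ print $NF-1, $(NF-1) }')

printf '%s\n%s\n' "$out" "$pair"
```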
    

    Example: extracting the version from source package names

    root@u18:/opt# tar cvf app.v1.tar.gz /etc
    root@u18:/opt# tar cvf app.v2.tar.gz /etc
    root@u18:/opt# tar cvf app.v3.tar.gz /etc
    root@u18:/opt# tar cvf app.v4.tar.gz /etc
    root@u18:/opt# tar cvf app.v5.tar.gz /etc
    root@u18:/opt# ll app*
    -rw-r--r-- 1 root root 3225600 Sep 17 22:06 app.v1.tar.gz
    -rw-r--r-- 1 root root 3225600 Sep 17 22:09 app.v2.tar.gz
    -rw-r--r-- 1 root root 3225600 Sep 17 22:09 app.v3.tar.gz
    -rw-r--r-- 1 root root 3225600 Sep 17 22:09 app.v4.tar.gz
    -rw-r--r-- 1 root root 3225600 Sep 17 22:09 app.v5.tar.gz
    
    # With "." as the input separator, the third field from the end is the version
    
    root@u18:/opt# ls app.* | xargs -n1 |  awk -F"." '{print $(NF-2)}'
    v1
    v2
    v3
    v4
    v5
    
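    # xargs is not actually needed here: ls lays names out side by side only on a terminal; when its output goes to a pipe, it writes one name per line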
    
    root@u18:/opt# ls app.* |  awk -F"." '{print $(NF-2)}'
    v1
    v2
    v3
    v4
    v5
    
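    • FNR: the record number counted separately for each file. NR, by contrast, keeps counting across all files

    Example: using FNR to number the records of each file separately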

    root@u18:/opt# awk -F":" '{print FNR,$1}' /etc/passwd /etc/group
    1 root
    2 daemon
    3 bin
    4 sys
    5 sync
    6 games
    7 man
    8 lp
    9 mail
    10 news
    11 uucp
    12 proxy
    13 www-data
    14 backup
    15 list
    16 irc
    17 gnats
    18 nobody
    19 systemd-network
    20 systemd-resolve
    21 syslog
    22 messagebus
    23 _apt
    24 lxd
    25 uuidd
    26 dnsmasq
    27 landscape
    28 sshd
    29 pollinate
    30 david
    1 root
    2 daemon
    3 bin
    4 sys
    5 adm
    6 tty
    7 disk
    8 lp
    9 mail
    10 news
    11 uucp
    12 man
    13 proxy
    14 kmem
    15 dialout
    16 fax
    17 voice
    18 cdrom
    19 floppy
    20 tape
    21 sudo
    22 audio
    23 dip
    24 www-data
    25 backup
    26 operator
    27 list
    28 irc
    29 src
    30 gnats
    31 shadow
    32 utmp
    33 video
    34 sasl
    35 plugdev
    36 staff
    37 games
    38 users
    39 nogroup
    40 systemd-journal
    41 systemd-network
    42 systemd-resolve
    43 input
    44 crontab
    45 syslog
    46 messagebus
    47 lxd
    48 mlocate
    49 uuidd
    50 ssh
    51 landscape
    52 david
    53 lpadmin
    54 sambashare
    
    • FILENAME: the name of the current file. Combined with FNR, each record number is labeled with the file it belongs to
    root@u18:/opt# awk -F":" '{print FNR,$1,FILENAME}' /etc/passwd /etc/group
    1 root /etc/passwd
    2 daemon /etc/passwd
    3 bin /etc/passwd
    4 sys /etc/passwd
    5 sync /etc/passwd
    6 games /etc/passwd
    7 man /etc/passwd
    8 lp /etc/passwd
    9 mail /etc/passwd
    10 news /etc/passwd
    11 uucp /etc/passwd
    12 proxy /etc/passwd
    13 www-data /etc/passwd
    14 backup /etc/passwd
    15 list /etc/passwd
    16 irc /etc/passwd
    17 gnats /etc/passwd
    18 nobody /etc/passwd
    19 systemd-network /etc/passwd
    20 systemd-resolve /etc/passwd
    21 syslog /etc/passwd
    22 messagebus /etc/passwd
    23 _apt /etc/passwd
    24 lxd /etc/passwd
    25 uuidd /etc/passwd
    26 dnsmasq /etc/passwd
    27 landscape /etc/passwd
    28 sshd /etc/passwd
    29 pollinate /etc/passwd
    30 david /etc/passwd
    1 root /etc/group
    2 daemon /etc/group
    3 bin /etc/group
    4 sys /etc/group
    5 adm /etc/group
    6 tty /etc/group
    7 disk /etc/group
    8 lp /etc/group
    9 mail /etc/group
    10 news /etc/group
    11 uucp /etc/group
    12 man /etc/group
    13 proxy /etc/group
    14 kmem /etc/group
    15 dialout /etc/group
    16 fax /etc/group
    17 voice /etc/group
    18 cdrom /etc/group
    19 floppy /etc/group
    20 tape /etc/group
    21 sudo /etc/group
    22 audio /etc/group
    23 dip /etc/group
    24 www-data /etc/group
    25 backup /etc/group
    26 operator /etc/group
    27 list /etc/group
    28 irc /etc/group
    29 src /etc/group
    30 gnats /etc/group
    31 shadow /etc/group
    32 utmp /etc/group
    33 video /etc/group
    34 sasl /etc/group
    35 plugdev /etc/group
    36 staff /etc/group
    37 games /etc/group
    38 users /etc/group
    39 nogroup /etc/group
    40 systemd-journal /etc/group
    41 systemd-network /etc/group
    42 systemd-resolve /etc/group
    43 input /etc/group
    44 crontab /etc/group
    45 syslog /etc/group
    46 messagebus /etc/group
    47 lxd /etc/group
    48 mlocate /etc/group
    49 uuidd /etc/group
    50 ssh /etc/group
    51 landscape /etc/group
    52 david /etc/group
    53 lpadmin /etc/group
    54 sambashare /etc/group
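
    To compare NR, FNR, and FILENAME directly, the sketch below runs one program over two small temporary files. The file names first.txt and second.txt and their contents are invented for the demonstration.

```shell
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/first.txt"
printf 'c\nd\ne\n' > "$dir/second.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
counts=$(cd "$dir" && awk '{ print NR, FNR, FILENAME, $0 }' first.txt second.txt)

printf '%s\n' "$counts"
rm -rf "$dir"
```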
    
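    • ARGC: the number of command-line arguments

    Example: ARGC is the number of command-line arguments. Options and the program text are not counted: awk itself is the first, /etc/passwd is the second, and /etc/group is the third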

    root@u18:/opt# awk -F":" '{print ARGC,FNR,$1,FILENAME}' /etc/passwd /etc/group
    3 1 root /etc/passwd
    3 2 daemon /etc/passwd
    3 3 bin /etc/passwd
    3 4 sys /etc/passwd
    3 5 sync /etc/passwd
    3 6 games /etc/passwd
    3 7 man /etc/passwd
    3 8 lp /etc/passwd
    3 9 mail /etc/passwd
    3 10 news /etc/passwd
    3 11 uucp /etc/passwd
    3 12 proxy /etc/passwd
    3 13 www-data /etc/passwd
    3 14 backup /etc/passwd
    3 15 list /etc/passwd
    3 16 irc /etc/passwd
    3 17 gnats /etc/passwd
    3 18 nobody /etc/passwd
    3 19 systemd-network /etc/passwd
    3 20 systemd-resolve /etc/passwd
    3 21 syslog /etc/passwd
    3 22 messagebus /etc/passwd
    3 23 _apt /etc/passwd
    3 24 lxd /etc/passwd
    3 25 uuidd /etc/passwd
    3 26 dnsmasq /etc/passwd
    3 27 landscape /etc/passwd
    3 28 sshd /etc/passwd
    3 29 pollinate /etc/passwd
    3 30 david /etc/passwd
    3 1 root /etc/group
    3 2 daemon /etc/group
    3 3 bin /etc/group
    3 4 sys /etc/group
    3 5 adm /etc/group
    3 6 tty /etc/group
    3 7 disk /etc/group
    3 8 lp /etc/group
    3 9 mail /etc/group
    3 10 news /etc/group
    3 11 uucp /etc/group
    3 12 man /etc/group
    3 13 proxy /etc/group
    3 14 kmem /etc/group
    3 15 dialout /etc/group
    3 16 fax /etc/group
    3 17 voice /etc/group
    3 18 cdrom /etc/group
    3 19 floppy /etc/group
    3 20 tape /etc/group
    3 21 sudo /etc/group
    3 22 audio /etc/group
    3 23 dip /etc/group
    3 24 www-data /etc/group
    3 25 backup /etc/group
    3 26 operator /etc/group
    3 27 list /etc/group
    3 28 irc /etc/group
    3 29 src /etc/group
    3 30 gnats /etc/group
    3 31 shadow /etc/group
    3 32 utmp /etc/group
    3 33 video /etc/group
    3 34 sasl /etc/group
    3 35 plugdev /etc/group
    3 36 staff /etc/group
    3 37 games /etc/group
    3 38 users /etc/group
    3 39 nogroup /etc/group
    3 40 systemd-journal /etc/group
    3 41 systemd-network /etc/group
    3 42 systemd-resolve /etc/group
    3 43 input /etc/group
    3 44 crontab /etc/group
    3 45 syslog /etc/group
    3 46 messagebus /etc/group
    3 47 lxd /etc/group
    3 48 mlocate /etc/group
    3 49 uuidd /etc/group
    3 50 ssh /etc/group
    3 51 landscape /etc/group
    3 52 david /etc/group
    3 53 lpadmin /etc/group
    3 54 sambashare /etc/group
    
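    • ARGV: an array holding the command-line arguments. Individual arguments are accessed as ARGV[0], ARGV[1], and so on. ARGV[0] is the awk command itself; an index beyond the number of arguments yields an empty string

    Example: ARGV[n] returns the n-th argument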

    root@u18:/opt# awk -F":" '{print ARGV[0]}' /etc/passwd /etc/group
    awk
    awk
    awk
    awk
    awk
    ...
    awk
    root@u18:/opt# awk -F":" '{print ARGV[1]}' /etc/passwd /etc/group
    /etc/passwd
    /etc/passwd
    /etc/passwd
    /etc/passwd
    /etc/passwd
    ...
    /etc/passwd
    root@u18:/opt# awk -F":" '{print ARGV[2]}' /etc/passwd /etc/group
    /etc/group
    /etc/group
    /etc/group
    /etc/group
    /etc/group
    ...
    /etc/group
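
    Since ARGC and ARGV do not change from record to record, they can be printed once from a BEGIN block instead of once per input line. A minimal sketch; the variable name args is hypothetical, and the exact text of ARGV[0] may differ between awk implementations.

```shell
# With only a BEGIN block, awk does not read the files named on the command line.
args=$(awk -F: 'BEGIN {
    print "ARGC=" ARGC
    for (i = 0; i < ARGC; i++) print i, ARGV[i]
}' /etc/passwd /etc/group)

printf '%s\n' "$args"
```

    The -F: option and the program text do not appear in ARGV; only the command name and the two file names do.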
    

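    User-defined variables (names are case-sensitive) can be defined in two ways:

    -v var=value 
    directly inside the program
    

    Example: printing a variable once before any input is processed, for instance to produce a table header. The contents of BEGIN{...} run once, before the text is processed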

    root@u18:/opt# awk 'BEGIN{test="hello,world"; print test}'
    hello,world
    
    root@u18:/opt# awk -v test="hello,awk" 'BEGIN{print test}'
    hello,awk
    
    root@u18:/opt# awk -v test="hello,awk" 'BEGIN{print test; test="hello,world";print test}'
    hello,awk
    hello,world
    

    Example: chained assignment works inside the awk program, but not outside it with -v

    root@u18:/opt# awk  'BEGIN{test1=test2="hello,world"; print test1; print test2; print "hello,awk"}'
    hello,world
    hello,world
    hello,awk
    

    Example: a single print statement can output several strings

    root@u18:/opt# awk  'BEGIN{test1=test2="hello,world"; print test1,test2,"hello,awk"}'
    hello,world hello,world hello,awk
    

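    Example: a variable that is referenced before it is assigned is empty

    root@u18:/opt# awk -F":" '{sex="male";print $1,sex,age;age=18}' /etc/passwd
    root male # age is empty for the first record: when print runs on the first line, age has not been assigned yet
    daemon male 18 # from the second line on, age has a value, because age=18 ran at the end of the first line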
    bin male 18
    sys male 18
    sync male 18
    games male 18
    man male 18
    lp male 18
    mail male 18
    news male 18
    uucp male 18
    proxy male 18
    www-data male 18
    backup male 18
    list male 18
    irc male 18
    gnats male 18
    nobody male 18
    systemd-network male 18
    systemd-resolve male 18
    syslog male 18
    messagebus male 18
    _apt male 18
    lxd male 18
    uuidd male 18
    dnsmasq male 18
    landscape male 18
    sshd male 18
    pollinate male 18
    david male 18
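
    An unassigned awk variable is not an error: it behaves as an empty string in string context and as 0 in numeric context. The sketch below shows both, and then one way to avoid the empty first line by assigning in BEGIN. The names unset_var, out, and fixed are hypothetical.

```shell
out=$(awk 'BEGIN {
    print "[" unset_var "]"      # empty string in string context
    print unset_var + 0          # 0 in numeric context
    print length(unset_var)      # length of the empty string
}')

# Assigning in BEGIN guarantees the value exists before the first record is printed.
fixed=$(printf 'root\ndaemon\n' | awk 'BEGIN { age = 18 } { print $1, age }')

printf '%s\n%s\n' "$out" "$fixed"
```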
    

    Example: the awk program can be saved in a file and loaded with the -f option

    root@u18:/opt# vim awk.txt
    {print $(NF-1)}        
    
    root@u18:/opt# awk -F":" -f awk.txt /etc/passwd
    /root
    /usr/sbin
    /bin
    /dev
    /bin
    /usr/games
    /var/cache/man
    /var/spool/lpd
    /var/mail
    /var/spool/news
    /var/spool/uucp
    /bin
    /var/www
    /var/backups
    /var/list
    /var/run/ircd
    /var/lib/gnats
    /nonexistent
    /run/systemd/netif
    /run/systemd/resolve
    /home/syslog
    /nonexistent
    /nonexistent
    /var/lib/lxd/
    /run/uuidd
    /var/lib/misc
    /var/lib/landscape
    /run/sshd
    /var/cache/pollinate
    /home/david
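
    A program file can hold several rules and comments, and it can set FS itself so that -F is not needed on the command line. A minimal sketch, assuming a temporary file stands in for awk.txt and one made-up passwd line stands in for /etc/passwd.

```shell
prog=$(mktemp)

# The quoted heredoc delimiter keeps the shell from expanding $1 and $NF.
cat > "$prog" <<'EOF'
# Print the login name and shell of each passwd-style record.
BEGIN { FS = ":" }
{ print $1, $NF }
EOF

out=$(printf 'root:x:0:0:root:/root:/bin/bash\n' | awk -f "$prog")
printf '%s\n' "$out"
rm -f "$prog"
```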
    

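    1.4 The printf Action

    printf produces formatted output

    Syntax:

    # written inside the 'program'
    printf "FORMAT", item1, item2,...
    

    Notes:

    • FORMAT is required
    • No newline is added automatically; the newline escape \n must be given explicitly
    • FORMAT needs one format specifier for each item that follows

    Format specifiers, matched one-to-one with the items:

    %c: a single character (a numeric argument is printed as the character with that code)
    %d, %i: decimal integer
    %e, %E: number in scientific notation
    %f: floating-point number
    %g, %G: number in scientific or floating-point notation, whichever is shorter
    %s: string
    %u: unsigned integer
    %%: a literal %
    

    Modifiers:

    #[.#] the first number sets the display width; the second # sets the precision after the decimal point, e.g. %3.1f
    -     left-align (the default is right-align), e.g. %-15s
    +     show the sign of a number, e.g. %+d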
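
    The specifiers and modifiers can be tried from a BEGIN block without any input file. This is a sketch with arbitrary sample values; the brackets are only there to make the padding visible.

```shell
out=$(awk 'BEGIN {
    printf "%c%c\n", 65, "hello"              # number -> character with that code; string -> first character
    printf "[%5d][%-5d][%+d]\n", 42, 42, 42   # width 5, left-aligned width 5, explicit sign
    printf "[%8.2f][%e]\n", 3.14159, 1234.5   # width 8 with 2 decimals, scientific notation
    printf "[%10s][%-10s]%%\n", "awk", "awk"  # right-aligned, left-aligned, literal percent sign
}')

printf '%s\n' "$out"
```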
    

    Examples:

    awk -F:   '{printf "%s",$1}' /etc/passwd
    awk -F:   '{printf "%s\n",$1}' /etc/passwd
    awk -F:   '{printf "%20s\n",$1}' /etc/passwd
    awk -F:   '{printf "%-20s\n",$1}' /etc/passwd
    awk -F:   '{printf "%-20s %10d\n",$1,$3}' /etc/passwd
    awk -F:   '{printf "Username: %s\n",$1}' /etc/passwd
    awk -F:   '{printf "Username: %sUID:%d\n",$1,$3}' /etc/passwd
    awk -F:   '{printf "Username: %25sUID:%d\n",$1,$3}' /etc/passwd
    awk -F:   '{printf "Username: %-25sUID:%d\n",$1,$3}' /etc/passwd
    
    • %s stands for the content of $1
    [19:20:08 root@centos7 ~]#awk -F: '{printf "%s", $1}' /etc/passwd
    rootbindaemonadmlpsyncshutdownhaltmailoperatorgamesftpnobodysystemd-networkdbuspolkitdsshdpostfixtcpdump[19:20:38 root@centos7 ~]#
    
    • The newline must be given explicitly with \n
    [19:20:03 root@centos7 ~]#awk -F: '{printf "%s\n", $1}' /etc/passwd
    root
    bin
    daemon
    adm
    lp
    sync
    shutdown
    halt
    mail
    operator
    games
    ftp
    nobody
    systemd-network
    dbus
    polkitd
    sshd
    postfix
    tcpdump
    
    [19:21:16 root@centos7 ~]#awk -F":" '{printf "%20s\n",$1}' /etc/passwd
                    root
                     bin
                  daemon
                     adm
                      lp
                    sync
                shutdown
                    halt
                    mail
                operator
                   games
                     ftp
                  nobody
         systemd-network
                    dbus
                 polkitd
                    sshd
                 postfix
                 tcpdump
    
    [19:22:15 root@centos7 ~]#awk -F":" '{printf "%-20s\n",$1}' /etc/passwd
    root                
    bin                 
    daemon              
    adm                 
    lp                  
    sync                
    shutdown            
    halt                
    mail                
    operator            
    games               
    ftp                 
    nobody              
    systemd-network     
    dbus                
    polkitd             
    sshd                
    postfix             
    tcpdump             
    
    [19:22:58 root@centos7 ~]#awk -F":" '{printf "Username: %s\n",$1}' /etc/passwd
    Username: root
    Username: bin
    Username: daemon
    Username: adm
    Username: lp
    Username: sync
    Username: shutdown
    Username: halt
    Username: mail
    Username: operator
    Username: games
    Username: ftp
    Username: nobody
    Username: systemd-network
    Username: dbus
    Username: polkitd
    Username: sshd
    Username: postfix
    Username: tcpdump
    
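    • When printing several columns, each needs its own format specifier, and a separator between the columns must be written in FORMAT; otherwise the columns run together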
    [19:25:32 root@centos7 ~]#awk -F":" '{printf "Username: %s | UID: %d\n", $1, $3}' /etc/passwd
    Username: root | UID: 0
    Username: bin | UID: 1
    Username: daemon | UID: 2
    Username: adm | UID: 3
    Username: lp | UID: 4
    Username: sync | UID: 5
    Username: shutdown | UID: 6
    Username: halt | UID: 7
    Username: mail | UID: 8
    Username: operator | UID: 11
    Username: games | UID: 12
    Username: ftp | UID: 14
    Username: nobody | UID: 99
    Username: systemd-network | UID: 192
    Username: dbus | UID: 81
    Username: polkitd | UID: 999
    Username: sshd | UID: 74
    Username: postfix | UID: 89
    Username: tcpdump | UID: 72
    
    [19:26:34 root@centos7 ~]#awk -F":" '{printf "Username: %-25s UID: %d\n", $1, $3}' /etc/passwd
    Username: root                      UID: 0
    Username: bin                       UID: 1
    Username: daemon                    UID: 2
    Username: adm                       UID: 3
    Username: lp                        UID: 4
    Username: sync                      UID: 5
    Username: shutdown                  UID: 6
    Username: halt                      UID: 7
    Username: mail                      UID: 8
    Username: operator                  UID: 11
    Username: games                     UID: 12
    Username: ftp                       UID: 14
    Username: nobody                    UID: 99
    Username: systemd-network           UID: 192
    Username: dbus                      UID: 81
    Username: polkitd                   UID: 999
    Username: sshd                      UID: 74
    Username: postfix                   UID: 89
    Username: tcpdump                   UID: 72
    
    [root@demo-c8 ~]# awk -F":"  '{printf "%-20s %s\n", $1,$3 }' /etc/passwd
    root                 0
    bin                  1
    daemon               2
    adm                  3
    lp                   4
    sync                 5
    shutdown             6
    halt                 7
    mail                 8
    operator             11
    games                12
    ftp                  14
    nobody               65534
    dbus                 81
    systemd-coredump     999
    systemd-resolve      193
    tss                  59
    polkitd              998
    geoclue              997
    rtkit                172
    pulse                171
    libstoragemgmt       996
    qemu                 107
    usbmuxd              113
    unbound              995
    rpc                  32
    gluster              994
    chrony               993
    setroubleshoot       992
    pipewire             991
    saslauth             990
    dnsmasq              984
    radvd                75
    clevis               983
    cockpit-ws           982
    cockpit-wsinstance   981
    sssd                 980
    flatpak              979
    colord               978
    gdm                  42
    rpcuser              29
    gnome-initial-setup  977
    sshd                 74
    avahi                70
    rngd                 976
    tcpdump              72
    wang                 1000
    
    • Print a title row to form a table
    [root@demo-c8 ~]# awk -F":"  'BEGIN{print "|---------------------------|\n|       Username&UID        |\n-----------------------------"}{printf "|%-20s|%-6s|\n-----------------------------\n", $1,$3 }' /etc/passwd 
    |---------------------------|
    |       Username&UID        |
    -----------------------------
    |root                |0     |
    -----------------------------
    |bin                 |1     |
    -----------------------------
    |daemon              |2     |
    -----------------------------
    |adm                 |3     |
    -----------------------------
    |lp                  |4     |
    -----------------------------
    |sync                |5     |
    -----------------------------
    |shutdown            |6     |
    -----------------------------
    |halt                |7     |
    -----------------------------
    |mail                |8     |
    -----------------------------
    |operator            |11    |
    -----------------------------
    |games               |12    |
    -----------------------------
    |ftp                 |14    |
    -----------------------------
    |nobody              |65534 |
    -----------------------------
    |dbus                |81    |
    -----------------------------
    |systemd-coredump    |999   |
    -----------------------------
    |systemd-resolve     |193   |
    -----------------------------
    |tss                 |59    |
    -----------------------------
    |polkitd             |998   |
    -----------------------------
    |geoclue             |997   |
    -----------------------------
    |rtkit               |172   |
    -----------------------------
    |pulse               |171   |
    -----------------------------
    |libstoragemgmt      |996   |
    -----------------------------
    |qemu                |107   |
    -----------------------------
    |usbmuxd             |113   |
    -----------------------------
    |unbound             |995   |
    -----------------------------
    |rpc                 |32    |
    -----------------------------
    |gluster             |994   |
    -----------------------------
    |chrony              |993   |
    -----------------------------
    |setroubleshoot      |992   |
    -----------------------------
    |pipewire            |991   |
    -----------------------------
    |saslauth            |990   |
    -----------------------------
    |dnsmasq             |984   |
    -----------------------------
    |radvd               |75    |
    -----------------------------
    |clevis              |983   |
    -----------------------------
    |cockpit-ws          |982   |
    -----------------------------
    |cockpit-wsinstance  |981   |
    -----------------------------
    |sssd                |980   |
    -----------------------------
    |flatpak             |979   |
    -----------------------------
    |colord              |978   |
    -----------------------------
    |gdm                 |42    |
    -----------------------------
    |rpcuser             |29    |
    -----------------------------
    |gnome-initial-setup |977   |
    -----------------------------
    |sshd                |74    |
    -----------------------------
    |avahi               |70    |
    -----------------------------
    |rngd                |976   |
    -----------------------------
    |tcpdump             |72    |
    -----------------------------
    |wang                |1000  |
    -----------------------------
    

    1.5 Operators

    Arithmetic operators:

    x+y, x-y, x*y, x/y, x^y, x%y
    -x: negates x
    +x: converts a string to a number
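    A minimal sketch of the operators above, run from a BEGIN block so that no input file is needed. The values 7 and 3 and the variable name arith are arbitrary choices for illustration:

```shell
# Evaluate each operator once inside a BEGIN block (no input file needed).
arith=$(awk 'BEGIN {
    x = 7; y = 3
    print x + y, x - y, x * y, x % y, x ^ y   # 10 4 21 1 343
    print -x                                   # unary minus
    print +"12abc"                             # numeric prefix of the string
}')
echo "$arith"
```

    Division is left out only because 7/3 prints with awk's default number format, which makes the output less tidy.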
    

    String operator: concatenation has no operator symbol; adjacent operands are simply joined
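    As a small illustration of concatenation by juxtaposition (the variable names user, uid and joined are made up for this sketch):

```shell
joined=$(awk 'BEGIN {
    user = "root"; uid = 0
    line = user ":" uid        # three operands joined by juxtaposition
    print line
    print 1 " " 2, 1 + 2       # concatenation gives "1 2", addition gives 3
}')
echo "$joined"
```

    The comma in print separates output fields with a space, whereas juxtaposition joins operands with nothing in between.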

    Assignment operators:

    =, +=, -=, *=, /=, %=, ^=,++, --
    

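    Example: increment

    [19:29:30 root@centos7 ~]#awk 'BEGIN{i=0;print i++,i}' # post-increment: prints the old value of i, increments it, then prints i again
    0 1
    [19:29:49 root@centos7 ~]#awk 'BEGIN{i=0;print ++i,i}' # pre-increment: increments i first, then prints it twice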
    1 1
    

    Comparison operators:

    ==, !=, >, >=, <, <=
    

    Example: print the second record of /etc/issue

    [19:36:17 root@centos7 ~]#awk 'NR==2' /etc/issue
    Kernel \r on an \m
    

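    Example: print records with UID>=1000. The run below omits -F":", so awk splits on whitespace, $3 is a text fragment rather than the UID, and the comparison falls back to string ordering; that is why three low-UID accounts match. The intended command is awk -F":" '$3>=1000' /etc/passwd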

    [19:36:26 root@centos7 ~]#awk '$3>=1000' /etc/passwd
    systemd-network:x:192:192:systemd Network Management:/:/sbin/nologin
    dbus:x:81:81:System message bus:/:/sbin/nologin
    polkitd:x:999:998:User for polkitd:/:/sbin/nologin
    

    Example: select even or odd lines

    [19:38:39 root@centos7 ~]#seq 10 | awk 'NR%2==0'
    2
    4
    6
    8
    10
    [19:39:16 root@centos7 ~]#seq 10 | awk 'NR%2==1'
    1
    3
    5
    7
    9
    [19:39:19 root@centos7 ~]#seq 10 | awk 'NR%2!=0'
    1
    3
    5
    7
    9
    

    Example: extract the IP address

    [19:45:44 root@centos7 ~]#ifconfig eth0
    eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 192.168.192.7  netmask 255.255.255.0  broadcast 192.168.192.255
            inet6 fe80::20c:29ff:fe0d:c854  prefixlen 64  scopeid 0x20<link>
            ether 00:0c:29:0d:c8:54  txqueuelen 1000  (Ethernet)
            RX packets 3731  bytes 324962 (317.3 KiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 1901  bytes 235371 (229.8 KiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
    [19:45:29 root@centos7 ~]#ifconfig eth0 | awk 'NR==2{print $2}'
    192.168.192.7
    

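    Pattern-matching operators:

    ~   true if the left operand matches the regular expression on the right, i.e. the left side contains a match
    !~  true if the left operand does not match the regular expression on the right
    

    Format:

    awk OPTIONS '$FIELD ~ /REGEXP/{ACTION}' FILE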
    

    Example: print the user name of every record that contains "root" anywhere in the line

    [root@demo-c8 opt]# awk -F":" '$0 ~ /root/{print $1}' /etc/passwd
    root
    operator
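    Matching against a single field instead of $0 narrows the test. A minimal sketch, assuming three passwd-style lines held in a hypothetical variable passwd_sample rather than the real /etc/passwd:

```shell
passwd_sample='root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
wang:x:1000:1000:wang:/home/wang:/bin/bash'

# $NF ~ /bash$/  : the last field ends in "bash"
bash_users=$(printf '%s\n' "$passwd_sample" | awk -F: '$NF ~ /bash$/ {print $1}')
# $1 !~ /^root$/ : the first field is not exactly "root"
non_root=$(printf '%s\n' "$passwd_sample" | awk -F: '$1 !~ /^root$/ {print $1}')

echo "$bash_users"
echo "$non_root"
```

    Here operator no longer matches the first test even though its home directory is /root, because only the last field is examined.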
    

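    Conditional (ternary) expression:

    selector?if-true-expression:if-false-expression
    
    • selector: the condition to evaluate
    • if-true-expression: evaluated when the condition is true
    • if-false-expression: evaluated when the condition is false

    Example: test whether the third field of each line is at least 1000. If so, usertype is set to "Common User"; otherwise it is set to "SysUser". Then print $1 and the value of usertype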

    [root@demo-c8 opt]# awk -F: '{$3>=1000?usertype="Common User":usertype="SysUser";printf "%-20s:%12s\n",$1,usertype}'  /etc/passwd
    root                :     SysUser
    bin                 :     SysUser
    daemon              :     SysUser
    adm                 :     SysUser
    lp                  :     SysUser
    sync                :     SysUser
    shutdown            :     SysUser
    halt                :     SysUser
    mail                :     SysUser
    operator            :     SysUser
    games               :     SysUser
    ftp                 :     SysUser
    nobody              : Common User
    dbus                :     SysUser
    systemd-coredump    :     SysUser
    systemd-resolve     :     SysUser
    tss                 :     SysUser
    polkitd             :     SysUser
    geoclue             :     SysUser
    rtkit               :     SysUser
    pulse               :     SysUser
    libstoragemgmt      :     SysUser
    qemu                :     SysUser
    usbmuxd             :     SysUser
    unbound             :     SysUser
    rpc                 :     SysUser
    gluster             :     SysUser
    chrony              :     SysUser
    setroubleshoot      :     SysUser
    pipewire            :     SysUser
    saslauth            :     SysUser
    dnsmasq             :     SysUser
    radvd               :     SysUser
    clevis              :     SysUser
    cockpit-ws          :     SysUser
    cockpit-wsinstance  :     SysUser
    sssd                :     SysUser
    flatpak             :     SysUser
    colord              :     SysUser
    gdm                 :     SysUser
    rpcuser             :     SysUser
    gnome-initial-setup :     SysUser
    sshd                :     SysUser
    avahi               :     SysUser
    rngd                :     SysUser
    tcpdump             :     SysUser
    wang                : Common User
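    The same classification can also be written with the assignment outside the ternary, which is arguably easier to read. A sketch on three made-up passwd-style lines (the variable name labels is an assumption of this example):

```shell
labels=$(printf 'root:x:0:0\nnobody:x:65534:65534\nwang:x:1000:1000\n' |
    awk -F: '{
        # The ternary yields a value; a single assignment stores it.
        usertype = ($3 >= 1000) ? "Common User" : "SysUser"
        printf "%s:%s\n", $1, usertype
    }')
echo "$labels"
```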
    

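    1.6 Patterns (PATTERN)

    PATTERN: selects the lines that satisfy the pattern; only those lines are processed

    1. Not specified: the empty pattern matches every line

    Example: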

    [root@demo-c8 ~]# awk -F":" '{print $1,$3}' /etc/passwd
    root 0
    bin 1
    daemon 2
    adm 3
    lp 4
    sync 5
    shutdown 6
    ...
    
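    2. /regular expression/: only lines matched by the regular expression are processed; the expression must be enclosed in / /
    • If the action is omitted, the default action is print $0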
    [root@demo-c8 ~]# awk '/^UUID/' /etc/fstab
    UUID=b1ab1ace-2582-4afd-8693-39bd9855041c /                       xfs     defaults        0 0
    UUID=d5131695-82b3-4a23-bc28-5c8a4bf381a0 /boot                   ext4    defaults        1 2
    UUID=bdd66510-e510-4fe7-ba71-e2a35e6dc492 /data                   xfs     defaults        0 0
    UUID=05c944fb-d6f9-4544-ba10-8b7bf3cc8fed swap                    swap    defaults        0 0
    

    Example: extract the usage percentage of each partition

    [root@demo-c8 ~]# df
    Filesystem     1K-blocks    Used Available Use% Mounted on
    devtmpfs         3957244       0   3957244   0% /dev
    tmpfs            3985412       0   3985412   0% /dev/shm
    tmpfs            3985412    9832   3975580   1% /run
    tmpfs            3985412       0   3985412   0% /sys/fs/cgroup
    /dev/sda2       41922560 4561428  37361132  11% /
    /dev/sda5       41922560  325332  41597228   1% /data
    /dev/sda1         999320  192552    737956  21% /boot
    tmpfs             797080    1168    795912   1% /run/user/42
    tmpfs             797080       4    797076   1% /run/user/0
    
    [root@demo-c8 ~]# df | awk -F"[[:space:]]+|%" '/^\/dev\/sd/{print $5}'
    11
    1
    21
    
    [root@demo-c8 ~]# df | awk -F"[[:space:]]+|%" '/^\/dev\/sd/{print $5}' | sort -n | tail -n1
    21
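    The maximum can also be tracked inside awk itself, without sort and tail. A sketch that assumes the three /dev/sd lines from the df output above, stored in a hypothetical variable df_sample, and uses the simpler separator " +|%":

```shell
# Three data lines copied from the df output shown earlier.
df_sample='/dev/sda2       41922560 4561428  37361132  11% /
/dev/sda5       41922560  325332  41597228   1% /data
/dev/sda1         999320  192552    737956  21% /boot'

max_use=$(printf '%s\n' "$df_sample" |
    awk -F' +|%' '/^\/dev\/sd/ { if ($5 + 0 > max) max = $5 + 0 } END { print max + 0 }')
echo "$max_use"
```

    Adding 0 forces a numeric comparison, so 9 would correctly rank below 21.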
    
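    3. relational expression: a line is processed only when the expression evaluates to true

    True: a non-zero number or a non-empty string
    False: zero or the empty string

    Example: !0 is true and !1 is false. Numbers are written without double quotes ""; the string "0" is non-empty and therefore true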

    [root@demo-c8 ~]# awk '!0' /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    bin:x:1:1:bin:/bin:/sbin/nologin
    daemon:x:2:2:daemon:/sbin:/sbin/nologin
    adm:x:3:4:adm:/var/adm:/sbin/nologin
    lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
    sync:x:5:0:sync:/sbin:/bin/sync
    shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
    halt:x:7:0:halt:/sbin:/sbin/halt
    mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
    operator:x:11:0:operator:/root:/sbin/nologin
    games:x:12:100:games:/usr/games:/sbin/nologin
    ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
    nobody:x:65534:65534:Kernel Overflow User:/:/sbin/nologin
    dbus:x:81:81:System message bus:/:/sbin/nologin
    systemd-coredump:x:999:997:systemd Core Dumper:/:/sbin/nologin
    systemd-resolve:x:193:193:systemd Resolver:/:/sbin/nologin
    tss:x:59:59:Account used by the trousers package to sandbox the tcsd daemon:/dev/null:/sbin/nologin
    polkitd:x:998:996:User for polkitd:/:/sbin/nologin
    geoclue:x:997:995:User for geoclue:/var/lib/geoclue:/sbin/nologin
    rtkit:x:172:172:RealtimeKit:/proc:/sbin/nologin
    pulse:x:171:171:PulseAudio System Daemon:/var/run/pulse:/sbin/nologin
    libstoragemgmt:x:996:992:daemon account for libstoragemgmt:/var/run/lsm:/sbin/nologin
    qemu:x:107:107:qemu user:/:/sbin/nologin
    usbmuxd:x:113:113:usbmuxd user:/:/sbin/nologin
    unbound:x:995:990:Unbound DNS resolver:/etc/unbound:/sbin/nologin
    rpc:x:32:32:Rpcbind Daemon:/var/lib/rpcbind:/sbin/nologin
    gluster:x:994:989:GlusterFS daemons:/run/gluster:/sbin/nologin
    chrony:x:993:988::/var/lib/chrony:/sbin/nologin
    setroubleshoot:x:992:986::/var/lib/setroubleshoot:/sbin/nologin
    pipewire:x:991:985:PipeWire System Daemon:/var/run/pipewire:/sbin/nologin
    saslauth:x:990:76:Saslauthd user:/run/saslauthd:/sbin/nologin
    dnsmasq:x:984:984:Dnsmasq DHCP and DNS server:/var/lib/dnsmasq:/sbin/nologin
    radvd:x:75:75:radvd user:/:/sbin/nologin
    clevis:x:983:982:Clevis Decryption Framework unprivileged user:/var/cache/clevis:/sbin/nologin
    cockpit-ws:x:982:980:User for cockpit web service:/nonexisting:/sbin/nologin
    cockpit-wsinstance:x:981:979:User for cockpit-ws instances:/nonexisting:/sbin/nologin
    sssd:x:980:978:User for sssd:/:/sbin/nologin
    flatpak:x:979:977:User for flatpak system helper:/:/sbin/nologin
    colord:x:978:976:User for colord:/var/lib/colord:/sbin/nologin
    gdm:x:42:42::/var/lib/gdm:/sbin/nologin
    rpcuser:x:29:29:RPC Service User:/var/lib/nfs:/sbin/nologin
    gnome-initial-setup:x:977:975::/run/gnome-initial-setup/:/sbin/nologin
    sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
    avahi:x:70:70:Avahi mDNS/DNS-SD Stack:/var/run/avahi-daemon:/sbin/nologin
    rngd:x:976:974:Random Number Generator Daemon:/var/lib/rngd:/sbin/nologin
    tcpdump:x:72:72::/:/sbin/nologin
    wang:x:1000:1000:wang:/home/wang:/bin/bash
    
    [root@demo-c8 ~]# awk '!1' /etc/passwd
    

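    Example: the !n++ expression

    1. On the first line n++ yields the old value 0 and then sets n to 1; !0 is 1, which is true, so the first line is printed
    2. On the second line n++ yields 1 and sets n to 2; !1 is 0, which is false, so the second line is skipped
    3. awk still evaluates the pattern for every remaining line, but n keeps growing and is never 0 again, so !n++ stays false and no further line is printed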
    [root@demo-c8 ~]# awk -v n=0 '!n++' /etc/passwd
    root:x:0:0:root:/root:/bin/bash
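    A common application of the same counter trick keeps one counter per distinct line, so that each line is printed only the first time it is seen. A minimal sketch on made-up input (the array name seen is a conventional but arbitrary choice):

```shell
# seen[$0]++ is 0 (false) the first time a line appears, so !seen[$0]++ is true once per distinct line.
unique=$(printf 'a\nb\na\nc\nb\n' | awk '!seen[$0]++')
echo "$unique"
```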
    

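    Example: the !++n expression

    1. n starts at 0; ++n increments first and yields 1, !1 is 0, which is false, so the first line is skipped. n only grows afterwards, so no line is ever printed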
    [root@demo-c8 opt]# awk -v n=0 '!++n' /etc/passwd 
    [root@demo-c8 opt]# 
    
    

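    Summary: when the pattern is a relational expression, awk evaluates it once per input line, before the action runs. If the result is true the line is processed; if it is false the line is skipped. The expression is then evaluated again for the next line, including any side effects such as n++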

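    Example: i=!i prints the odd lines

    1. i is unset at first, which counts as false, so !i is true; the assignment stores 1 in i and the first line is printed
    2. On the second line i is true, so !i is false; i becomes 0 and the line is skipped
    3. On the third line i is false again, so !i is true and the line is printed; the value keeps alternating, so only odd lines are printed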
    [root@demo-c8 opt]# seq 10 | awk 'i=!i' 
    1
    3
    5
    7
    9
    

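    Example: print the even lines

    Initialize i to 1 (true) so that !i is false on the first line; the alternation is reversed and only even lines are printed. Negating the whole expression, as in the second command, has the same effect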

    [root@demo-c8 opt]# seq 10 | awk -v i=1 'i=!i' 
    2
    4
    6
    8
    10
    
    [root@demo-c8 opt]# seq 10 | awk '!(i=!i)' 
    2
    4
    6
    8
    10
    
    4. line ranges

    A bare line number cannot be used as a pattern, but the built-in variable NR can select lines by number

    Example:

    [root@demo-c8 opt]# seq 10 | awk 'NR>=3 && NR<=6'
    3
    4
    5
    6
    

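    /pat1/,/pat2/ selects from the first line matching pat1 through the next line matching pat2. Bare numbers are not accepted as range endpoints; the endpoints are patterns

    Example: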

    [root@demo-c8 opt]# awk '/^root/,/^nobody/' /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    bin:x:1:1:bin:/bin:/sbin/nologin
    daemon:x:2:2:daemon:/sbin:/sbin/nologin
    adm:x:3:4:adm:/var/adm:/sbin/nologin
    lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
    sync:x:5:0:sync:/sbin:/bin/sync
    shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
    halt:x:7:0:halt:/sbin:/sbin/halt
    mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
    operator:x:11:0:operator:/root:/sbin/nologin
    games:x:12:100:games:/usr/games:/sbin/nologin
    ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
    nobody:x:65534:65534:Kernel Overflow User:/:/sbin/nologin
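    The two endpoints of a range can be any patterns, including comparisons on NR. A minimal sketch using seq as input (the variable names by_regex and by_nr are made up):

```shell
# Range between two regular expressions.
by_regex=$(seq 10 | awk '/^3$/,/^6$/')
# Range between two NR comparisons; selects the same lines here.
by_nr=$(seq 10 | awk 'NR==3,NR==6')
echo "$by_regex"
echo "$by_nr"
```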
    

    2. Conditionals and loops in AWK

    2.1 Conditionals: if-else

    Syntax:

    if(condition)statement
    if(condition){statement;…}[else{statement;…}]
    if(condition1){statement1}else if(condition2){statement2}else if(condition3){statement3}......else{statementN}
    

    Use case: testing a whole line or a single field obtained by awk

    Example: '{if(condition)statement}'

    [root@demo-c8 opt]# awk -F":" '{if($NF=="/bin/bash")print $1}' /etc/passwd
    root
    wang
    
    [root@demo-c8 opt]# awk -F":" '{if($NF>5)print $0}' /etc/fstab  # the UUID lines contain no ":", so $NF is the whole line and is compared with 5 as a string
    UUID=b1ab1ace-2582-4afd-8693-39bd9855041c /                       xfs     defaults        0 0
    UUID=d5131695-82b3-4a23-bc28-5c8a4bf381a0 /boot                   ext4    defaults        1 2
    UUID=bdd66510-e510-4fe7-ba71-e2a35e6dc492 /data                   xfs     defaults        0 0
    UUID=05c944fb-d6f9-4544-ba10-8b7bf3cc8fed swap                    swap    defaults        0 0
    
    [root@demo-c8 opt]# awk -F":" '{if($3>=1000)print $1,$3}' /etc/passwd
    nobody 65534
    wang 1000
    

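    Example: '{if(condition1){statement1}else{statement2}}'. Accounts with UID>=1000 are printed with their UID; all others, which are in fact system accounts, get the fixed label "common user"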

    [root@demo-c8 opt]# awk -F":" '{if($3>=1000){printf "%-20s %s\n", $1,$3}else{printf "%-20s %s\n", $1, "common user"}}' /etc/passwd
    root                 common user
    bin                  common user
    daemon               common user
    adm                  common user
    lp                   common user
    sync                 common user
    shutdown             common user
    halt                 common user
    mail                 common user
    operator             common user
    games                common user
    ftp                  common user
    nobody               65534
    dbus                 common user
    systemd-coredump     common user
    systemd-resolve      common user
    tss                  common user
    polkitd              common user
    geoclue              common user
    rtkit                common user
    pulse                common user
    libstoragemgmt       common user
    qemu                 common user
    usbmuxd              common user
    unbound              common user
    rpc                  common user
    gluster              common user
    chrony               common user
    setroubleshoot       common user
    pipewire             common user
    saslauth             common user
    dnsmasq              common user
    radvd                common user
    clevis               common user
    cockpit-ws           common user
    cockpit-wsinstance   common user
    sssd                 common user
    flatpak              common user
    colord               common user
    gdm                  common user
    rpcuser              common user
    gnome-initial-setup  common user
    sshd                 common user
    avahi                common user
    rngd                 common user
    tcpdump              common user
    wang                 1000
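    The else-if chain from the syntax list can be sketched as follows. The three input lines and the labels superuser, system and regular are made up for this illustration:

```shell
tiers=$(printf 'root:x:0:0\nsshd:x:74:74\nwang:x:1000:1000\n' |
    awk -F: '{
        if ($3 == 0) {
            kind = "superuser"
        } else if ($3 < 1000) {
            kind = "system"
        } else {
            kind = "regular"
        }
        print $1, kind
    }')
echo "$tiers"
```

    The branches are tested in order and the first true condition wins, so UID 0 never reaches the second test.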
    

    Example: list partitions whose usage exceeds 10%

    [root@demo-c8 opt]# df 
    Filesystem     1K-blocks    Used Available Use% Mounted on
    devtmpfs         3957244       0   3957244   0% /dev
    tmpfs            3985412       0   3985412   0% /dev/shm
    tmpfs            3985412    9832   3975580   1% /run
    tmpfs            3985412       0   3985412   0% /sys/fs/cgroup
    /dev/sda2       41922560 4561796  37360764  11% /
    /dev/sda5       41922560  325332  41597228   1% /data
    /dev/sda1         999320  192552    737956  21% /boot
    tmpfs             797080    1168    795912   1% /run/user/42
    tmpfs             797080       4    797076   1% /run/user/0
    
    [root@demo-c8 opt]# df | awk -F"[[:space:]]+|%" '/^\/dev\/sd/{if($5>10)print $1, $5}'
    /dev/sda2 11
    /dev/sda1 21
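    Which field holds the percentage depends on how the separator treats the "%" followed by a space. A small check on one made-up df-style line, comparing the alternation " +|%" with the bracket expression "[ %]+" (the variable names are assumptions of this sketch):

```shell
line='/dev/sda2  41922560 11% /'

# " +|%": the "%" and the following space are two separators, leaving an empty field between them.
nf_alt=$(printf '%s\n' "$line" | awk -F' +|%' '{print NF}')
# "[ %]+": the "%" and the space together form one separator.
nf_class=$(printf '%s\n' "$line" | awk -F'[ %]+' '{print NF}')

echo "$nf_alt $nf_class"
```

    Both forms leave the number 11 in the third field here; they differ only in the numbering of the fields that follow.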
    
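    [ %]+ : one or more characters, each of which is a space or a percent sign
    [[:space:]]+|% : one or more whitespace characters, or a single percent sign
    " +|%" : one or more spaces, or a single percent sign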
    

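    2.2 The switch statement

    Similar to the case statement in the shell. switch is a gawk extension; as in C, execution falls through to the next case unless the branch ends with break

    Syntax:

    switch(expression) {case VALUE1 or /REGEXP/: statement1; case VALUE2 or /REGEXP2/: statement2; ...; default: statementn}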
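    A minimal sketch of switch, assuming gawk is installed (other awk implementations may not accept the statement, so the example checks for gawk first). The input words and labels are made up:

```shell
if command -v gawk >/dev/null 2>&1; then
    kinds=$(printf 'xfs\next4\nswap\nnfs\n' | gawk '{
        switch ($1) {
        case "swap":                      # exact value
            print $1, "swap space"
            break
        case /^(xfs|ext[234])$/:          # regular expression
            print $1, "local filesystem"
            break
        default:
            print $1, "other"
        }
    }')
else
    kinds="gawk not installed"
fi
echo "$kinds"
```

    Each branch ends with break; without it, execution would continue into the following case.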
    

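    2.3 The while loop

    The loop body runs while the condition is true and the loop exits once it is false
    Use cases:

    Applying the same processing to each field of a line in turn
    Processing each element of an array in turn
    

    Syntax:

    while (condition) {statement;…}
    

    Example: the built-in length() function returns the number of characters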

    [root@demo-c8 opt]# awk  'BEGIN{print length("您好世界")}' 
    4
    [root@demo-c8 opt]# awk  'BEGIN{print length("hello world")}' 
    11
    

    Example: process a line field by field

    [20:08:29 root@centos-7-6 ~]#awk '/^[[:space:]]*linux16/{i=1;while(i<=NF){print $i,length($i); i++}}' /etc/grub2.cfg
    linux16 7
    /vmlinuz-3.10.0-1127.el7.x86_64 31
    root=UUID=ffd4773d-a205-4937-bdff-4f80e73b6ad0 46
    ro 2
    crashkernel=auto 16
    rhgb 4
    quiet 5
    net.ifnames=0 13
    linux16 7
    /vmlinuz-0-rescue-e5ad2c92a25b4c87b7cca719c77091ff 50
    root=UUID=ffd4773d-a205-4937-bdff-4f80e73b6ad0 46
    ro 2
    crashkernel=auto 16
    rhgb 4
    quiet 5
    net.ifnames=0 13
    
    [20:08:37 root@centos-7-6 ~]#awk '/^[[:space:]]*linux16/{i=1;while(i<=NF) {if(length($i)>=10){print $i,length($i)}; i++}}' /etc/grub2.cfg
    /vmlinuz-3.10.0-1127.el7.x86_64 31
    root=UUID=ffd4773d-a205-4937-bdff-4f80e73b6ad0 46
    crashkernel=auto 16
    net.ifnames=0 13
    /vmlinuz-0-rescue-e5ad2c92a25b4c87b7cca719c77091ff 50
    root=UUID=ffd4773d-a205-4937-bdff-4f80e73b6ad0 46
    crashkernel=auto 16
    net.ifnames=0 13
    

    Example: print 1, 2, ... 100

    [root@demo-c8 opt]# awk -v i=1 'BEGIN{while(i<=100){print i;i++}}'
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    66
    67
    68
    69
    70
    71
    72
    73
    74
    75
    76
    77
    78
    79
    80
    81
    82
    83
    84
    85
    86
    87
    88
    89
    90
    91
    92
    93
    94
    95
    96
    97
    98
    99
    100
    

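    Note:

    1. awk variables are referenced by name, without a `$` prefix. `$` is used only for field references such as $1
    

    Example: print the sum of 1 to 100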

    [root@demo-c8 opt]# awk -v i=1 -v sum=0 'BEGIN{while (i<=100){sum+=i; i++};print sum}'
    5050
    

    2.4 The do-while loop

    Syntax:

    do {statement;…}while(condition)
    

    Meaning: the loop body runs at least once, whether the condition is true or false

    Example:

    [root@demo-c8 opt]# awk 'BEGIN{ total=0;i=1;do{ total+=i;i++;}while(i<=100);print total}'
    5050
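    The difference from while shows up when the condition is false from the start. A minimal sketch that counts how often each loop body runs (the counters w and d are made up for the example):

```shell
runs=$(awk 'BEGIN {
    i = 10
    while (i < 5) { w++; i++ }        # condition is false at once: body never runs
    i = 10
    do { d++; i++ } while (i < 5)     # body runs once before the first test
    print w + 0, d + 0
}')
echo "$runs"
```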
    

    2.5 The for loop

    Syntax:

    for(expr1;expr2;expr3) {statement;…}
    

    Common usage:

    for(variable assignment;condition;iteration process) {for-body}
    

    Special usage: for(key in array) iterates over the elements of an array

    Example: a for loop that prints 1 to 100

    [root@demo-c8 opt]# awk 'BEGIN{for(i=1;i<=100;i++){print i}}'
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    66
    67
    68
    69
    70
    71
    72
    73
    74
    75
    76
    77
    78
    79
    80
    81
    82
    83
    84
    85
    86
    87
    88
    89
    90
    91
    92
    93
    94
    95
    96
    97
    98
    99
    100
    

    Example: a for loop that computes the sum of 1 to 100

    [root@demo-c8 opt]# awk 'BEGIN{sum=0;for(i=1;i<=100;i++){sum+=i};print sum}'
    5050
    

    Example: comparing the speed of awk, the shell and bc when summing 1+...+1000000

    [root@demo-c8 opt]# time (awk 'BEGIN{sum=0;for(i=1;i<=1000000;i++){sum+=i};print sum}')
    500000500000
    
    real    0m0.079s
    user    0m0.076s
    sys 0m0.003s
    
    [root@demo-c8 opt]# time (sum=0; for((i=1;i<=1000000;i++));do let sum+=i;done;echo $sum)
    500000500000
    
    real    0m5.446s
    user    0m5.408s
    sys 0m0.000s
    
    [root@demo-c8 opt]# time (seq -s+ 1000000 |bc)
    500000500000
    
    real    0m0.432s
    user    0m0.343s
    sys 0m0.150s
    

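    2.6 continue and break

    continue: skips the rest of the current iteration
    break: exits the enclosing loop

    Format (unlike the shell built-ins, the awk statements take no numeric argument):

    continue
    break
    

    Example: continue, print the sum of the odd numbers from 1 to 100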

    [root@demo-c8 opt]# awk -v sum=0 'BEGIN{for(i=1;i<=100;i++){if(i%2==0)continue; sum+=i};print sum}'
    2500
    

    Example: break, print 1 to 49

    [root@demo-c8 opt]# awk 'BEGIN{for(i=1;i<=100;i++){if (i>=50){break}else{print i}}}'
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
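    Both statements can appear in the same loop. A sketch that collects the odd numbers below 10 (the variable names picked and out are assumptions of this example):

```shell
picked=$(awk 'BEGIN {
    for (i = 1; i <= 100; i++) {
        if (i % 2 == 0) continue              # skip even numbers, keep looping
        if (i > 9) break                      # leave the loop at the first odd number above 9
        out = (out == "" ? i : out " " i)     # append i to a space-separated list
    }
    print out
}')
echo "$picked"
```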
    

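    2.7 next

    next stops processing the current line early and moves on to the next line (awk's implicit per-line loop)

    Example: print the user name and UID of users whose UID is odd

    [root@demo-c8 opt]# awk -F":" '{if($3%2==0)next; printf "%-20s %s\n", $1,$3}' /etc/passwd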
    bin                  1
    adm                  3
    sync                 5
    halt                 7
    operator             11
    dbus                 81
    systemd-coredump     999
    systemd-resolve      193
    tss                  59
    geoclue              997
    pulse                171
    qemu                 107
    usbmuxd              113
    unbound              995
    chrony               993
    pipewire             991
    radvd                75
    clevis               983
    cockpit-wsinstance   981
    flatpak              979
    rpcuser              29
    gnome-initial-setup  977
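    next is often placed in an early rule so that later rules only see the lines of interest. A minimal sketch on a made-up fstab-style sample (the UUID values and the variable names are hypothetical):

```shell
fstab_sample='# /etc/fstab

UUID=aaaa / xfs defaults 0 0
UUID=bbbb swap swap defaults 0 0'

fs_types=$(printf '%s\n' "$fstab_sample" | awk '
    /^#/ || NF == 0 { next }   # comment or blank line: go straight to the next record
    { print $3 }               # only reached for real entries
')
echo "$fs_types"
```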
    

    3. Arrays

    All awk arrays are associative arrays

    Format:

    array_name[index-expression]
    

    Example: the same notation in the shell, for comparison

    [root@demo-c8 opt]# weekdays["mon"]="monday"
    [root@demo-c8 opt]# echo ${weekdays["mon"]}
    monday
    

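    index-expression

    • Arrays provide key/value storage
    • Any string can be used as an index; string literals must be enclosed in double quotes
    • If an element (key) does not exist when it is referenced, awk creates it automatically and initializes its value to the empty string ("")
    • To test whether an element exists, use the form index in array; it returns 1 if the element exists and 0 if it does not

    Note: awk associative arrays do not support [*] for retrieving all values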

    [root@demo-c8 opt]# awk 'BEGIN{weekdays["mon"]="monday";weekdays["tue"]="tuesday";weekdays["wed"]="wednesday";print weekdays["mon"]}'
    monday
    

    Example: test whether an array contains a given key

    [root@demo-c8 opt]# awk 'BEGIN{weekdays["mon"]="monday";weekdays["tue"]="tuesday";weekdays["wed"]="wednesday";print "thur" in weekdays; print "mon" in weekdays}'
    0
    1
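    The in test does not create the element, whereas a plain reference does. A small sketch that shows both effects and then removes the element again with delete (the variable name membership is made up):

```shell
membership=$(awk 'BEGIN {
    weekdays["mon"] = "monday"
    print ("tue" in weekdays)     # 0: the membership test creates nothing
    x = weekdays["tue"]           # a plain reference creates the element
    print ("tue" in weekdays)     # 1
    delete weekdays["tue"]
    print ("tue" in weekdays)     # 0 again after delete
}')
echo "$membership"
```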
    

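    To visit every value in an array, use a for loop. On each iteration the loop variable receives one key of the array; the order of the keys is unspecified

    Example: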

    [root@demo-c8 opt]# awk 'BEGIN{user["name"]="root"; user["uid"]="0"; user["password"]="Y"; for(i in user){print user[i]}}'
    Y
    root
    0
    

    Example: count how often each connection state occurs on the host

    • Counting with sort and uniq -c on the ss output
    [root@demo-c8 opt]# ss -nta | awk 'NR!=1{print $1}' | sort | uniq -c
          1 ESTAB
          7 LISTEN
    
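    • state[$1]++: creates an awk array named state that uses each distinct value of $1, the connection state, as a key; ++ increments the counter stored under that key. The resulting array maps each state to the number of times it occurred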
    [root@demo-c8 opt]# ss -ant | awk 'NR!=1{state[$1]++}END{for(i in state){print i,state[i]}}'
    LISTEN 7
    ESTAB 1
    

    Example: count the requests per client IP in a web access log

    [root@demo-c8 opt]# cat access_log | head -n1
    172.18.118.91 - - [20/May/2018:08:09:59 +0800] "GET / HTTP/1.1" 200 912 "-" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 5.1; Trident/5.0)"
    
    [root@demo-c8 opt]# awk '{ip[$1]++}END{for(i in ip){printf "%-20s %s\n", ip[i],  i}}' access_log  | sort -nr | head -n5
    4870                 172.20.116.228
    3429                 172.20.116.208
    2834                 172.20.0.222
    2613                 172.20.112.14
    2267                 172.20.0.227
    

    Example: use /etc/hosts.deny to reject SSH access from other servers

    • /etc/hosts.deny relies on TCP Wrappers, which works only on CentOS 7 and older releases; CentOS 8 no longer supports it
    [20:10:41 root@centos-7-6 ~]#echo "sshd: 10.0.0.108" >> /etc/hosts.deny 
    
    [root@demo-c8 opt]# ssh root@10.0.0.187
    kex_exchange_identification: read: Connection reset by peer
    

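    4. awk functions

    awk functions are either built-in or user-defined

    4.1 Common built-in functions

    Numeric functions:

    • rand(): returns a random number between 0 and 1
    • srand(): sets the seed used by rand()
    • int(): returns the integer part

    Example: rand() must be paired with srand(); without it rand() returns the same value on every run. srand() with no argument seeds from the current time in seconds, so runs started within the same second also return the same value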

    [22:42:27 root@centos7 ~]#awk 'BEGIN{srand();print rand()}'
    0.482749
    [22:42:40 root@centos7 ~]#awk 'BEGIN{srand();print rand()}'
    0.12631
    [22:42:42 root@centos7 ~]#awk 'BEGIN{srand();print rand()}'
    0.12631
    [22:42:42 root@centos7 ~]#awk 'BEGIN{srand();print rand()}'
    0.12631
    [22:42:42 root@centos7 ~]#awk 'BEGIN{srand();print rand()}'
    0.0949145
    [22:42:44 root@centos7 ~]#awk 'BEGIN{srand();print rand()}'
    0.0949145
    [22:42:44 root@centos7 ~]#awk 'BEGIN{srand();print rand()}'
    0.25884
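    srand() also accepts an explicit seed, which makes a sequence repeatable. A minimal sketch in which the seed 42 is an arbitrary choice:

```shell
same=$(awk 'BEGIN {
    srand(42); a = rand()
    srand(42); b = rand()      # same seed, so the sequence restarts
    print (a == b)
}')
echo "$same"
```

    This is the same effect seen in the transcript above, where runs within one second share a time-based seed.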
    

    Example: print ten random integers below 100

    [22:45:52 root@centos7 ~]#awk 'BEGIN{srand();for(i=1;i<=10;i++)print int(rand()*100)}'
    13
    5
    63
    15
    36
    82
    91
    33
    86
    31
    

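    String functions:

    • length([s]): returns the length of the given string
    • sub(r,s,[t]): searches string t for the pattern r and replaces the first match with s

    Example: replacing text with sub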

    [22:46:11 root@centos7 ~]#echo "2008:08:08 08:08:08" | awk 'sub(/:/,"-",$1)'
    2008-08:08 08:08:08
    
    [22:47:26 root@centos7 ~]#echo "2008:08:08 08:08:08" | awk '{sub(/:/,"-",$1);print $0}'
    2008-08:08 08:08:08
    
    • gsub(r,s,[t]): searches string t for the pattern r and replaces every match with s

    Example:

    [22:48:10 root@centos7 ~]#echo "2008:08:08 08:08:08" | awk 'gsub(/:/,"-",$0)'
    2008-08-08 08-08-08
    [22:49:00 root@centos7 ~]#echo "2008:08:08 08:08:08" | awk '{gsub(/:/,"-",$0);print $0}'
    2008-08-08 08-08-08
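    sub and gsub return the number of replacements made, which is why they can stand alone as a pattern in the examples above. A sketch that stores the count and limits the edit to the first field (the variable names report and n are made up):

```shell
report=$(echo "2008:08:08 08:08:08" | awk '{
    n = gsub(/:/, "-", $1)     # only the first field is edited
    print n, $0
}')
echo "$report"
```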
    

    • split(s,array,[r]): splits string s on the separator r and stores the pieces in array; the first piece has index 1, the second index 2, …

    Example:

    [22:49:47 root@centos7 ~]#netstat -nt | awk '/^tcp/{split($5,ip,":");count[ip[1]]++}END{for(i in count){print i,count[i]}}'
    192.168.192.1 1
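    split returns the number of pieces, which is handy for loops. A minimal sketch on a made-up address:port string (the array name ep is an assumption):

```shell
parts=$(awk 'BEGIN {
    n = split("192.168.192.7:22", ep, ":")   # n is the number of pieces
    print n, ep[1], ep[2]
}')
echo "$parts"
```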
    

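    system(): runs a shell command from within awk

    Placing operands next to each other, separated by a space, is awk's string concatenation. To use an awk variable in the command, place it outside the quotes next to the quoted text; everything except awk variables goes inside ""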

    [22:50:43 root@centos7 ~]#awk 'BEGIN{system("hostname")}'
    centos7.mac
    [22:50:48 root@centos7 ~]#awk 'BEGIN{score=100; system("echo your score is " score) }'
    your score is 100
    

    Example: find IPs with at least 10 connections and add them to iptables

    [root@centos8 ~]#netstat -tn | awk '/^tcp/{split($5,ip,":");count[ip[1]]++}END{for(i in count){if(count[i]>=10){system("iptables -A INPUT -s "i" -j REJECT")}}}'
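    Before letting awk change firewall rules, it can help to print the commands instead of executing them. A dry-run sketch, assuming four made-up netstat-style lines and a lowered threshold passed in as the hypothetical variable limit:

```shell
conns='tcp 0 0 10.0.0.8:22 10.0.0.1:50001 ESTABLISHED
tcp 0 0 10.0.0.8:22 10.0.0.1:50002 ESTABLISHED
tcp 0 0 10.0.0.8:22 10.0.0.1:50003 ESTABLISHED
tcp 0 0 10.0.0.8:22 10.0.0.2:40000 ESTABLISHED'

rules=$(printf '%s\n' "$conns" | awk -v limit=3 '
    /^tcp/ { split($5, ip, ":"); count[ip[1]]++ }
    END {
        for (i in count)
            if (count[i] >= limit)
                print "iptables -A INPUT -s " i " -j REJECT"   # printed, not executed
    }')
echo "$rules"
```

    Once the printed commands look right, print can be swapped back for system().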
    

    4.2 User-defined functions

    Format of a user-defined function:

    function name ( parameter, parameter, ... ) {
       statements
       return expression
    }
    

    Example: return the larger of two numbers

    [22:53:20 root@centos7 ~]#vim func.awk
    
    function max(x,y) {
     return x>y?x:y
    }
    BEGIN{print max(a,b)}
    
    [22:53:52 root@centos7 ~]#awk -v a=30 -v b=20 -f func.awk
    30
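    awk has no keyword for local variables; the usual idiom is to declare extra parameters that callers never pass. A sketch with a hypothetical function sum_to:

```shell
total=$(awk '
    # Extra parameters after the gap (i, acc) act as local variables.
    function sum_to(n,    i, acc) {
        for (i = 1; i <= n; i++) acc += i
        return acc
    }
    BEGIN {
        i = "untouched"
        print sum_to(100), i     # the global i is not modified by the call
    }')
echo "$total"
```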
    

    5. awk scripts

    • Save the awk program in a script file, then pass it to awk or execute it directly

    Example:

    [22:56:52 root@centos7 ~]#vim passwd.awk
    
    {if($3<=1000)print $1,$3}
    
    [22:57:20 root@centos7 ~]#awk -F: -f passwd.awk /etc/passwd
    root 0
    bin 1
    daemon 2
    adm 3
    lp 4
    sync 5
    shutdown 6
    halt 7
    mail 8
    operator 11
    games 12
    ftp 14
    nobody 99
    systemd-network 192
    dbus 81
    polkitd 999
    sshd 74
    postfix 89
    tcpdump 72
    

    Example: an executable script (it needs the execute bit, e.g. chmod +x test.awk)

    [22:57:21 root@centos7 ~]#vim test.awk
    
    #!/bin/awk -f
    #this is an awk script
    {if($3<=1000)print $1,$3}
    
    [22:58:45 root@centos7 ~]#./test.awk -F: /etc/passwd
    root 0
    bin 1
    daemon 2
    adm 3
    lp 4
    sync 5
    shutdown 6
    halt 7
    mail 8
    operator 11
    games 12
    ftp 14
    nobody 99
    systemd-network 192
    dbus 81
    polkitd 999
    sshd 74
    postfix 89
    tcpdump 72
    
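    • Passing parameters to an awk script

    Format:

    awkfile var=value var2=value2... Inputfile
    

    Note: variables passed this way are not available in the BEGIN block; they are assigned only when awk reaches them among the command-line operands, just before the following input file is read. The -v option makes a value available before BEGIN runs. Each variable given this way needs its own -v

    Example: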

    [22:58:46 root@centos7 ~]#vim test2.awk
    #!/bin/awk -f
    {if($3 >=min && $3<=max)print $1,$3}    
    
    [23:00:17 root@centos7 ~]#chmod +x test2.awk
    [23:00:18 root@centos7 ~]#./test2.awk -F: min=100 max=200 /etc/passwd
    systemd-network 192
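    The timing difference between the two ways of passing a variable can be seen directly. A minimal sketch that reads one line from standard input (the operand - stands for standard input; the names v, late and early are made up):

```shell
# Operand form: v is still empty in BEGIN and is set before the input is read.
late=$(echo x | awk 'BEGIN {print "BEGIN:" v} {print "main:" v}' v=1 -)
# Option form: v is already set when BEGIN runs.
early=$(echo x | awk -v v=1 'BEGIN {print "BEGIN:" v} {print "main:" v}')
echo "$late"
echo "$early"
```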
    

    Exercises:

    1. The file host_list.log has the format below. Extract the host name that precedes ".magedu.com" on each line and write the result back to the file

    1 www.magedu.com
    2 blog.magedu.com
    3 study.magedu.com
    4 linux.magedu.com
    5 python.magedu.com
    ......
    999 study.magedu.com
    

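    2. Count how often each filesystem type appears in /etc/fstab
    3. Count how often each word appears in /etc/fstab
    4. Extract all digits from the string Yd$C@M05MB%9&Bdh7dq+YVixp3vpw
    5. A file holds 5000 random integers between 1 and 100000, stored as 100,50,35,89… Find the largest and the smallest integer
    6. Production case, mitigating a DoS attack: using the web log or the number of network connections, detect when a single IP reaches 100 concurrent connections or 100 page views within a short period, and block that IP with a firewall command. Check every 5 minutes. The firewall command is: iptables -A INPUT -s IP -j REJECT
    7. Extract the FQDN from each line below, count the occurrences, and sort from highest to lowest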

    http://mail.magedu.com/index.html
    http://www.magedu.com/test.html
    http://study.magedu.com/index.html
    http://blog.magedu.com/index.html
    http://www.magedu.com/images/logo.jpg
    http://blog.magedu.com/20080102.html
    http://www.magedu.com/images/magedu.jpg
    

    Reference answer:

    [root@centos8 ~]#awk -F"/" '{url[$3]++}END{for(i in url){print url[i],i}}' url.log |sort -nr
    3 www.magedu.com
    2 blog.magedu.com
    1 study.magedu.com
    1 mail.magedu.com
    

    8. Using inode as the key in the text below, sum the counts of rows that share an inode, and report the smallest beginnumber and the largest endnumber for each inode

    inode|beginnumber|endnumber|counts|
    106|3363120000|3363129999|10000|
    106|3368560000|3368579999|20000|
    310|3337000000|3337000100|101|
    310|3342950000|3342959999|10000|
    310|3362120960|3362120961|2|
    311|3313460102|3313469999|9898|
    311|3313470000|3313499999|30000|
    311|3362120962|3362120963|2|
    

    The output should look like this:

    310|3337000000|3362120961|10103|
    311|3313460102|3362120963|39900|
    106|3363120000|3368579999|30000|
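    One possible approach to exercise 8, offered as a sketch rather than the reference answer: keep three arrays keyed by inode. The array names sum, lo and hi are made up, and the result is piped through sort because for-in visits keys in an unspecified order, so the row order may differ from the listing above:

```shell
data='inode|beginnumber|endnumber|counts|
106|3363120000|3363129999|10000|
106|3368560000|3368579999|20000|
310|3337000000|3337000100|101|
310|3342950000|3342959999|10000|
310|3362120960|3362120961|2|
311|3313460102|3313469999|9898|
311|3313470000|3313499999|30000|
311|3362120962|3362120963|2|'

summary=$(printf '%s\n' "$data" | awk -F'|' '
    NR == 1 { next }                                   # skip the header row
    {
        if (!($1 in sum) || $2 < lo[$1]) lo[$1] = $2   # smallest beginnumber so far
        if (!($1 in sum) || $3 > hi[$1]) hi[$1] = $3   # largest endnumber so far
        sum[$1] += $4                                  # running total of counts
    }
    END { for (k in sum) printf "%s|%s|%s|%d|\n", k, lo[k], hi[k], sum[k] }
' | sort)
echo "$summary"
```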
    

        Article title: 22. The Three Swordsmen: awk

        Article link: https://www.haomeiwen.com/subject/ekwjortx.html