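1. AWK Basics
1.1 How AWK works and basic usage
AWK is named after its authors Aho, Weinberger, and Kernighan. It is a report generator that produces formatted text output. The AWK shipped with GNU/Linux distributions is developed and maintained by the Free Software Foundation (FSF) and is usually called GNU AWK
There are several versions:
- AWK: the original AWK from AT&T Labs
- NAWK: New AWK, an upgraded version of the AT&T AWK
- GAWK: GNU AWK. GNU/Linux distributions generally ship GAWK, which is compatible with AWK and NAWK
gawk is a pattern scanning and processing language. It can be used for:
- Text processing
- Producing formatted text reports
- Arithmetic operations
- String operations
Format: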
awk [options] 'program' var=value file…
awk [options] -f programfile var=value file…
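Notes:
The program is usually enclosed in single quotes and can consist of three parts:
- BEGIN block
- General pattern-matching block
- END block
Format:
awk options 'BEGIN{BEGIN ACTION} PATTERN{text-processing ACTION} END{END ACTION}' file-path
Common options:
- -F "separator": specifies the input field separator; the default is one or more consecutive whitespace characters
- -v var=value: assigns a variable; works for both built-in and user-defined variables
Program format:
pattern{action statements;..}
pattern: decides when the action statements are triggered, for example BEGIN, END, or a regular expression
If the pattern is omitted, the action applies to every line
action statements: process the data, written inside {}; the most common are print and printf
If the action is omitted, the default is to print the whole line ($0)
Example: omitting the action. When the action is omitted, the expression used as the pattern must evaluate to true (a non-zero value or a non-empty string); otherwise nothing is printed. An empty program prints nothing at all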
[root@demo-c8 ~]# awk '' /etc/fstab
[root@demo-c8 ~]# awk '0' /etc/fstab
[root@demo-c8 ~]# awk '1' /etc/fstab
#
# /etc/fstab
# Created by anaconda on Mon Aug 15 16:52:19 2022
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
UUID=b1ab1ace-2582-4afd-8693-39bd9855041c / xfs defaults 0 0
UUID=d5131695-82b3-4a23-bc28-5c8a4bf381a0 /boot ext4 defaults 1 2
UUID=bdd66510-e510-4fe7-ba71-e2a35e6dc492 /data xfs defaults 0 0
UUID=05c944fb-d6f9-4544-ba10-8b7bf3cc8fed swap swap defaults 0 0
[root@demo-c8 ~]# awk '"hello"' /etc/fstab
#
# /etc/fstab
# Created by anaconda on Mon Aug 15 16:52:19 2022
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
UUID=b1ab1ace-2582-4afd-8693-39bd9855041c / xfs defaults 0 0
UUID=d5131695-82b3-4a23-bc28-5c8a4bf381a0 /boot ext4 defaults 1 2
UUID=bdd66510-e510-4fe7-ba71-e2a35e6dc492 /data xfs defaults 0 0
UUID=05c944fb-d6f9-4544-ba10-8b7bf3cc8fed swap swap defaults 0 0
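How awk runs:
Step 1: execute the statements in the BEGIN{action;… } block
Step 2: read one line from the file or from standard input (stdin) and execute the pattern{ action;… } block; repeat this line by line, from the first line to the last, until the whole input has been read
Step 3: when the end of the input stream is reached, execute the END{action;…} block
The BEGIN block runs before awk reads any line from the input stream. It is optional; statements such as variable initialization or printing a table header usually go here
The END block runs after awk has read all lines from the input stream. Summaries such as printing the analysis result for all lines are done here. It is also optional
The general pattern block is the most important part, and it is optional too. It is executed for every line awk reads. If a pattern is given without an action, the default action { print } runs, which prints each matching line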
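The three phases described above can be sketched on a small input. The names and scores are made up for illustration; the point is only to show when each block runs.

```shell
# Sample input is invented for this sketch: a name and a score per line.
report=$(printf 'alice 90\nbob 75\ncarol 82\n' | awk '
BEGIN { print "NAME SCORE" }          # runs once, before any input is read
      { total += $2; print $1, $2 }   # runs for every input record
END   { print "TOTAL", total }        # runs once, after the last record
')
printf '%s\n' "$report"
```

The header comes from BEGIN, one line per record comes from the main block, and the total is only known in END.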
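Separators, fields, and records:
awk treats the file or standard input it reads as a table. By default, \n separates two rows, but the row separator can be customized. For example, with ; as the separator, the text before a ; is one row and the text after it is another
- Fields (columns) split by the separator are referred to as $1,$2...$n, called field identifiers; $0 is the whole record. Note: this $ does not mean the same as the $ of shell variables
$1: first column
$2: second column
...
$0: all columns
- Each line of the file is called a record
- If the action is omitted, print $0 runs by default, that is, the whole record is printed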
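As a minimal sketch of how one record splits into fields, the following uses a single passwd-style line (invented here, not read from the system) with ":" as the field separator. NF, the field count, is one of awk's built-in variables.

```shell
# The input line is invented for this sketch.
line='root:x:0:0:root:/root:/bin/bash'
first=$(echo "$line" | awk -F: '{ print $1 }')    # first field
last=$(echo "$line" | awk -F: '{ print $NF }')    # last field
count=$(echo "$line" | awk -F: '{ print NF }')    # number of fields
whole=$(echo "$line" | awk -F: '{ print $0 }')    # the whole record, unchanged
echo "$first $last $count"
```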
Common kinds of action:
- Output statements: print, printf
- Expressions: arithmetic, comparison, and so on
- Compound statements
- Control statements: if, while, and so on
- Input statements
awk control statements:
- { statements;… } compound statement
- if(condition) {statements;…}
- if(condition) {statements;…} else {statements;…}
- while(condition) {statements;…}
- do {statements;…} while(condition)
- for(expr1;expr2;expr3) {statements;…}
- break
- continue
- exit
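A short sketch of several of the control statements listed above, run on the numbers 1 to 10 from seq. The variable names (even, odd, s, n) are chosen only for this illustration.

```shell
result=$(seq 1 10 | awk '
{
    # if / else on each record: sum even and odd numbers separately
    if ($1 % 2 == 0) { even += $1 } else { odd += $1 }
}
END {
    for (i = 1; i <= 3; i++)     # for loop: builds the string "123"
        s = s i
    n = 5
    while (n > 0) {              # while loop that stops early with break
        if (n == 2) break
        n--
    }
    print even, odd, s, n
}')
echo "$result"
```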
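1.2 The print action
Format:
print item1, item2, ...
Notes:
- Items are separated by commas
- An item can be a string or a number: a field of the current record, a variable, or an awk expression
- If no item is given, print is equivalent to print $0
- Literal text must be enclosed in "", while variables and numbers need no quotes
abc: a variable
"abc": a plain string
Example: with no items, print prints whatever awk receives on standard input, that is, the whole line, $0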
root@u18:~# awk '{print}'
aa
aa
bb
bb
cc
cc
root@u18:~# cat /etc/fstab | awk '{print}'
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/sda1 during installation
UUID=f906b5aa-3e5b-4e12-8f11-55f55c41e1b0 / ext4 errors=remount-ro 0 1
# /boot was on /dev/sda2 during installation
UUID=24239793-a342-4d7f-8773-e7381727a5dd /boot ext4 defaults 0 2
# /data was on /dev/sda4 during installation
UUID=a328accf-9575-4343-976b-751c27cdb8ec /data ext4 defaults 0 2
# swap was on /dev/sda5 during installation
UUID=0f94202e-4796-4835-b329-75425a807dcd none swap sw 0 0
root@u18:~# awk '{print}' < /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/sda1 during installation
UUID=f906b5aa-3e5b-4e12-8f11-55f55c41e1b0 / ext4 errors=remount-ro 0 1
# /boot was on /dev/sda2 during installation
UUID=24239793-a342-4d7f-8773-e7381727a5dd /boot ext4 defaults 0 2
# /data was on /dev/sda4 during installation
UUID=a328accf-9575-4343-976b-751c27cdb8ec /data ext4 defaults 0 2
# swap was on /dev/sda5 during installation
UUID=0f94202e-4796-4835-b329-75425a807dcd none swap sw 0 0
Example: awk can print the entire contents of a file directly
root@u18:~# awk '{print}' /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/sda1 during installation
UUID=f906b5aa-3e5b-4e12-8f11-55f55c41e1b0 / ext4 errors=remount-ro 0 1
# /boot was on /dev/sda2 during installation
UUID=24239793-a342-4d7f-8773-e7381727a5dd /boot ext4 defaults 0 2
# /data was on /dev/sda4 during installation
UUID=a328accf-9575-4343-976b-751c27cdb8ec /data ext4 defaults 0 2
# swap was on /dev/sda5 during installation
UUID=0f94202e-4796-4835-b329-75425a807dcd none swap sw 0 0
Example: printing a fixed string. print followed by a string literal prints that fixed text once for every input line
root@u18:~# awk '{print "hello awk"}'
aaa
hello awk
vvv
hello awk
ccc
hello awk
# seq 10 supplies 10 input lines, so awk prints the fixed string 10 times
root@u18:~# seq 10 | awk '{print "hello awk"}'
hello awk
hello awk
hello awk
hello awk
hello awk
hello awk
hello awk
hello awk
hello awk
hello awk
Example: the field separator defaults to whitespace. Runs of consecutive whitespace are treated as a single separator
root@u18:~# df | awk '{print $5}'
Use%
0%
1%
3%
0%
0%
0%
9%
1%
0%
root@u18:~# df | awk '{print $5 }' | awk -F'%' '{print $1}'
Use
0
1
3
0
0
0
9
1
0
Example: custom field separator with the -F option
root@u18:~# awk -F":" '{print $1,$3}' /etc/passwd
root 0
daemon 1
bin 2
sys 3
sync 4
games 5
...
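Example: after splitting, print separates the output columns with a space by default. A different separator can be specified
# Inside print, the new separator must be enclosed in double quotes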
root@u18:~# awk -F':' '{print $1":"$3}' /etc/passwd
root:0
daemon:1
bin:2
sys:3
sync:4
games:5
...
Example: using a tab as the output separator
root@u18:~# awk -F: '{print $1"\t"$3}' /etc/passwd
root 0
daemon 1
bin 2
sys 3
sync 4
games 5
man 6
lp 7
mail 8
news 9
...
Example: find the 5 IP addresses with the most requests to a website
root@u18:~# sed -nr '1,5p' access_log
172.18.118.91 - - [20/May/2018:08:09:59 +0800] "GET / HTTP/1.1" 200 912 "-" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 5.1; Trident/5.0)"
172.18.118.91 - - [20/May/2018:08:09:59 +0800] "POST /webnoauth/model.cgi HTTP/1.1" 404 293 "http://172.18.0.1/webnoauth/model.cgi" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 5.1; Trident/5.0)"
172.18.118.91 - - [20/May/2018:08:09:59 +0800] "GET /router/get_rand_key.cgi HTTP/1.1" 404 297 "http://172.18.0.1/router/get_rand_key.cgi" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 5.1; Trident/5.0)"
172.18.118.91 - - [20/May/2018:08:09:59 +0800] "GET /router/get_rand_key.cgi HTTP/1.1" 404 297 "http://172.18.0.1/router/get_rand_key.cgi" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 5.1; Trident/5.0)"
172.18.118.91 - - [20/May/2018:08:09:59 +0800] "GET /router/get_rand_key.cgi HTTP/1.1" 404 297 "http://172.18.0.1/router/get_rand_key.cgi" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 5.1; Trident/5.0)"
root@u18:~# awk '{print $1}' access_log | sort | uniq -c | sort -k1 -nr | head -n5
4870 172.20.116.228
3429 172.20.116.208
2834 172.20.0.222
2613 172.20.112.14
2267 172.20.0.227
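Example: several separators can be specified when splitting text. Anything delimited by any of them becomes its own column, which avoids processing the text several times just because the separators differ
# [[:space:]]+|%: an extended regular expression; one or more whitespace characters, or a %, act as the separator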
root@u18:~# df | awk -F"[[:space:]]+|%" '{print $5}'
Use
0
1
3
0
0
0
9
1
0
Example: the file host_list.org has the format below. Extract the host name in front of ".xxx.com" and append it to the same file
1 www.xxx.com
2 blog.xxx.com
3 study.xxx.com
4 linux.xxx.com
5 python.xxx.com
root@u18:/opt# awk -F"[ .]" '{print $2}' host_list.org
www
blog
study
linux
python
root@u18:/opt# awk -F"[ .]" '{print $2}' host_list.org >> host_list.org
root@u18:/opt# cat host_list.org
1 www.xxx.com
2 blog.xxx.com
3 study.xxx.com
4 linux.xxx.com
5 python.xxx.com
www
blog
study
linux
python
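Reading a file and appending to it in the same command works for a small file like this one, but it depends on awk finishing its read before the output is flushed. A more cautious variant, sketched below with a throwaway directory and a temporary file (both hypothetical), writes the result to a separate file first and appends it afterwards.

```shell
# Hypothetical working directory and file, created only for this sketch.
workdir=$(mktemp -d)
printf '1 www.xxx.com\n2 blog.xxx.com\n3 study.xxx.com\n' > "$workdir/host_list.org"

# Extract the host names into a temporary file first, then append them.
# This avoids reading and writing the same file within one command.
awk -F'[ .]' '{ print $2 }' "$workdir/host_list.org" > "$workdir/hosts.tmp"
cat "$workdir/hosts.tmp" >> "$workdir/host_list.org"
rm -f "$workdir/hosts.tmp"

cat "$workdir/host_list.org"
```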
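1.3 AWK variables
awk variables are either built-in or user-defined
Variable definition format:
-v var=value assigns a variable; works for both built-in and user-defined variables
Example of setting an awk variable:
-v FS=":"
Inside the program, an awk variable is referenced without $: just write FS
Common built-in variables:
- FS (Field Separator): input field separator. The default is a single space, which makes awk split on runs of whitespace. It does the same job as -F, so avoid using -F and FS at the same time since the two settings conflict
Example: setting FS to ":"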
root@u18:/opt# awk -v FS=":" '{print $1,$2}' /etc/passwd
root x
daemon x
bin x
sys x
sync x
games x
...
Example: the input separator variable FS can also be used as the separator in the output
root@u18:/opt# awk -v FS=":" '{print $1FS$2}' /etc/passwd
root:x
daemon:x
bin:x
sys:x
sync:x
games:x
...
Example: awk can take a value from a shell variable
root@u18:/opt# var=":"; awk -v FS="$var" '{print $1FS$2}' /etc/passwd
root:x
daemon:x
bin:x
sys:x
sync:x
games:x
...
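A small sketch of passing a shell variable into awk with -v, using an invented input line. Quoting the expansion as "$sep" keeps the shell from word-splitting or globbing the value before awk sees it; the variable name sep is an assumption made for this example.

```shell
sep=":"     # hypothetical shell variable holding the separator
pair=$(echo 'root:x:0:0' | awk -v FS="$sep" -v OFS="-" '{ print $1, $3 }')
echo "$pair"
```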
- OFS: output field separator, a space by default
Example: setting OFS to "======"
root@u18:/opt# awk -F":" -v OFS="======" '{print $1,$2}' /etc/passwd
root======x
daemon======x
bin======x
sys======x
sync======x
games======x
...
root@u18:/opt# awk -v FS=":" -v OFS="======" '{print $1, $2}' /etc/passwd
root======x
daemon======x
bin======x
sys======x
...
- RS: input record separator, the character that ends a record on input. The default is \n
Example: using ";" as the record separator
root@u18:/opt# vim f1.txt
a,b,c;11,22
33,44;xx,yy,zz
m,n;xxx
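# With ; as the record separator there are four records: "a,b,c", "11,22 33,44", "xx,yy,zz m,n", and "xxx"
# The newlines that follow 22, zz, and xxx in the text are kept inside the records, which is why the output breaks there again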
root@u18:/opt# awk -v RS=";" '{print $0}' f1.txt
a,b,c
11,22 # the newline after 22 in the text is kept
33,44
xx,yy,zz
m,n
xxx # there is a newline after xxx
root@u18:/opt#
- NR: number of the current record
Example: NR shows the record number, which helps tell awk records apart from line breaks in the text itself
root@u18:/opt# awk -v RS=";" '{print NR,$0}' f1.txt
1 a,b,c # record1
2 11,22 # record 2
33,44
3 xx,yy,zz # record3
m,n
4 xxx # record4
root@u18:/opt#
- ORS: output record separator, written after each record in place of a newline. The default is a newline
Example: using "+++" as the output record separator
root@u18:/opt# awk -v RS=";" -v ORS="+++" '{print NR,$0}' f1.txt
1 a,b,c+++2 11,22
33,44+++3 xx,yy,zz
m,n+++4 xxx
+++root@u18:/opt#
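RS and ORS can also be tried on inline input, which makes the effect easier to see than with a file. The strings below are invented for this sketch.

```shell
# RS=";" splits the input into records at each semicolon.
records=$(printf 'x;y;z' | awk -v RS=";" '{ print NR ": " $0 }')
printf '%s\n' "$records"

# ORS="," is written after every record instead of a newline.
joined=$(printf 'a\nb\nc\n' | awk -v ORS="," '{ print $0 }')
printf '%s\n' "$joined"
```

Because ORS follows every record, the joined line ends with a trailing comma, the same way the "+++" example ends with a trailing separator.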
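- NF: number of fields
Example: NF is the number of fields in each record after splitting on the input field separator, so one value is printed per line of input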
root@u18:/opt# awk -F":" '{print NF}' /etc/passwd
7
7
7
7
7
7
7
7
7
7
7
7
7
...
$NF: the last field
root@u18:/opt# awk -F":" '{print $NF}' /etc/passwd
/bin/bash
/usr/sbin/nologin
/usr/sbin/nologin
/usr/sbin/nologin
...
$(NF-1): the second-to-last field
root@u18:/opt# cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
...
root@u18:/opt# awk -F":" '{print $(NF-1)}' /etc/passwd
/root
/usr/sbin
/bin
/dev
/bin
/usr/games
/var/cache/man
/var/spool/lpd
/var/mail
/var/spool/news
/var/spool/uucp
/bin
...
Example: extracting the version from source package names
root@u18:/opt# tar cvf app.v1.tar.gz /etc
root@u18:/opt# tar cvf app.v2.tar.gz /etc
root@u18:/opt# tar cvf app.v3.tar.gz /etc
root@u18:/opt# tar cvf app.v4.tar.gz /etc
root@u18:/opt# tar cvf app.v5.tar.gz /etc
root@u18:/opt# ll app*
-rw-r--r-- 1 root root 3225600 Sep 17 22:06 app.v1.tar.gz
-rw-r--r-- 1 root root 3225600 Sep 17 22:09 app.v2.tar.gz
-rw-r--r-- 1 root root 3225600 Sep 17 22:09 app.v3.tar.gz
-rw-r--r-- 1 root root 3225600 Sep 17 22:09 app.v4.tar.gz
-rw-r--r-- 1 root root 3225600 Sep 17 22:09 app.v5.tar.gz
# With "." as the input separator, the third field from the end is the version
root@u18:/opt# ls app.* | xargs -n1 | awk -F"." '{print $(NF-2)}'
v1
v2
v3
v4
v5
# xargs is not actually needed here: ls shows names side by side on a terminal, but prints one name per line when its output goes to a pipe
root@u18:/opt# ls app.* | awk -F"." '{print $(NF-2)}'
v1
v2
v3
v4
v5
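- FNR: record number counted separately for each file. NR, by contrast, keeps counting across all input files
Example: numbering the records of each file separately with FNR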
root@u18:/opt# awk -F":" '{print FNR,$1}' /etc/passwd /etc/group
1 root
2 daemon
3 bin
4 sys
5 sync
6 games
7 man
8 lp
9 mail
10 news
11 uucp
12 proxy
13 www-data
14 backup
15 list
16 irc
17 gnats
18 nobody
19 systemd-network
20 systemd-resolve
21 syslog
22 messagebus
23 _apt
24 lxd
25 uuidd
26 dnsmasq
27 landscape
28 sshd
29 pollinate
30 david
1 root
2 daemon
3 bin
4 sys
5 adm
6 tty
7 disk
8 lp
9 mail
10 news
11 uucp
12 man
13 proxy
14 kmem
15 dialout
16 fax
17 voice
18 cdrom
19 floppy
20 tape
21 sudo
22 audio
23 dip
24 www-data
25 backup
26 operator
27 list
28 irc
29 src
30 gnats
31 shadow
32 utmp
33 video
34 sasl
35 plugdev
36 staff
37 games
38 users
39 nogroup
40 systemd-journal
41 systemd-network
42 systemd-resolve
43 input
44 crontab
45 syslog
46 messagebus
47 lxd
48 mlocate
49 uuidd
50 ssh
51 landscape
52 david
53 lpadmin
54 sambashare
- FILENAME: name of the current file. Used with FNR, it shows which file each record number belongs to
root@u18:/opt# awk -F":" '{print FNR,$1,FILENAME}' /etc/passwd /etc/group
1 root /etc/passwd
2 daemon /etc/passwd
3 bin /etc/passwd
4 sys /etc/passwd
5 sync /etc/passwd
6 games /etc/passwd
7 man /etc/passwd
8 lp /etc/passwd
9 mail /etc/passwd
10 news /etc/passwd
11 uucp /etc/passwd
12 proxy /etc/passwd
13 www-data /etc/passwd
14 backup /etc/passwd
15 list /etc/passwd
16 irc /etc/passwd
17 gnats /etc/passwd
18 nobody /etc/passwd
19 systemd-network /etc/passwd
20 systemd-resolve /etc/passwd
21 syslog /etc/passwd
22 messagebus /etc/passwd
23 _apt /etc/passwd
24 lxd /etc/passwd
25 uuidd /etc/passwd
26 dnsmasq /etc/passwd
27 landscape /etc/passwd
28 sshd /etc/passwd
29 pollinate /etc/passwd
30 david /etc/passwd
1 root /etc/group
2 daemon /etc/group
3 bin /etc/group
4 sys /etc/group
5 adm /etc/group
6 tty /etc/group
7 disk /etc/group
8 lp /etc/group
9 mail /etc/group
10 news /etc/group
11 uucp /etc/group
12 man /etc/group
13 proxy /etc/group
14 kmem /etc/group
15 dialout /etc/group
16 fax /etc/group
17 voice /etc/group
18 cdrom /etc/group
19 floppy /etc/group
20 tape /etc/group
21 sudo /etc/group
22 audio /etc/group
23 dip /etc/group
24 www-data /etc/group
25 backup /etc/group
26 operator /etc/group
27 list /etc/group
28 irc /etc/group
29 src /etc/group
30 gnats /etc/group
31 shadow /etc/group
32 utmp /etc/group
33 video /etc/group
34 sasl /etc/group
35 plugdev /etc/group
36 staff /etc/group
37 games /etc/group
38 users /etc/group
39 nogroup /etc/group
40 systemd-journal /etc/group
41 systemd-network /etc/group
42 systemd-resolve /etc/group
43 input /etc/group
44 crontab /etc/group
45 syslog /etc/group
46 messagebus /etc/group
47 lxd /etc/group
48 mlocate /etc/group
49 uuidd /etc/group
50 ssh /etc/group
51 landscape /etc/group
52 david /etc/group
53 lpadmin /etc/group
54 sambashare /etc/group
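The difference between NR and FNR is easier to see on two tiny files. The sketch below creates two throwaway files (names and contents are made up) and prints both counters side by side. It also shows the common NR == FNR test, which is true only while the first file is being read.

```shell
# Two small throwaway files; their names and contents are made up.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/f1"
printf 'c\nd\ne\n' > "$dir/f2"

# NR keeps counting across files, FNR restarts at 1 for each file.
numbering=$(cd "$dir" && awk '{ print NR, FNR, FILENAME, $0 }' f1 f2)
printf '%s\n' "$numbering"

# NR == FNR holds only for records of the first file.
first_only=$(cd "$dir" && awk 'NR == FNR { print $0 }' f1 f2)
printf '%s\n' "$first_only"
```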
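- ARGC: number of command-line arguments
Example: ARGC is the number of command-line arguments. awk itself is the first, /etc/passwd the second, and /etc/group the third; options and the program text are not counted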
root@u18:/opt# awk -F":" '{print ARGC,FNR,$1,FILENAME}' /etc/passwd /etc/group
3 1 root /etc/passwd
3 2 daemon /etc/passwd
3 3 bin /etc/passwd
3 4 sys /etc/passwd
3 5 sync /etc/passwd
3 6 games /etc/passwd
3 7 man /etc/passwd
3 8 lp /etc/passwd
3 9 mail /etc/passwd
3 10 news /etc/passwd
3 11 uucp /etc/passwd
3 12 proxy /etc/passwd
3 13 www-data /etc/passwd
3 14 backup /etc/passwd
3 15 list /etc/passwd
3 16 irc /etc/passwd
3 17 gnats /etc/passwd
3 18 nobody /etc/passwd
3 19 systemd-network /etc/passwd
3 20 systemd-resolve /etc/passwd
3 21 syslog /etc/passwd
3 22 messagebus /etc/passwd
3 23 _apt /etc/passwd
3 24 lxd /etc/passwd
3 25 uuidd /etc/passwd
3 26 dnsmasq /etc/passwd
3 27 landscape /etc/passwd
3 28 sshd /etc/passwd
3 29 pollinate /etc/passwd
3 30 david /etc/passwd
3 1 root /etc/group
3 2 daemon /etc/group
3 3 bin /etc/group
3 4 sys /etc/group
3 5 adm /etc/group
3 6 tty /etc/group
3 7 disk /etc/group
3 8 lp /etc/group
3 9 mail /etc/group
3 10 news /etc/group
3 11 uucp /etc/group
3 12 man /etc/group
3 13 proxy /etc/group
3 14 kmem /etc/group
3 15 dialout /etc/group
3 16 fax /etc/group
3 17 voice /etc/group
3 18 cdrom /etc/group
3 19 floppy /etc/group
3 20 tape /etc/group
3 21 sudo /etc/group
3 22 audio /etc/group
3 23 dip /etc/group
3 24 www-data /etc/group
3 25 backup /etc/group
3 26 operator /etc/group
3 27 list /etc/group
3 28 irc /etc/group
3 29 src /etc/group
3 30 gnats /etc/group
3 31 shadow /etc/group
3 32 utmp /etc/group
3 33 video /etc/group
3 34 sasl /etc/group
3 35 plugdev /etc/group
3 36 staff /etc/group
3 37 games /etc/group
3 38 users /etc/group
3 39 nogroup /etc/group
3 40 systemd-journal /etc/group
3 41 systemd-network /etc/group
3 42 systemd-resolve /etc/group
3 43 input /etc/group
3 44 crontab /etc/group
3 45 syslog /etc/group
3 46 messagebus /etc/group
3 47 lxd /etc/group
3 48 mlocate /etc/group
3 49 uuidd /etc/group
3 50 ssh /etc/group
3 51 landscape /etc/group
3 52 david /etc/group
3 53 lpadmin /etc/group
3 54 sambashare /etc/group
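- ARGV: array holding the command-line arguments; a single argument is read as ARGV[0], ARGV[1], and so on. ARGV[0] is the awk command itself, and an index beyond the last argument gives an empty string
Example: ARGV returns the n-th argument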
root@u18:/opt# awk -F":" '{print ARGV[0]}' /etc/passwd /etc/group
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
awk
root@u18:/opt# awk -F":" '{print ARGV[1]}' /etc/passwd /etc/group
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
/etc/passwd
root@u18:/opt# awk -F":" '{print ARGV[2]}' /etc/passwd /etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
/etc/group
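Since ARGC and ARGV do not change from record to record, they can be printed once from a BEGIN block instead of once per input line. A minimal sketch, assuming two empty placeholder files named a.txt and b.txt:

```shell
# Placeholder files; the names are assumptions made for this sketch.
dir=$(mktemp -d)
: > "$dir/a.txt"
: > "$dir/b.txt"

# A BEGIN-only program runs before any input is read, so each argument is printed once.
args=$(cd "$dir" && awk -F: 'BEGIN {
    print ARGC
    for (i = 1; i < ARGC; i++) print i, ARGV[i]
}' a.txt b.txt)
printf '%s\n' "$args"
```

The loop starts at 1 to skip ARGV[0], the command name itself.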
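User-defined variables (case-sensitive) can be defined in two ways:
-v var=value
directly inside the program
Example: printing a variable once before any input is processed, useful for table headers and the like. Statements inside BEGIN{} run once before the text is processed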
root@u18:/opt# awk 'BEGIN{test="hello,world"; print test}'
hello,world
root@u18:/opt# awk -v test="hello,awk" 'BEGIN{print test}'
hello,awk
root@u18:/opt# awk -v test="hello,awk" 'BEGIN{print test; test="hello,world";print test}'
hello,awk
hello,world
Example: chained assignment works inside the awk program, but not with -v on the command line
root@u18:/opt# awk 'BEGIN{test1=test2="hello,world"; print test1; print test2; print "hello,awk"}'
hello,world
hello,world
hello,awk
Example: a single print can output several items
root@u18:/opt# awk 'BEGIN{test1=test2="hello,world"; print test1,test2,"hello,awk"}'
hello,world hello,world hello,awk
Example: a variable that is referenced before it is assigned is empty
root@u18:/opt# awk -F":" '{sex="male";print $1,sex,age;age=18}' /etc/passwd
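root male # age is empty on the first record: when print runs, age has not been assigned yet, so it prints as an empty string
daemon male 18 # from the second record on age has a value, because age=18 ran at the end of processing the first record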
bin male 18
sys male 18
sync male 18
games male 18
man male 18
lp male 18
mail male 18
news male 18
uucp male 18
proxy male 18
www-data male 18
backup male 18
list male 18
irc male 18
gnats male 18
nobody male 18
systemd-network male 18
systemd-resolve male 18
syslog male 18
messagebus male 18
_apt male 18
lxd male 18
uuidd male 18
dnsmasq male 18
landscape male 18
sshd male 18
pollinate male 18
david male 18
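The behavior of an unassigned variable can be checked in isolation. In the sketch below, undefined_var is a name invented for the example and is never assigned.

```shell
out=$(awk 'BEGIN {
    print "[" undefined_var "]"      # used as a string, it is empty
    print undefined_var + 0          # used as a number, it is 0
    print length(undefined_var)      # length of the empty string
}')
printf '%s\n' "$out"
```

No error is raised in any of the three cases, which is why a misspelled variable name in awk tends to show up as a blank or a zero rather than a failure.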
Example: the program can be stored in a file and loaded with the -f option
root@u18:/opt# vim awk.txt
{print $(NF-1)}
root@u18:/opt# awk -F":" -f awk.txt /etc/passwd
/root
/usr/sbin
/bin
/dev
/bin
/usr/games
/var/cache/man
/var/spool/lpd
/var/mail
/var/spool/news
/var/spool/uucp
/bin
/var/www
/var/backups
/var/list
/var/run/ircd
/var/lib/gnats
/nonexistent
/run/systemd/netif
/run/systemd/resolve
/home/syslog
/nonexistent
/nonexistent
/var/lib/lxd/
/run/uuidd
/var/lib/misc
/var/lib/landscape
/run/sshd
/var/cache/pollinate
/home/david
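1.4 The printf action
printf produces formatted output
Format:
# written inside the 'program'
printf "FORMAT", item1, item2,...
Notes:
- FORMAT is required
- No newline is added automatically; the newline \n must be given explicitly
- FORMAT needs one format specifier for each item that follows
Format specifiers, one per item:
%c: prints a single character (a numeric argument is printed as the character with that code)
%d, %i: decimal integer
%e, %E: number in scientific notation
%f: floating-point number
%g, %G: number in scientific or floating-point notation
%s: string
%u: unsigned integer
%%: a literal %
Modifiers:
#[.#] the first number sets the display width; the second sets the precision after the decimal point, e.g. %3.1f
- left-align (the default is right-align), e.g. %-15s
+ show the sign of a number, e.g. %+d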
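The specifiers and modifiers can be tried without any input file by calling printf from a BEGIN block. The values below are arbitrary, and the square brackets are only there to make the padding visible.

```shell
formatted=$(awk 'BEGIN {
    printf "[%5.1f]\n", 3.14159     # width 5, one digit after the decimal point
    printf "[%-8s]\n", "awk"        # left-aligned in 8 columns
    printf "[%8s]\n", "awk"         # right-aligned, the default
    printf "[%+d]\n", 42            # explicit sign
    printf "[%c]\n", 65             # character with code 65
    printf "[%d%%]\n", 90           # literal percent sign
}')
printf '%s\n' "$formatted"
```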
Example:
awk -F: '{printf "%s",$1}' /etc/passwd
awk -F: '{printf "%s\n",$1}' /etc/passwd
awk -F: '{printf "%20s\n",$1}' /etc/passwd
awk -F: '{printf "%-20s\n",$1}' /etc/passwd
awk -F: '{printf "%-20s %10d\n",$1,$3}' /etc/passwd
awk -F: '{printf "Username: %s\n",$1}' /etc/passwd
awk -F: '{printf "Username: %sUID:%d\n",$1,$3}' /etc/passwd
awk -F: '{printf "Username: %25sUID:%d\n",$1,$3}' /etc/passwd
awk -F: '{printf "Username: %-25sUID:%d\n",$1,$3}' /etc/passwd
- %s stands for the content of $1
[19:20:08 root@centos7 ~]#awk -F: '{printf "%s", $1}' /etc/passwd
rootbindaemonadmlpsyncshutdownhaltmailoperatorgamesftpnobodysystemd-networkdbuspolkitdsshdpostfixtcpdump[19:20:38 root@centos7 ~]#
- The newline \n must be given explicitly
[19:20:03 root@centos7 ~]#awk -F: '{printf "%s\n", $1}' /etc/passwd
root
bin
daemon
adm
lp
sync
shutdown
halt
mail
operator
games
ftp
nobody
systemd-network
dbus
polkitd
sshd
postfix
tcpdump
[19:21:16 root@centos7 ~]#awk -F":" '{printf "%20s\n",$1}' /etc/passwd
root
bin
daemon
adm
lp
sync
shutdown
halt
mail
operator
games
ftp
nobody
systemd-network
dbus
polkitd
sshd
postfix
tcpdump
[19:22:15 root@centos7 ~]#awk -F":" '{printf "%-20s\n",$1}' /etc/passwd
root
bin
daemon
adm
lp
sync
shutdown
halt
mail
operator
games
ftp
nobody
systemd-network
dbus
polkitd
sshd
postfix
tcpdump
[19:22:58 root@centos7 ~]#awk -F":" '{printf "Username: %s\n",$1}' /etc/passwd
Username: root
Username: bin
Username: daemon
Username: adm
Username: lp
Username: sync
Username: shutdown
Username: halt
Username: mail
Username: operator
Username: games
Username: ftp
Username: nobody
Username: systemd-network
Username: dbus
Username: polkitd
Username: sshd
Username: postfix
Username: tcpdump
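- To print several columns, give each one its own format specifier and put a separator between them in FORMAT; otherwise the columns run together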
[19:25:32 root@centos7 ~]#awk -F":" '{printf "Username: %s | UID: %d\n", $1, $3}' /etc/passwd
Username: root | UID: 0
Username: bin | UID: 1
Username: daemon | UID: 2
Username: adm | UID: 3
Username: lp | UID: 4
Username: sync | UID: 5
Username: shutdown | UID: 6
Username: halt | UID: 7
Username: mail | UID: 8
Username: operator | UID: 11
Username: games | UID: 12
Username: ftp | UID: 14
Username: nobody | UID: 99
Username: systemd-network | UID: 192
Username: dbus | UID: 81
Username: polkitd | UID: 999
Username: sshd | UID: 74
Username: postfix | UID: 89
Username: tcpdump | UID: 72
[19:26:34 root@centos7 ~]#awk -F":" '{printf "Username: %-25s UID: %d\n", $1, $3}' /etc/passwd
Username: root UID: 0
Username: bin UID: 1
Username: daemon UID: 2
Username: adm UID: 3
Username: lp UID: 4
Username: sync UID: 5
Username: shutdown UID: 6
Username: halt UID: 7
Username: mail UID: 8
Username: operator UID: 11
Username: games UID: 12
Username: ftp UID: 14
Username: nobody UID: 99
Username: systemd-network UID: 192
Username: dbus UID: 81
Username: polkitd UID: 999
Username: sshd UID: 74
Username: postfix UID: 89
Username: tcpdump UID: 72
[root@demo-c8 ~]# awk -F":" '{printf "%-20s %s\n", $1,$3 }' /etc/passwd
root 0
bin 1
daemon 2
adm 3
lp 4
sync 5
shutdown 6
halt 7
mail 8
operator 11
games 12
ftp 14
nobody 65534
dbus 81
systemd-coredump 999
systemd-resolve 193
tss 59
polkitd 998
geoclue 997
rtkit 172
pulse 171
libstoragemgmt 996
qemu 107
usbmuxd 113
unbound 995
rpc 32
gluster 994
chrony 993
setroubleshoot 992
pipewire 991
saslauth 990
dnsmasq 984
radvd 75
clevis 983
cockpit-ws 982
cockpit-wsinstance 981
sssd 980
flatpak 979
colord 978
gdm 42
rpcuser 29
gnome-initial-setup 977
sshd 74
avahi 70
rngd 976
tcpdump 72
wang 1000
- Printing a title to build a table
[root@demo-c8 ~]# awk -F":" 'BEGIN{print "|---------------------------|\n| Username&UID |\n-----------------------------"}{printf "|%-20s|%-6s|\n-----------------------------\n", $1,$3 }' /etc/passwd
|---------------------------|
| Username&UID |
-----------------------------
|root |0 |
-----------------------------
|bin |1 |
-----------------------------
|daemon |2 |
-----------------------------
|adm |3 |
-----------------------------
|lp |4 |
-----------------------------
|sync |5 |
-----------------------------
|shutdown |6 |
-----------------------------
|halt |7 |
-----------------------------
|mail |8 |
-----------------------------
|operator |11 |
-----------------------------
|games |12 |
-----------------------------
|ftp |14 |
-----------------------------
|nobody |65534 |
-----------------------------
|dbus |81 |
-----------------------------
|systemd-coredump |999 |
-----------------------------
|systemd-resolve |193 |
-----------------------------
|tss |59 |
-----------------------------
|polkitd |998 |
-----------------------------
|geoclue |997 |
-----------------------------
|rtkit |172 |
-----------------------------
|pulse |171 |
-----------------------------
|libstoragemgmt |996 |
-----------------------------
|qemu |107 |
-----------------------------
|usbmuxd |113 |
-----------------------------
|unbound |995 |
-----------------------------
|rpc |32 |
-----------------------------
|gluster |994 |
-----------------------------
|chrony |993 |
-----------------------------
|setroubleshoot |992 |
-----------------------------
|pipewire |991 |
-----------------------------
|saslauth |990 |
-----------------------------
|dnsmasq |984 |
-----------------------------
|radvd |75 |
-----------------------------
|clevis |983 |
-----------------------------
|cockpit-ws |982 |
-----------------------------
|cockpit-wsinstance |981 |
-----------------------------
|sssd |980 |
-----------------------------
|flatpak |979 |
-----------------------------
|colord |978 |
-----------------------------
|gdm |42 |
-----------------------------
|rpcuser |29 |
-----------------------------
|gnome-initial-setup |977 |
-----------------------------
|sshd |74 |
-----------------------------
|avahi |70 |
-----------------------------
|rngd |976 |
-----------------------------
|tcpdump |72 |
-----------------------------
|wang |1000 |
-----------------------------
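1.5 Operators
Arithmetic operators:
x+y, x-y, x*y, x/y, x^y, x%y
-x: negation
+x: converts a string to a number
String operator: concatenation, which has no symbol; strings written next to each other are joined
Assignment operators:
=, +=, -=, *=, /=, %=, ^=, ++, --
Example: increment
[19:29:30 root@centos7 ~]#awk 'BEGIN{i=0;print i++,i}' # print i first, then increment, then print i again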
0 1
[19:29:49 root@centos7 ~]#awk 'BEGIN{i=0;print ++i,i}' # increment first and print i, then print i again
1 1
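A few of the arithmetic, string, and assignment operators can be sketched together in one BEGIN block. The values are arbitrary, and the string "3 apples" is invented to show how a leading number is used when a string is converted.

```shell
ops=$(awk 'BEGIN {
    x = 7; y = 2
    print x + y, x - y, x * y, x % y, x ^ y   # basic arithmetic
    print x / y                               # division is not truncated
    s = "3 apples"
    print -s, s + 1                           # only the leading digits count
    print "a" "b"                             # concatenation needs no operator
    n = 1; n += 4; n *= 2
    print n
}')
printf '%s\n' "$ops"
```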
Comparison operators:
==, !=, >, >=, <, <=
Example: print the second record of /etc/issue
[19:36:17 root@centos7 ~]#awk 'NR==2' /etc/issue
Kernel \r on an \m
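Example: print the records with UID >= 1000. Note that the command below omits -F:, so $3 is taken from whitespace-separated fields and compared as a string, which is why the lines shown have UIDs below 1000. The intended command is awk -F: '$3>=1000' /etc/passwd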
[19:36:26 root@centos7 ~]#awk '$3>=1000' /etc/passwd
systemd-network:x:192:192:systemd Network Management:/:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
polkitd:x:999:998:User for polkitd:/:/sbin/nologin
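The effect of the field separator on this comparison can be checked on a few passwd-style lines. The records below are made up for this sketch, and one of them has a comment field that contains spaces.

```shell
# Made-up passwd-style records, including one whose comment field contains spaces.
sample='root:x:0:0:root:/root:/bin/bash
dbus:x:81:81:System message bus:/:/sbin/nologin
wang:x:1000:1000::/home/wang:/bin/bash'

# With -F: the third field is the UID and the comparison is numeric.
with_fs=$(printf '%s\n' "$sample" | awk -F: '$3 >= 1000 { print $1 }')
echo "$with_fs"

# Without -F: the fields are split on whitespace, so $3 of the dbus line is
# text and is compared as a string, which lets that line through.
without_fs=$(printf '%s\n' "$sample" | awk '$3 >= 1000 { print $1 }')
echo "$without_fs"
```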
Example: selecting odd and even lines
[19:38:39 root@centos7 ~]#seq 10 | awk 'NR%2==0'
2
4
6
8
10
[19:39:16 root@centos7 ~]#seq 10 | awk 'NR%2==1'
1
3
5
7
9
[19:39:19 root@centos7 ~]#seq 10 | awk 'NR%2!=0'
1
3
5
7
9
Example: extracting the IP address
[19:45:44 root@centos7 ~]#ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.192.7 netmask 255.255.255.0 broadcast 192.168.192.255
inet6 fe80::20c:29ff:fe0d:c854 prefixlen 64 scopeid 0x20<link>
ether 00:0c:29:0d:c8:54 txqueuelen 1000 (Ethernet)
RX packets 3731 bytes 324962 (317.3 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1901 bytes 235371 (229.8 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
[19:45:29 root@centos7 ~]#ifconfig eth0 | awk 'NR==2{print $2}'
192.168.192.7
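Pattern-matching operators:
~ tests whether the left side matches the right side. Used with a regular expression, it checks whether the text on the left contains a match for the pattern on the right
!~ tests for no match, that is, whether the text on the left contains no match for the pattern on the right
Format:
awk options '$field ~ /regular expression/{ACTION}' file-path
Example: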
[root@demo-c8 opt]# awk -F":" '$0 ~ /root/{print $1}' /etc/passwd
root
operator
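Matching against $0 finds the text anywhere in the record, which is why operator is listed too (its home directory is /root). A sketch of narrowing the match to one field, and of the negated operator, on three invented passwd-style lines:

```shell
sample='root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
bin:x:1:1:bin:/bin:/sbin/nologin'

# "root" anywhere in the record
any_field=$(printf '%s\n' "$sample" | awk -F: '$0 ~ /root/ { print $1 }')
# only the first field, anchored at both ends
name_only=$(printf '%s\n' "$sample" | awk -F: '$1 ~ /^root$/ { print $1 }')
# last field does not end in nologin
no_login=$(printf '%s\n' "$sample" | awk -F: '$NF !~ /nologin$/ { print $1 }')
echo "$any_field"; echo "$name_only"; echo "$no_login"
```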
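Conditional expression (ternary expression):
selector?if-true-expression:if-false-expression
- selector: the condition to test
- if-true-expression: evaluated when the condition is true
- if-false-expression: evaluated when the condition is false
Example: check whether the third column of each line is greater than or equal to 1000. If so, set usertype to "Common User"; otherwise set it to "SysUser". Finally print $1 and the value of usertype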
[root@demo-c8 opt]# awk -F: '{$3>=1000?usertype="Common User":usertype="SysUser";printf "%-20s:%12s\n",$1,usertype}' /etc/passwd
root : SysUser
bin : SysUser
daemon : SysUser
adm : SysUser
lp : SysUser
sync : SysUser
shutdown : SysUser
halt : SysUser
mail : SysUser
operator : SysUser
games : SysUser
ftp : SysUser
nobody : Common User
dbus : SysUser
systemd-coredump : SysUser
systemd-resolve : SysUser
tss : SysUser
polkitd : SysUser
geoclue : SysUser
rtkit : SysUser
pulse : SysUser
libstoragemgmt : SysUser
qemu : SysUser
usbmuxd : SysUser
unbound : SysUser
rpc : SysUser
gluster : SysUser
chrony : SysUser
setroubleshoot : SysUser
pipewire : SysUser
saslauth : SysUser
dnsmasq : SysUser
radvd : SysUser
clevis : SysUser
cockpit-ws : SysUser
cockpit-wsinstance : SysUser
sssd : SysUser
flatpak : SysUser
colord : SysUser
gdm : SysUser
rpcuser : SysUser
gnome-initial-setup : SysUser
sshd : SysUser
avahi : SysUser
rngd : SysUser
tcpdump : SysUser
wang : Common User
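The same classification can also be written with the assignment outside the ternary, which some readers find easier to follow. This is only an alternative sketch on two invented records, not a change to the command above.

```shell
labels=$(printf 'root:x:0\nwang:x:1000\n' | awk -F: '{
    # the ternary yields a value, which is then assigned once
    usertype = ($3 >= 1000) ? "Common User" : "SysUser"
    printf "%-10s:%12s\n", $1, usertype
}')
printf '%s\n' "$labels"
```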
1.6 PATTERN
PATTERN: selects the lines that satisfy the pattern condition, which are then processed
- Not specified: the empty pattern matches every line
Example:
[root@demo-c8 ~]# awk -F":" '{print $1,$3}' /etc/passwd
root 0
bin 1
daemon 2
adm 3
lp 4
sync 5
shutdown 6
...
- /regular expression/: processes only the lines matched by the pattern; the expression must be enclosed in / /
- If the action is omitted, print $0 runs by default
[root@demo-c8 ~]# awk '/^UUID/' /etc/fstab
UUID=b1ab1ace-2582-4afd-8693-39bd9855041c / xfs defaults 0 0
UUID=d5131695-82b3-4a23-bc28-5c8a4bf381a0 /boot ext4 defaults 1 2
UUID=bdd66510-e510-4fe7-ba71-e2a35e6dc492 /data xfs defaults 0 0
UUID=05c944fb-d6f9-4544-ba10-8b7bf3cc8fed swap swap defaults 0 0
Example: extracting partition utilization figures
[root@demo-c8 ~]# df
Filesystem 1K-blocks Used Available Use% Mounted on
devtmpfs 3957244 0 3957244 0% /dev
tmpfs 3985412 0 3985412 0% /dev/shm
tmpfs 3985412 9832 3975580 1% /run
tmpfs 3985412 0 3985412 0% /sys/fs/cgroup
/dev/sda2 41922560 4561428 37361132 11% /
/dev/sda5 41922560 325332 41597228 1% /data
/dev/sda1 999320 192552 737956 21% /boot
tmpfs 797080 1168 795912 1% /run/user/42
tmpfs 797080 4 797076 1% /run/user/0
[root@demo-c8 ~]# df | awk -F"[[:space:]]+|%" '/^\/dev\/sd/{print $5}'
11
1
21
[root@demo-c8 ~]# df | awk -F"[[:space:]]+|%" '/^\/dev\/sd/{print $5}' | sort -n | tail -n1
21
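The highest utilization can also be found inside awk itself, without sort and tail. The sketch below runs on a shortened copy of the df output shown earlier so that the result does not depend on the machine; use and max are variable names chosen for this example.

```shell
# A shortened copy of the df output shown above, used as fixed sample input.
sample='Filesystem 1K-blocks Used Available Use% Mounted on
tmpfs 3985412 9832 3975580 1% /run
/dev/sda2 41922560 4561428 37361132 11% /
/dev/sda5 41922560 325332 41597228 1% /data
/dev/sda1 999320 192552 737956 21% /boot'

max_use=$(printf '%s\n' "$sample" | awk '
/^\/dev\/sd/ {
    use = $5 + 0                 # "21%" converts to the number 21
    if (use > max) max = use     # max starts out as 0
}
END { print max }')
echo "$max_use"
```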
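- relational expression: a line is processed only when the result is true
True: a non-zero value or a non-empty string
False: an empty string or the value 0
Example: !0 is true and !1 is false. Numbers are written without double quotes ""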
[root@demo-c8 ~]# awk '!0' /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
nobody:x:65534:65534:Kernel Overflow User:/:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
systemd-coredump:x:999:997:systemd Core Dumper:/:/sbin/nologin
systemd-resolve:x:193:193:systemd Resolver:/:/sbin/nologin
tss:x:59:59:Account used by the trousers package to sandbox the tcsd daemon:/dev/null:/sbin/nologin
polkitd:x:998:996:User for polkitd:/:/sbin/nologin
geoclue:x:997:995:User for geoclue:/var/lib/geoclue:/sbin/nologin
rtkit:x:172:172:RealtimeKit:/proc:/sbin/nologin
pulse:x:171:171:PulseAudio System Daemon:/var/run/pulse:/sbin/nologin
libstoragemgmt:x:996:992:daemon account for libstoragemgmt:/var/run/lsm:/sbin/nologin
qemu:x:107:107:qemu user:/:/sbin/nologin
usbmuxd:x:113:113:usbmuxd user:/:/sbin/nologin
unbound:x:995:990:Unbound DNS resolver:/etc/unbound:/sbin/nologin
rpc:x:32:32:Rpcbind Daemon:/var/lib/rpcbind:/sbin/nologin
gluster:x:994:989:GlusterFS daemons:/run/gluster:/sbin/nologin
chrony:x:993:988::/var/lib/chrony:/sbin/nologin
setroubleshoot:x:992:986::/var/lib/setroubleshoot:/sbin/nologin
pipewire:x:991:985:PipeWire System Daemon:/var/run/pipewire:/sbin/nologin
saslauth:x:990:76:Saslauthd user:/run/saslauthd:/sbin/nologin
dnsmasq:x:984:984:Dnsmasq DHCP and DNS server:/var/lib/dnsmasq:/sbin/nologin
radvd:x:75:75:radvd user:/:/sbin/nologin
clevis:x:983:982:Clevis Decryption Framework unprivileged user:/var/cache/clevis:/sbin/nologin
cockpit-ws:x:982:980:User for cockpit web service:/nonexisting:/sbin/nologin
cockpit-wsinstance:x:981:979:User for cockpit-ws instances:/nonexisting:/sbin/nologin
sssd:x:980:978:User for sssd:/:/sbin/nologin
flatpak:x:979:977:User for flatpak system helper:/:/sbin/nologin
colord:x:978:976:User for colord:/var/lib/colord:/sbin/nologin
gdm:x:42:42::/var/lib/gdm:/sbin/nologin
rpcuser:x:29:29:RPC Service User:/var/lib/nfs:/sbin/nologin
gnome-initial-setup:x:977:975::/run/gnome-initial-setup/:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
avahi:x:70:70:Avahi mDNS/DNS-SD Stack:/var/run/avahi-daemon:/sbin/nologin
rngd:x:976:974:Random Number Generator Daemon:/var/lib/rngd:/sbin/nologin
tcpdump:x:72:72::/:/sbin/nologin
wang:x:1000:1000:wang:/home/wang:/bin/bash
[root@demo-c8 ~]# awk '!1' /etc/passwd
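Example: the !n++ expression
- n++ is a post-increment: the expression uses the old value of n, and n is incremented afterwards
- First line: n=0, so !n is 1 (true) and the line is printed; n then becomes 1
- Second line: n=1, so !n is 0 (false) and the line is skipped; n then becomes 2
- From then on n is always non-zero, so !n stays false and no further line is printed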
[root@demo-c8 ~]# awk -v n=0 '!n++' /etc/passwd
root:x:0:0:root:/root:/bin/bash
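Example: the !++n expression
- ++n is a pre-increment: n is incremented first, so on the first line n=1 and !n is 0 (false). n only grows afterwards, so no line is ever printed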
[root@demo-c8 opt]# awk -v n=0 '!++n' /etc/passwd
[root@demo-c8 opt]#
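Summary: when the pattern is a relational expression, awk evaluates it once for every input line, before that line is processed. If the result is true the line is processed, otherwise it is skipped; awk then moves to the next line and evaluates the expression again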
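A common application of the same post-increment logic is removing duplicate lines with one counter per distinct line. A minimal sketch; the helper name dedup_lines and the array name seen are illustrative choices:

```shell
# dedup_lines: print each distinct input line once, keeping first-seen order.
# seen[$0]++ yields the old count (0 on first sight), so !seen[$0]++ is true
# only the first time a given line appears.
dedup_lines() {
    awk '!seen[$0]++'
}

printf '%s\n' a b a c b | dedup_lines
```

Unlike sort -u, this keeps the original order of the lines, at the cost of holding every distinct line in memory.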
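Example: i=!i prints the odd-numbered lines
- i is initially unset, which counts as false, so !i is 1 (true). The assignment stores 1 in i, and because the value of the assignment is true the first line is processed
- On the second line i is true, so !i is false; i becomes 0 and the line is skipped
- On the third line i is false again, so !i is true and the line is processed. The value keeps alternating, so only the odd-numbered lines are printed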
[root@demo-c8 opt]# seq 10 | awk 'i=!i'
1
3
5
7
9
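Example: print the even-numbered lines
Initializing i to 1 (true) makes !i false on the first line, the reverse of the previous example, so only the even-numbered lines are processed. Negating the whole expression, as in !(i=!i), has the same effect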
[root@demo-c8 opt]# seq 10 | awk -v i=1 'i=!i'
2
4
6
8
10
[root@demo-c8 opt]# seq 10 | awk '!(i=!i)'
2
4
6
8
10
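The toggle trick is compact but easy to misread. An alternative sketch that selects the same lines by testing the record number NR directly; the function names odd_lines and even_lines are hypothetical:

```shell
# odd_lines / even_lines: select lines by the parity of the record number NR.
odd_lines()  { awk 'NR % 2 == 1'; }
even_lines() { awk 'NR % 2 == 0'; }

seq 6 | odd_lines     # prints 1, 3, 5
seq 6 | even_lines    # prints 2, 4, 6
```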
- line ranges
A bare line number cannot be used as a pattern, but the built-in variable NR can be used to select lines by number
Example:
[root@demo-c8 opt]# seq 10 | awk 'NR>=3 && NR<=6'
3
4
5
6
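/pat1/,/pat2/ selects from a line matching pat1 through the next line matching pat2. Bare line numbers are not accepted here, but regular expressions are
Example: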
[root@demo-c8 opt]# awk '/^root/,/^nobody/' /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
nobody:x:65534:65534:Kernel Overflow User:/:/sbin/nologin
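Although bare numbers are not accepted, both ends of a range may be any pattern expression, so comparisons on NR also work. A minimal sketch; the helper name print_range and its two arguments are assumptions made for illustration:

```shell
# print_range FIRST LAST: print input lines FIRST through LAST (inclusive)
# using a range pattern whose two ends are comparisons on NR.
print_range() {
    awk -v first="$1" -v last="$2" 'NR == first, NR == last'
}

seq 10 | print_range 3 6    # prints 3, 4, 5, 6
```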
2. AWK Conditionals and Loop Control
2.1 Conditionals: if-else
Syntax:
if(condition)statement
if(condition){statement;…}[else{statement;…}]
if(condition1){statement1}else if(condition2){statement2}else if(condition3){statement3}......else{statementN}
Use case: testing a condition on the whole line or on a single field
Example: '{if(condition)statement}'
[root@demo-c8 opt]# awk -F":" '{if($NF=="/bin/bash")print $1}' /etc/passwd
root
wang
[root@demo-c8 opt]# awk -F":" '{if($NF>5)print $0}' /etc/fstab
UUID=b1ab1ace-2582-4afd-8693-39bd9855041c / xfs defaults 0 0
UUID=d5131695-82b3-4a23-bc28-5c8a4bf381a0 /boot ext4 defaults 1 2
UUID=bdd66510-e510-4fe7-ba71-e2a35e6dc492 /data xfs defaults 0 0
UUID=05c944fb-d6f9-4544-ba10-8b7bf3cc8fed swap swap defaults 0 0
Note: none of the UUID lines contains a colon, so with -F":" the last field $NF is the entire line. Because that text does not look like a number, awk compares it with 5 as a string, which is why exactly these lines match. This is not a numeric test on the last column
[root@demo-c8 opt]# awk -F":" '{if($3>=1000)print $1,$3}' /etc/passwd
nobody 65534
wang 1000
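Example: '{if(condition1){statement1}else{statement2}}' (the "common user" text printed by the else branch is only a placeholder label; accounts with a UID below 1000 are normally system accounts)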
[root@demo-c8 opt]# awk -F":" '{if($3>=1000){printf "%-20s %s\n", $1,$3}else{printf "%-20s %s\n", $1, "common user"}}' /etc/passwd
root common user
bin common user
daemon common user
adm common user
lp common user
sync common user
shutdown common user
halt common user
mail common user
operator common user
games common user
ftp common user
nobody 65534
dbus common user
systemd-coredump common user
systemd-resolve common user
tss common user
polkitd common user
geoclue common user
rtkit common user
pulse common user
libstoragemgmt common user
qemu common user
usbmuxd common user
unbound common user
rpc common user
gluster common user
chrony common user
setroubleshoot common user
pipewire common user
saslauth common user
dnsmasq common user
radvd common user
clevis common user
cockpit-ws common user
cockpit-wsinstance common user
sssd common user
flatpak common user
colord common user
gdm common user
rpcuser common user
gnome-initial-setup common user
sshd common user
avahi common user
rngd common user
tcpdump common user
wang 1000
Example: use awk to list the partitions whose usage is above 10%
[root@demo-c8 opt]# df
Filesystem 1K-blocks Used Available Use% Mounted on
devtmpfs 3957244 0 3957244 0% /dev
tmpfs 3985412 0 3985412 0% /dev/shm
tmpfs 3985412 9832 3975580 1% /run
tmpfs 3985412 0 3985412 0% /sys/fs/cgroup
/dev/sda2 41922560 4561796 37360764 11% /
/dev/sda5 41922560 325332 41597228 1% /data
/dev/sda1 999320 192552 737956 21% /boot
tmpfs 797080 1168 795912 1% /run/user/42
tmpfs 797080 4 797076 1% /run/user/0
[root@demo-c8 opt]# df | awk -F"[[:space:]]+|%" '/^\/dev\/sd/{if($5>10)print $1, $5}'
/dev/sda2 11
/dev/sda1 21
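The separator passed to -F decides how the "% " sequence in each df line is split, and therefore which field number the mount point ends up in. A small sketch comparing two separator forms on one sample line taken from the output above; the helper name count_fields is hypothetical:

```shell
# count_fields FS: print NF and the sixth field (in brackets) for one
# df-style sample line, using FS as the field separator.
count_fields() {
    printf '%s\n' '/dev/sda2 41922560 4561796 37360764 11% /' |
        awk -F"$1" '{ print NF, "[" $6 "]" }'
}

count_fields '[ %]+'    # prints: 6 [/]   ("% " is consumed as one separator)
count_fields ' +|%'     # prints: 7 []    ("%" and the following space are two separators)
```

In both cases $5 holds the number 11, so the commands in this section give the same result with either form; only the fields after it shift.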
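[ %]+ : one or more consecutive characters, each of them either a space or a percent sign
[[:space:]]+|% : one or more whitespace characters, or a single percent sign
" +|%" : one or more spaces, or a single percent sign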
2.2 switch statement
Similar to the case statement in Shell. switch is a gawk extension and is not part of POSIX awk
Syntax:
switch(expression) {case VALUE1 or /REGEXP/: statement1; case VALUE2 or /REGEXP2/: statement2; ...; default: statementn}
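The section gives only the syntax, so here is a small sketch of switch with a constant case, a regular-expression case and a default. It assumes gawk is installed under the name gawk; the function name classify_uid and the three labels are made up for illustration:

```shell
# classify_uid: label each UID read from stdin.
# Requires gawk, because switch is a gawk extension.
classify_uid() {
    gawk '{
        switch ($1) {
        case 0:
            print $1, "root"
            break                      # without break, execution falls through to the next case
        case /^[0-9][0-9]?[0-9]?$/:    # regex case: one to three digits
            print $1, "system"
            break
        default:
            print $1, "regular"
        }
    }'
}
```

For example, echo 1000 | classify_uid prints "1000 regular", while echo 42 | classify_uid prints "42 system".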
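2.3 while loop
The loop body runs while the condition is true, and the loop exits once the condition becomes false
Use cases:
Applying the same processing to each field of a line in turn
Processing each element of an array in turn
Syntax:
while (condition) {statement;…}
Example: the built-in length() function returns the number of characters in a string (gawk counts characters rather than bytes in a multibyte locale such as UTF-8)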
[root@demo-c8 opt]# awk 'BEGIN{print length("您好世界")}'
4
[root@demo-c8 opt]# awk 'BEGIN{print length("hello world")}'
11
Example: process a line field by field
[20:08:29 root@centos-7-6 ~]#awk '/^[[:space:]]*linux16/{i=1;while(i<=NF){print $i,length($i); i++}}' /etc/grub2.cfg
linux16 7
/vmlinuz-3.10.0-1127.el7.x86_64 31
root=UUID=ffd4773d-a205-4937-bdff-4f80e73b6ad0 46
ro 2
crashkernel=auto 16
rhgb 4
quiet 5
net.ifnames=0 13
linux16 7
/vmlinuz-0-rescue-e5ad2c92a25b4c87b7cca719c77091ff 50
root=UUID=ffd4773d-a205-4937-bdff-4f80e73b6ad0 46
ro 2
crashkernel=auto 16
rhgb 4
quiet 5
net.ifnames=0 13
[20:08:37 root@centos-7-6 ~]#awk '/^[[:space:]]*linux16/{i=1;while(i<=NF) {if(length($i)>=10){print $i,length($i)}; i++}}' /etc/grub2.cfg
/vmlinuz-3.10.0-1127.el7.x86_64 31
root=UUID=ffd4773d-a205-4937-bdff-4f80e73b6ad0 46
crashkernel=auto 16
net.ifnames=0 13
/vmlinuz-0-rescue-e5ad2c92a25b4c87b7cca719c77091ff 50
root=UUID=ffd4773d-a205-4937-bdff-4f80e73b6ad0 46
crashkernel=auto 16
net.ifnames=0 13
Example: print 1, 2, ..., 100
[root@demo-c8 opt]# awk -v i=1 'BEGIN{while(i<=100){print i;i++}}'
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
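Note:
1. awk variables are referenced by name without a `$` prefix; `$` is used only for field references such as $1 or $NF
Example: print the sum of 1 to 100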
[root@demo-c8 opt]# awk -v i=1 -v sum=0 'BEGIN{while (i<=100){sum+=i; i++};print sum}'
5050
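The same while pattern covers the first use case named above, visiting every field of a line. A minimal sketch that adds up the fields of each input line; the helper name sum_fields is hypothetical:

```shell
# sum_fields: print the sum of all whitespace-separated fields of each line.
sum_fields() {
    awk '{
        i = 1; sum = 0
        while (i <= NF) {      # visit every field of the current line
            sum += $i
            i++
        }
        print sum
    }'
}

printf '%s\n' '1 2 3' '10 20' | sum_fields    # prints 6, then 30
```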
2.4 do-while loop
Syntax:
do {statement;…}while(condition)
Meaning: the loop body runs at least once, whether the condition is true or false
Example:
[root@demo-c8 opt]# awk 'BEGIN{ total=0;i=1;do{ total+=i;i++;}while(i<=100);print total}'
5050
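The difference from while only shows when the condition is false from the start. A small sketch contrasting the two; the function name loop_demo and the messages are illustrative:

```shell
# loop_demo: with a condition that is false from the start, a while body is
# skipped, whereas a do-while body still runs once.
loop_demo() {
    awk 'BEGIN {
        i = 10
        while (i < 5) { print "while body ran"; i++ }
        do { print "do-while body ran"; i++ } while (i < 5)
        print "i =", i
    }'
}

loop_demo
```

This prints "do-while body ran" followed by "i = 11"; the while body never executes.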
2.5 for loop
Syntax:
for(expr1;expr2;expr3) {statement;…}
Common form:
for(variable assignment;condition;iteration process) {for-body}
Special form: for(key in array) iterates over the keys of an array
Example: a for loop printing 1 to 100
[root@demo-c8 opt]# awk 'BEGIN{for(i=1;i<=100;i++){print i}}'
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
Example: a for loop computing the sum of 1 to 100
[root@demo-c8 opt]# awk 'BEGIN{sum=0;for(i=1;i<=100;i++){sum+=i};print sum}'
5050
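Because all three expressions of the for header are free-form, the loop can just as well count downwards over the fields of a line. A minimal sketch; the helper name reverse_fields is hypothetical:

```shell
# reverse_fields: print the fields of each line in reverse order.
reverse_fields() {
    awk '{
        out = ""
        for (i = NF; i >= 1; i--) {      # walk the fields from last to first
            sep = (i > 1) ? " " : ""     # no trailing space after the final field
            out = out $i sep
        }
        print out
    }'
}

echo 'a b c' | reverse_fields    # prints: c b a
```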
Example: performance comparison of awk, Shell and bc, computing the sum 1+...+1000000
[root@demo-c8 opt]# time (awk 'BEGIN{sum=0;for(i=1;i<=1000000;i++){sum+=i};print sum}')
500000500000
real 0m0.079s
user 0m0.076s
sys 0m0.003s
[root@demo-c8 opt]# time (sum=0; for((i=1;i<=1000000;i++));do let sum+=i;done;echo $sum)
500000500000
real 0m5.446s
user 0m5.408s
sys 0m0.000s
[root@demo-c8 opt]# time (seq -s+ 1000000 |bc)
500000500000
real 0m0.432s
user 0m0.343s
sys 0m0.150s
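2.6 continue and break
continue skips the rest of the current iteration
break exits the enclosing loop
Format (unlike the Shell builtins of the same name, neither statement takes a level argument in awk):
continue
break
Example: continue, printing the sum of the odd numbers from 1 to 100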
[root@demo-c8 opt]# awk -v sum=0 'BEGIN{for(i=1;i<=100;i++){if(i%2==0)continue; sum+=i};print sum}'
2500
Example: break, printing 1 to 49
[root@demo-c8 opt]# awk 'BEGIN{for(i=1;i<=100;i++){if (i>=50){break}else{print i}}}'
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
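Both statements are often combined when scanning the fields of a line for the first one that meets a condition. A minimal sketch, with a hypothetical helper name and a sample line built from the kernel parameters shown earlier:

```shell
# first_long_field: for each line, print the first field that has at least
# 10 characters, then stop scanning that line.
first_long_field() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if (length($i) < 10)
                continue            # too short: go on to the next field
            print $i
            break                   # found one: leave the for loop only
        }
    }'
}

echo 'ro quiet crashkernel=auto net.ifnames=0' | first_long_field    # prints: crashkernel=auto
```

break ends only the for loop; awk still goes on to read the next input line.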
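2.7 next
next ends the processing of the current line early and moves straight on to the next input line (awk's implicit per-line loop)
Example: print the user name and UID of accounts whose UID is odd
[root@demo-c8 opt]# awk -F":" '{if($3%2==0)next; printf "%-20s %s\n", $1,$3}' /etc/passwd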
bin 1
adm 3
sync 5
halt 7
operator 11
dbus 81
systemd-coredump 999
systemd-resolve 193
tss 59
geoclue 997
pulse 171
qemu 107
usbmuxd 113
unbound 995
chrony 993
pipewire 991
radvd 75
clevis 983
cockpit-wsinstance 981
flatpak 979
rpcuser 29
gnome-initial-setup 977
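next is most useful as an early filter placed in front of other rules, so that they never see unwanted lines. A minimal sketch on fstab-style input; the helper name fs_types is hypothetical and the UUID values in the sample are placeholders, not real identifiers:

```shell
# fs_types: print mount point and filesystem type, skipping comment lines
# and blank lines.
fs_types() {
    awk '
        /^#/ || NF == 0 { next }   # skip the line; the rule below never sees it
        { print $2, $3 }
    '
}

printf '%s\n' \
    '# /etc/fstab' \
    '' \
    'UUID=example-1 / xfs defaults 0 0' \
    'UUID=example-2 /boot ext4 defaults 1 2' | fs_types
```

This prints "/ xfs" and "/boot ext4".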
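3. Arrays
All awk arrays are associative arrays
Format:
array_name[index-expression]
Example (for comparison, an associative array in Bash):
[root@demo-c8 opt]# declare -A weekdays
[root@demo-c8 opt]# weekdays["mon"]="monday"
[root@demo-c8 opt]# echo ${weekdays["mon"]}
monday
index-expression
- Arrays provide key/value (k/v) storage
- Any string can be used as an index; string literals must be enclosed in double quotes
- If an element (key) does not yet exist, referencing it makes awk create it automatically with the empty string ("") as its value
- To test whether an element exists, use the form index in array, which returns 1 if the key exists and 0 otherwise
Note: awk associative arrays have no [*] syntax for retrieving all values at once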
[root@demo-c8 opt]# awk 'BEGIN{weekdays["mon"]="monday";weekdays["tue"]="tuesday";weekdays["wed"]="wednesday";print weekdays["mon"]}'
monday
Example: test whether an array contains a given key
[root@demo-c8 opt]# awk 'BEGIN{weekdays["mon"]="monday";weekdays["tue"]="tuesday";weekdays["wed"]="wednesday";print "thur" in weekdays; print "mon" in weekdays}'
0
1
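The difference between testing a key with in and merely referencing it can be made visible in a few lines. A small sketch; the function name key_demo and the array name days are illustrative:

```shell
# key_demo: show that referencing a missing key creates it, while the
# "in" test does not.
key_demo() {
    awk 'BEGIN {
        if ("mon" in days) { print "before: present" } else { print "before: absent" }
        x = days["mon"]          # a plain reference silently creates days["mon"]
        if ("mon" in days) { print "after reference: present" } else { print "after reference: absent" }
        delete days["mon"]       # delete removes the element again
        if ("mon" in days) { print "after delete: present" } else { print "after delete: absent" }
    }'
}

key_demo
```

The three lines printed are "before: absent", "after reference: present" and "after delete: absent", which is why existence checks should use in rather than comparing the value with "".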
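To visit every value in an array, use a for loop: each iteration of for(key in array) yields one key, and the value is array[key]. The iteration order is unspecified
Example: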
[root@demo-c8 opt]# awk 'BEGIN{user["name"]="root"; user["uid"]="0"; user["password"]="Y"; for(i in user){print user[i]}}'
Y
root
0
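Since the traversal order is unspecified, output that must be stable is usually passed through sort. A minimal sketch reusing the user array from the example above; the function name user_fields_sorted is hypothetical:

```shell
# user_fields_sorted: print "key value" pairs of an awk array in a stable
# order by sorting the output outside awk.
user_fields_sorted() {
    awk 'BEGIN {
        user["name"] = "root"; user["uid"] = "0"; user["password"] = "Y"
        for (k in user) print k, user[k]
    }' | LC_ALL=C sort
}

user_fields_sorted
```

The output is "name root", "password Y", "uid 0", regardless of the order in which awk walks the array.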
Example: count the occurrences of each connection state on the host
- Counting with ss, sort and uniq -c
[root@demo-c8 opt]# ss -nta | awk 'NR!=1{print $1}' | sort | uniq -c
1 ESTAB
7 LISTEN
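- state[$1]++: creates an awk array named state that uses each distinct value of $1 (the connection state) as a key; ++ increments the counter stored under that key, so the array ends up holding the number of occurrences of each state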
[root@demo-c8 opt]# ss -ant | awk 'NR!=1{state[$1]++}END{for(i in state){print i,state[i]}}'
LISTEN 7
ESTAB 1
Example: count the number of requests per client IP in a web access log
[root@demo-c8 opt]# cat access_log | head -n1
172.18.118.91 - - [20/May/2018:08:09:59 +0800] "GET / HTTP/1.1" 200 912 "-" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 5.1; Trident/5.0)"
[root@demo-c8 opt]# awk '{ip[$1]++}END{for(i in ip){printf "%-20s %s\n", ip[i], i}}' access_log | sort -nr | head -n5
4870 172.20.116.228
3429 172.20.116.208
2834 172.20.0.222
2613 172.20.112.14
2267 172.20.0.227
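The same counting pattern can be tried without a real log file. A self-contained sketch, assuming the client IP is the first field; the helper name top_ips, its optional argument and the sample addresses are made up for illustration:

```shell
# top_ips [N]: count requests per client IP (field 1) and print the N busiest
# clients (default 5) as "count ip", highest count first.
top_ips() {
    awk '{ hits[$1]++ } END { for (ip in hits) print hits[ip], ip }' |
        sort -k1,1nr -k2,2 |
        head -n "${1:-5}"
}

printf '%s\n' \
    '10.0.0.1 - - "GET / HTTP/1.1" 200' \
    '10.0.0.2 - - "GET /a HTTP/1.1" 200' \
    '10.0.0.1 - - "GET /b HTTP/1.1" 404' | top_ips 2
```

This prints "2 10.0.0.1" and then "1 10.0.0.2". The second sort key only makes the order of ties predictable.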
Example: use /etc/hosts.deny to reject SSH access from another server
- /etc/hosts.deny relies on TCP Wrappers, which works only on CentOS 7 and older releases; it is no longer supported from CentOS 8 on
[20:10:41 root@centos-7-6 ~]#echo "sshd: 10.0.0.108" >> /etc/hosts.deny
[root@demo-c8 opt]# ssh root@10.0.0.187
kex_exchange_identification: read: Connection reset by peer
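4. awk functions
awk functions are either built-in or user-defined
4.1 Common built-in functions
Numeric:
- rand(): returns a random number between 0 and 1
- srand(): seeds the random number generator used by rand()
- int(): returns the integer part
Example: rand() should be used together with srand(); otherwise rand() returns the same fixed value on every run. Because srand() without an argument seeds from the current time in seconds, runs started within the same second also return the same value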
[22:42:27 root@centos7 ~]#awk 'BEGIN{srand();print rand()}'
0.482749
[22:42:40 root@centos7 ~]#awk 'BEGIN{srand();print rand()}'
0.12631
[22:42:42 root@centos7 ~]#awk 'BEGIN{srand();print rand()}'
0.12631
[22:42:42 root@centos7 ~]#awk 'BEGIN{srand();print rand()}'
0.12631
[22:42:42 root@centos7 ~]#awk 'BEGIN{srand();print rand()}'
0.0949145
[22:42:44 root@centos7 ~]#awk 'BEGIN{srand();print rand()}'
0.0949145
[22:42:44 root@centos7 ~]#awk 'BEGIN{srand();print rand()}'
0.25884
Example: print ten random integers below 100
[22:45:52 root@centos7 ~]#awk 'BEGIN{srand();for(i=1;i<=10;i++)print int(rand()*100)}'
13
5
63
15
36
82
91
33
86
31
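srand() also accepts an explicit seed, which makes a run repeatable, for example in tests. A minimal sketch; the helper name random_ints and its two arguments are assumptions, and the actual numbers depend on the awk implementation, so none are shown:

```shell
# random_ints SEED COUNT: print COUNT integers in the range 0..99 from a
# generator seeded with SEED.
random_ints() {
    awk -v seed="$1" -v count="$2" 'BEGIN {
        srand(seed)                       # same seed, same sequence
        for (i = 1; i <= count; i++) print int(rand() * 100)
    }'
}

random_ints 42 3
random_ints 42 3    # repeats the three numbers printed by the previous call
```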
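String handling:
- length([s]): returns the length of the string s, or of $0 if s is omitted
- sub(r,s,[t]): searches the string t for the pattern r and replaces the first match with s; t defaults to $0
Example: replacing with sub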
[22:46:11 root@centos7 ~]#echo "2008:08:08 08:08:08" | awk 'sub(/:/,"-",$1)'
2008-08:08 08:08:08
[22:47:26 root@centos7 ~]#echo "2008:08:08 08:08:08" | awk '{sub(/:/,"-",$1);print $0}'
2008-08:08 08:08:08
- gsub(r,s,[t]): searches the string t for the pattern r and replaces every match with s; t defaults to $0
Example:
[22:48:10 root@centos7 ~]#echo "2008:08:08 08:08:08" | awk 'gsub(/:/,"-",$0)'
2008-08-08 08-08-08
[22:49:00 root@centos7 ~]#echo "2008:08:08 08:08:08" | awk '{gsub(/:/,"-",$0);print $0}'
2008-08-08 08-08-08
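Both sub() and gsub() return the number of replacements they made, which is why they can stand alone as a pattern in the one-liners above: a non-zero count is true, so the modified line is printed. A small sketch that prints the count explicitly; the helper name count_colons is hypothetical:

```shell
# count_colons: replace every ":" with "-" and print the number of
# replacements followed by the modified line.
count_colons() {
    awk '{ n = gsub(/:/, "-"); print n, $0 }'
}

echo '2008:08:08 08:08:08' | count_colons    # prints: 4 2008-08-08 08-08-08
```

A line without any match gives a count of 0, so used as a bare pattern it would not be printed at all.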
- split(s,array,[r]): splits the string s on the separator r and stores the pieces in array; the first piece has index 1, the second index 2, …
Example:
[22:49:47 root@centos7 ~]#netstat -nt | awk '/^tcp/{split($5,ip,":");count[ip[1]]++}END{for(i in count){print i,count[i]}}'
192.168.192.1 1
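split() also returns the number of pieces it produced, which helps when the input may not contain the separator at all. A minimal sketch on an address:port string; the helper name host_of is hypothetical:

```shell
# host_of: split the first field on ":" and print the first piece together
# with the number of pieces found.
host_of() {
    awk '{ n = split($1, part, ":"); print part[1], n }'
}

echo '192.168.192.1:22' | host_of    # prints: 192.168.192.1 2
```

This simple split suits IPv4 address:port pairs only; IPv6 addresses contain colons themselves and would need different handling.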
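- system(): runs a shell command from within awk
awk concatenates strings by writing them next to each other (a space between them is optional). To use an awk variable in the command, concatenate it with the quoted parts; in other words, everything except the awk variables is enclosed in double quotes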
[22:50:43 root@centos7 ~]#awk 'BEGIN{system("hostname")}'
centos7.mac
[22:50:48 root@centos7 ~]#awk 'BEGIN{score=100; system("echo your score is " score) }'
your score is 100
Example: find the IPs with 10 or more connections and add them to iptables
[root@centos8 ~]#netstat -tn | awk '/^tcp/{split($5,ip,":");count[ip[1]]++}END{for(i in count){if(count[i]>=10){system("iptables -A INPUT -s "i" -j REJECT")}}}'
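Before letting awk change firewall rules through system(), it can help to print the commands first and review them. A dry-run sketch of the same logic; the helper name block_cmds, its threshold argument and the sample connection lines are made up for illustration:

```shell
# block_cmds LIMIT: print (do not run) one iptables command for every remote
# IP that has at least LIMIT connections in netstat-style input.
block_cmds() {
    awk -v limit="$1" '
        /^tcp/ { split($5, ip, ":"); count[ip[1]]++ }
        END {
            for (i in count)
                if (count[i] >= limit)
                    print "iptables -A INPUT -s " i " -j REJECT"
        }
    ' | sort
}

printf '%s\n' \
    'tcp 0 0 10.0.0.8:22 10.0.0.1:51234 ESTABLISHED' \
    'tcp 0 0 10.0.0.8:22 10.0.0.1:51235 ESTABLISHED' \
    'tcp 0 0 10.0.0.8:80 10.0.0.2:40000 ESTABLISHED' | block_cmds 2
```

With the sample input this prints a single line, "iptables -A INPUT -s 10.0.0.1 -j REJECT". Once the list looks right, the output could be piped to sh, or print could be replaced by system().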
4.2 User-defined functions
Format:
function name ( parameter, parameter, ... ) {
statements
return expression
}
Example: return the larger of two numbers
[22:53:20 root@centos7 ~]#vim func.awk
function max(x,y) {
return x>y ? x : y
}
BEGIN{print max(a,b)}
[22:53:52 root@centos7 ~]#awk -v a=30 -v b=20 -f func.awk
30
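Variables assigned inside an awk function are global unless they appear in the parameter list, so the usual convention is to declare locals as extra parameters that the caller never passes. A minimal sketch; the names local_demo and sum_to are hypothetical:

```shell
# local_demo: show that extra, unpassed parameters act as local variables.
local_demo() {
    awk '
    function sum_to(n,    i, total) {   # i and total are locals: extra parameters the caller omits
        for (i = 1; i <= n; i++) total += i
        return total
    }
    BEGIN {
        i = 99
        print sum_to(100)
        print i                         # still 99: the function used its own i
    }'
}

local_demo
```

This prints 5050 and then 99. Without i in the parameter list, the loop would have overwritten the global i.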
5. awk scripts
- Write the awk program into a script file, then run it with awk -f or execute it directly
Example:
[22:56:52 root@centos7 ~]#vim passwd.awk
{if($3<=1000)print $1,$3}
[22:57:20 root@centos7 ~]#awk -F: -f passwd.awk /etc/passwd
root 0
bin 1
daemon 2
adm 3
lp 4
sync 5
shutdown 6
halt 7
mail 8
operator 11
games 12
ftp 14
nobody 99
systemd-network 192
dbus 81
polkitd 999
sshd 74
postfix 89
tcpdump 72
Example: an executable script (it must first be made executable with chmod +x test.awk)
[22:57:21 root@centos7 ~]#vim test.awk
#!/bin/awk -f
# this is an awk script
{if($3<=1000)print $1,$3}
[22:58:45 root@centos7 ~]#./test.awk -F: /etc/passwd
root 0
bin 1
daemon 2
adm 3
lp 4
sync 5
shutdown 6
halt 7
mail 8
operator 11
games 12
ftp 14
nobody 99
systemd-network 192
dbus 81
polkitd 999
sshd 74
postfix 89
tcpdump 72
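- Passing arguments to an awk script
Format:
awkfile var=value var2=value2... Inputfile
Note: variables assigned this way are not available in the BEGIN block. Each assignment takes effect only when awk reaches it while working through the command-line arguments, just before the input file that follows it is read. Use -v to make a value available before BEGIN runs; every variable passed that way needs its own -v option
Example: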
[22:58:46 root@centos7 ~]#vim test2.awk
#!/bin/awk -f
{if($3 >=min && $3<=max)print $1,$3}
[23:00:17 root@centos7 ~]#chmod +x test2.awk
[23:00:18 root@centos7 ~]#./test2.awk -F: min=100 max=200 /etc/passwd
systemd-network 192
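The timing difference between -v and a plain var=value argument can be checked directly. A small sketch; the function name assign_timing and the variable names early and late are illustrative:

```shell
# assign_timing: compare a -v assignment with a command-line var=value
# assignment, both in BEGIN and in the main rule.
assign_timing() {
    echo 'one line' | awk -v early=1 '
        BEGIN { print "BEGIN: early=" early " late=" late }
        { print "main: early=" early " late=" late }
    ' late=2 -
}

assign_timing
```

The BEGIN line shows an empty value for late ("BEGIN: early=1 late="), while the main rule sees both ("main: early=1 late=2"). The trailing - tells awk to read standard input after the assignment.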
Exercises:
1. The file host_list.log has the format below. Extract the host name that precedes ".magedu.com" and write the result back to the same file
1 www.magedu.com
2 blog.magedu.com
3 study.magedu.com
4 linux.magedu.com
5 python.magedu.com
......
999 study.magedu.com
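2. Count how many times each filesystem type appears in /etc/fstab
3. Count how many times each word appears in /etc/fstab
4. Extract all the digits from the string Yd$C@M05MB%9&Bdh7dq+YVixp3vpw
5. A file holds 5000 random integers between 1 and 100000, stored in the format 100,50,35,89… Find the largest and the smallest integer
6. Production case, mitigating a DoS attack: based on the web log or the number of network connections, check every 5 minutes; when an IP reaches 100 concurrent connections, or 100 page views (PV) within a short period, call the firewall to block that IP. The firewall command is: iptables -A INPUT -s IP -j REJECT
7. Extract the FQDN from each line of the file below and list the FQDNs by count, highest first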
http://mail.magedu.com/index.html
http://www.magedu.com/test.html
http://study.magedu.com/index.html
http://blog.magedu.com/index.html
http://www.magedu.com/images/logo.jpg
http://blog.magedu.com/20080102.html
http://www.magedu.com/images/magedu.jpg
Reference answer:
[root@centos8 ~]#awk -F"/" '{url[$3]++}END{for(i in url){print url[i],i}}' url.log |sort -nr
3 www.magedu.com
2 blog.magedu.com
1 study.magedu.com
1 mail.magedu.com
8. Using inode as the key in the text below, sum the counts of the rows that share an inode, and find the smallest beginnumber and the largest endnumber for each inode
inode|beginnumber|endnumber|counts|
106|3363120000|3363129999|10000|
106|3368560000|3368579999|20000|
310|3337000000|3337000100|101|
310|3342950000|3342959999|10000|
310|3362120960|3362120961|2|
311|3313460102|3313469999|9898|
311|3313470000|3313499999|30000|
311|3362120962|3362120963|2|
The output format is:
310|3337000000|3362120961|10103|
311|3313460102|3362120963|39900|
106|3363120000|3368579999|30000|