Sorting and unique values
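The sort command sorts the lines of a text file or of stdin, and is usually combined with other commands in a pipeline. uniq is a command that is often used together with sort: it reads a text file or stdin and outputs the unique lines. Because uniq only compares adjacent lines, its input must already be sorted.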
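The reason uniq needs sorted input can be seen with a small sketch. The three-line input here is made-up sample data, not the dept file:

```shell
# Illustrative input: "b" appears twice, but not on adjacent lines.
# Without sort, uniq sees no adjacent duplicates and removes nothing.
unsorted_result=$(printf 'b\na\nb\n' | uniq)

# Sorting first puts the two "b" lines next to each other,
# so uniq can collapse them into one.
sorted_result=$(printf 'b\na\nb\n' | sort | uniq)

printf 'uniq alone:\n%s\n' "$unsorted_result"
printf 'sort | uniq:\n%s\n' "$sorted_result"
```

The first result still has three lines (b, a, b); the second has two (a, b).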
Sort numerically:
# sort -n dept
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
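The difference between the default (character-by-character) ordering and -n is easiest to see when the numbers have different widths. A minimal sketch with made-up input; LC_ALL=C is set only to make the default ordering predictable:

```shell
# Illustrative input: three numbers of different widths.
# Default sort compares characters, so "100" comes before "9".
lexical=$(printf '10\n9\n100\n' | LC_ALL=C sort)

# -n compares the numeric values instead.
numeric=$(printf '10\n9\n100\n' | LC_ALL=C sort -n)

printf 'sort:\n%s\n' "$lexical"      # 10, 100, 9
printf 'sort -n:\n%s\n' "$numeric"   # 9, 10, 100
```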
Sort in reverse order:
# sort -nr dept
40 OPERATIONS BOSTON
30 SALES CHICAGO
20 RESEARCH DALLAS
10 ACCOUNTING NEW YORK
Find the unique lines in the sorted file:
# sort -nr dept | uniq
40 OPERATIONS BOSTON
30 SALES CHICAGO
20 RESEARCH DALLAS
10 ACCOUNTING NEW YORK
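When whole lines are compared, sort -u gives the same result as sort | uniq in a single step. A short sketch on made-up input (the fruit names are sample data only):

```shell
# Illustrative input with repeated lines.
piped=$(printf 'pear\napple\npear\nfig\napple\n' | sort | uniq)

# -u tells sort to output only the first of each run of equal lines.
merged=$(printf 'pear\napple\npear\nfig\napple\n' | sort -u)

printf '%s\n' "$merged"   # apple, fig, pear
```

The pipeline form is still needed for the uniq options shown later (-u, -d, -c), which sort -u cannot replace.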
Use the -k option to specify the column (field) to sort on:
# sort -nrk 1 dept
40 OPERATIONS BOSTON
40 OPERATIONS BOSTON
30 SALES CHICAGO
30 SALES CHICAGO
20 RESEARCH DALLAS
20 RESEARCH DALLAS
10 ACCOUNTING NEW YORK
10 ACCOUNTING NEW YORK
# sort -rk 2 dept
30 SALES CHICAGO
30 SALES CHICAGO
20 RESEARCH DALLAS
20 RESEARCH DALLAS
40 OPERATIONS BOSTON
40 OPERATIONS BOSTON
10 ACCOUNTING NEW YORK
10 ACCOUNTING NEW YORK
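A key given as -k 2 runs from field 2 to the end of the line; writing -k2,2 limits it to that one field, and several -k options can be chained. The sketch below goes beyond the commands above: the colon-separated rows and the -t option (which sets the field separator) are assumptions chosen for illustration, not the contents of dept.

```shell
# Hypothetical colon-separated rows: number:name:city
rows=$(printf '20:RESEARCH:DALLAS\n10:SALES:DALLAS\n30:SALES:CHICAGO\n')

# Sort by city (field 3) ascending, then by number (field 1)
# numerically in descending order within the same city.
by_city=$(printf '%s\n' "$rows" | LC_ALL=C sort -t: -k3,3 -k1,1nr)

printf '%s\n' "$by_city"
# 30:SALES:CHICAGO
# 20:RESEARCH:DALLAS
# 10:SALES:DALLAS
```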
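Use -b to ignore leading blank characters (spaces and tabs, not blank lines) and -d to sort in dictionary order, which considers only blanks and alphanumeric characters: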
# sort -bd dept
10 ACCOUNTING NEW YORK
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
20 RESEARCH DALLAS
30 SALES CHICAGO
30 SALES CHICAGO
40 OPERATIONS BOSTON
40 OPERATIONS BOSTON
Show only the lines that are not repeated:
# sort dept | uniq -u
Find the lines that are repeated in the file:
# sort dept | uniq -d
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
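The two options are complementary: -u keeps the lines that occur exactly once, -d keeps one copy of each line that occurs more than once. A small sketch on made-up input:

```shell
# Illustrative input: "a" once, "b" twice, "c" three times.
sample=$(printf 'a\nb\nb\nc\nc\nc\n')

only_once=$(printf '%s\n' "$sample" | sort | uniq -u)
repeated=$(printf '%s\n' "$sample" | sort | uniq -d)

printf 'uniq -u:\n%s\n' "$only_once"   # a
printf 'uniq -d:\n%s\n' "$repeated"    # b, c
```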
Count how many times each line occurs in the file:
# sort dept | uniq -c
1
2 10 ACCOUNTING NEW YORK
2 20 RESEARCH DALLAS
2 30 SALES CHICAGO
2 40 OPERATIONS BOSTON
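A common follow-up is to feed the counts back into sort -nr to rank lines by frequency. This is a sketch on made-up input; the awk step is only there to strip the padding that uniq -c puts in front of each count, so the result is easy to compare.

```shell
# Illustrative input: "c" three times, "a" twice, "b" once.
ranked=$(printf 'c\na\nc\nb\nc\na\n' \
  | sort \
  | uniq -c \
  | sort -nr \
  | awk '{print $1, $2}')

printf '%s\n' "$ranked"
# 3 c
# 2 a
# 1 b
```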
An example that combines sort and uniq:
# cat dept | tr "\t" ":" | sort | uniq -s 2 -w 2
10:ACCOUNTING:NEW YORK
20:RESEARCH:DALLAS
30:SALES:CHICAGO
40:OPERATIONS:BOSTON
-s specifies how many leading characters to skip before comparing
-w specifies the maximum number of characters to compare
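The effect of each option is easier to see in isolation. A minimal sketch on made-up input; note that -w is provided by GNU uniq and is not part of POSIX, so it may be missing on other systems.

```shell
# -s 3 skips the "NN:" prefix, so only the letter is compared:
# "10:A" and "20:A" count as duplicates, and the first one is kept.
skip_prefix=$(printf '10:A\n20:A\n30:B\n' | uniq -s 3)

# -w 2 compares only the first two characters:
# "10:A" and "10:B" count as duplicates, and the first one is kept.
first_two=$(printf '10:A\n10:B\n20:C\n' | uniq -w 2)

printf 'uniq -s 3:\n%s\n' "$skip_prefix"   # 10:A, 30:B
printf 'uniq -w 2:\n%s\n' "$first_two"     # 10:A, 20:C
```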