• 2009-08-17

    The comm command - [SHELL]

    Copyright notice: when reposting, please use a hyperlink to credit the original source and author, and include this notice
    http://michaels.blogbus.com/logs/44427299.html

    In our work, we often run into the following questions. Given two files, file1 and file2:
    1) How can I print the lines that appear only in file1?
    2) How can I print the lines that appear only in file2?
    3) How can I print the lines that appear in both file1 and file2?

    One shell command handles all three cases easily: comm.

    When you face the questions above, "comm" should be your first choice :-)

    comm [ -123 ] file1 file2

    comm reads file1 and file2 and produces three columns of output:
    lines only in file1; lines only in file2; and lines in both files.
    The options -1, -2, and -3 each suppress the corresponding column.
    For a detailed explanation, see man comm.

    Example:

    bash-2.03$ cat file1
    11111111
    22222222
    33333333
    44444444
    55555555
    66666666
    77777777
    88888888
    99999999
    bash-2.03$ cat file2
    00000000
    22222222
    44444444
    66666666
    88888888
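
    Before looking at the options, it may help to see the default output. The following is a minimal sketch that recreates the two sample files in a temporary directory and runs comm with no options. The directory and the variable names (dir, all_columns) are assumptions for illustration and are not part of the original session.

    ```shell
    # Recreate the two sample files in a scratch directory (hypothetical location).
    dir=$(mktemp -d)
    printf '%s\n' 11111111 22222222 33333333 44444444 55555555 \
        66666666 77777777 88888888 99999999 > "$dir/file1"
    printf '%s\n' 00000000 22222222 44444444 66666666 88888888 > "$dir/file2"

    # With no options, comm prints all three columns:
    #   column 1 (file1 only) has no leading tab,
    #   column 2 (file2 only) has one leading tab,
    #   column 3 (both files) has two leading tabs.
    all_columns=$(comm "$dir/file1" "$dir/file2")
    printf '%s\n' "$all_columns"
    ```

    In this output, 00000000 carries one leading tab, the even-numbered lines shared by both files carry two, and the lines found only in file1 carry none.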

    1) Suppress the lines unique to file1 (column 1)
    bash-2.03$ comm -1 file1 file2
    00000000
            22222222
            44444444
            66666666
            88888888

    2) Suppress the lines unique to file2 (column 2)
    bash-2.03$ comm -2 file1 file2
    11111111
            22222222
    33333333
            44444444
    55555555
            66666666
    77777777
            88888888
    99999999

    3) Suppress the lines that appear in both files (column 3)
    bash-2.03$ comm -3 file1 file2
            00000000
    11111111
    33333333
    55555555
    77777777
    99999999

    4) Print the lines that appear only in file1
    bash-2.03$ comm -23 file1 file2
    11111111
    33333333
    55555555
    77777777
    99999999

    5) Print the lines that appear only in file2
    bash-2.03$ comm -13 file1 file2
    00000000

    6) Print the lines that appear in both file1 and file2
    bash-2.03$ comm -12 file1 file2
    22222222
    44444444
    66666666
    88888888
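
    One caveat deserves a quick sketch: comm expects both inputs to be sorted, which the sample files above already are. Assuming two hypothetical unsorted files named unsorted1 and unsorted2 (invented here for illustration), sorting them into temporary copies first gives the expected result.

    ```shell
    # Build two small unsorted files in a scratch directory (hypothetical names).
    dir=$(mktemp -d)
    printf '%s\n' 33333333 11111111 22222222 > "$dir/unsorted1"
    printf '%s\n' 22222222 00000000 33333333 > "$dir/unsorted2"

    # Sort each file into a temporary copy, because comm relies on sorted input.
    # LC_ALL=C keeps sort and comm in agreement on the collation order.
    LC_ALL=C sort "$dir/unsorted1" > "$dir/sorted1"
    LC_ALL=C sort "$dir/unsorted2" > "$dir/sorted2"

    # Print only the lines present in both files.
    common=$(LC_ALL=C comm -12 "$dir/sorted1" "$dir/sorted2")
    printf '%s\n' "$common"
    ```

    Here the two lines shared by both files, 22222222 and 33333333, are printed.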

    Besides comm, there are other ways to accomplish these tasks.

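    4) Print the lines that appear only in file1, without comm:
    # Option A: keep the lines that diff marks as coming from file1
    diff file1 file2 | grep '^<' | sed 's/^< //'
    # Option B: print each file1 line that has no exact whole-line match in file2
    while IFS= read -r line; do grep -qxF -- "$line" file2 || printf '%s\n' "$line"; done < file1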
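
    Another option that may be worth knowing is grep with a pattern file, which does not need sorted input. The sketch below assumes a grep that supports the POSIX options -F, -x, -v, and -f, and it uses small hypothetical scratch files rather than the full samples above. It prints the file1 lines that have no exact match in file2, in file1's original order.

    ```shell
    # Small illustrative files in a scratch directory (hypothetical contents).
    dir=$(mktemp -d)
    printf '%s\n' 11111111 22222222 33333333 44444444 > "$dir/file1"
    printf '%s\n' 00000000 22222222 44444444 > "$dir/file2"

    # -F: treat patterns as fixed strings    -x: match whole lines only
    # -v: invert the match                   -f: read the patterns from file2
    only_in_file1=$(grep -Fxv -f "$dir/file2" "$dir/file1")
    printf '%s\n' "$only_in_file1"
    ```

    With these inputs the result is 11111111 and 33333333.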

    By comparison, comm is much easier to remember. :-)

    Reposted from: http://hi.baidu.com/will_hu/blog/item/4b05fedf0276fd5fcdbf1a6d.html

