Examples of Using Awk in Linux

Awk is a scripting language for processing data and generating reports. Awk programs need no compilation, and the language provides variables, numeric functions, string functions, and logical operators.

Awk lets programmers write small but effective programs as a series of statements, each defining a text pattern to search for in every line of the input and the action to take when a line matches. Its main use is pattern scanning and processing: it reads one or more files, checks each line against the specified patterns, and performs the associated actions on matching lines.

Awk is named after its developers: Alfred Aho, Peter Weinberger, and Brian Kernighan.

Awk Syntax #

awk options 'pattern {action}' input > output
Option   Description
-F       Custom field delimiter
-f       Read the awk program from a file
'{}'     Action to run on matched content
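
As a minimal sketch of the two options in the table, the snippet below uses -F to split colon-separated input and -f to load the same program from a file. The sample records and the file name first_field.awk are made up for illustration and are not part of the article's data.

```shell
# Hypothetical colon-separated records (not from the article's info.txt).
records='alice:1001:/home/alice
bob:1002:/home/bob'

# -F sets the input field separator, so $1 is the text before the first colon.
inline_out=$(printf '%s\n' "$records" | awk -F: '{print $1}')

# -f reads the awk program from a file instead of the command line.
workdir=$(mktemp -d)
printf '%s\n' '{print $1}' > "$workdir/first_field.awk"
file_out=$(printf '%s\n' "$records" | awk -F: -f "$workdir/first_field.awk")
rm -rf "$workdir"

printf '%s\n' "$inline_out"
printf '%s\n' "$file_out"
```

Both commands print alice and bob, one per line; keeping the program in a file is mainly useful once it grows beyond a one-liner.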

What Can Awk Do? #

  1. Awk operation flow

    • Scan files line by line
    • Split each input line into multiple fields
    • Compare input lines/fields with patterns
    • Perform actions on matched lines

  2. Uses of Awk

    • Transform data files
    • Generate formatted reports

  3. Awk programming structure

    • Set output line format
    • Arithmetic and string operations
    • Conditional statements and loops
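
The operation flow above can be sketched with one small program that has a BEGIN block, a pattern/action pair, and an END block. The three records are copied from the article's info.txt; the header text and the 40000 threshold are arbitrary choices for illustration.

```shell
# Three records copied from the article's info.txt.
records='朱八 皇帝 客户 45000
阿三 异族 客户 25000
赵大 皇帝 售出 50000'

report=$(printf '%s\n' "$records" | awk '
BEGIN       { print "name strength" }    # runs once, before any input is read
$4 >= 40000 { print $1, $4; n++ }        # runs for each record matching the pattern
END         { print "matched:", n }      # runs once, after the last record
')
printf '%s\n' "$report"
```

The pattern compares the fourth field numerically, the action prints two fields and counts the match, and END reports the count after all input has been scanned.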

    Examples of Awk Commands #

    The following is the content of the input file:

    $ cat info.txt
    朱八 皇帝 客户 45000
    阿三 异族 客户 25000
    赵大 皇帝 售出 50000
    李二凤 皇帝 客户 47000
    增阿牛 侠客 售出 15000
    张三 盲流 售出 23000
    三毛 狗 售出 13000
    二狗 猫 购入  80000
    

    Print All Lines (Default Action) #

    By default, Awk prints every line of data in the specified file.

    $ awk '{print}' info.txt
    
    ## Output
    
    朱八 皇帝 客户 45000
    阿三 异族 客户 25000
    赵大 皇帝 售出 50000
    李二凤 皇帝 客户 47000
    增阿牛 侠客 售出 15000
    张三 盲流 售出 23000
    三毛 狗 售出 13000
    二狗 猫 购入  80000
    

    In the above example, no pattern is specified, so the print action applies to every line. With no arguments, print outputs the entire current line, so the command prints every line of the file.

    Keyword Search Lines #

    $ awk '/客户/ {print}' info.txt
    
    ## Output
    
    朱八 皇帝 客户 45000
    阿三 异族 客户 25000
    李二凤 皇帝 客户 47000
    

    In the above example, the awk command prints every line that contains 客户 ("customer").
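
A bare /regex/ pattern is tested against the whole line. As a small sketch, the match can be restricted to a single field with the ~ operator, or negated with !~. The three records are copied from info.txt.

```shell
# Three records copied from the article's info.txt.
records='朱八 皇帝 客户 45000
阿三 异族 客户 25000
赵大 皇帝 售出 50000'

# $3 ~ /regex/ matches only when the third field contains the pattern.
customers=$(printf '%s\n' "$records" | awk '$3 ~ /客户/ {print $1}')

# $3 !~ /regex/ selects the records whose third field does not match.
others=$(printf '%s\n' "$records" | awk '$3 !~ /客户/ {print $1}')

printf '%s\n' "$customers"
printf '%s\n' "$others"
```

Testing one field avoids accidental hits when the same text could also appear in another column.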

    Print Specific Columns #

    For each record (i.e., line), awk by default splits the text into fields separated by whitespace and stores them in the $n variables. If a line has 4 words, they are stored in $1, $2, $3, and $4. Additionally, $0 represents the entire line.

    $ awk '{print $1,$4}' info.txt
    
    ## Output
    
    朱八 45000
    阿三 25000
    赵大 50000
    李二凤 47000
    增阿牛 15000
    张三 23000
    三毛 13000
    二狗 80000
    

    In the above example, $1 and $4 represent the name and strength fields, respectively.

    Built-in Variables of Awk #

    Awk's built-in variables include the field variables $1, $2, $3, and so on ($0 is the entire line), which hold the individual words or segments, called fields, that a line of text is split into. The other commonly used built-in variables are:

    • NR: the number of input records read so far. Records are usually lines, and awk runs each pattern/action statement once per record, so NR is normally the current line number.

    • NF: the number of fields in the current input record (line).

    • FS: the input field separator. The default is whitespace, meaning runs of spaces and tab characters. FS can be reassigned (usually in BEGIN) to change how lines are split.

    • RS: the input record separator. Since records are lines by default, the default is a newline.

    • OFS: the output field separator. The default is a single space. When print is given several arguments separated by commas, it prints the value of OFS between them.

    • ORS: the output record separator. The default is a newline. print appends the value of ORS after everything it outputs.
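
As a minimal sketch of how these variables interact, the snippet below sets FS and OFS in a BEGIN block and prints NR, NF, and the first field. The comma-separated input is hypothetical and is used only because it makes the effect of FS visible.

```shell
# Hypothetical comma-separated records, used only to show FS and OFS.
out=$(printf '%s\n' 'a,b,c' 'd,e,f' |
  awk 'BEGIN { FS = ","; OFS = "-" } { print NR, NF, $1 }')
printf '%s\n' "$out"
```

Each output line joins the record number, the field count, and the first field with the OFS value, giving 1-3-a and 2-3-d.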

    Example of Using NR (Display Line Numbers) #

    $ awk '{print NR,$0}' info.txt
    ## Output
    1 朱八 皇帝 客户 45000
    2 阿三 异族 客户 25000
    3 赵大 皇帝 售出 50000
    4 李二凤 皇帝 客户 47000
    5 增阿牛 侠客 售出 15000
    6 张三 盲流 售出 23000
    7 三毛 狗 售出 13000
    8 二狗 猫 购入  80000
    

    In the above example, the awk command with NR prints all lines along with their line numbers.

    Example of Using NF (Display Last Field) #

    $ awk '{print $1,$NF}' info.txt
    ## Output
    朱八 45000
    阿三 25000
    赵大 50000
    李二凤 47000
    增阿牛 15000
    张三 23000
    三毛 13000
    二狗 80000
    

    In the above example, $1 is the name and $NF is the strength. NF holds the number of fields, so $NF always refers to the last field. Because every line in info.txt has four fields, this command gives the same output as awk '{print $1,$4}' info.txt.
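
The difference between $NF and a fixed field number only shows up when lines have different field counts. In this sketch the first record comes from info.txt, while the second is a hypothetical three-field variant added for contrast.

```shell
# First record is from info.txt; the second is a made-up three-field line.
out=$(printf '%s\n' '朱八 皇帝 客户 45000' '二狗 猫 80000' |
  awk '{ print $1, $NF, "[" $4 "]" }')
printf '%s\n' "$out"
```

On the three-field line $NF still yields 80000, while $4 is empty and prints as [].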

    Another Example of NR (Display Lines 3 to 6) #

    $ awk 'NR==3, NR==6 {print NR,$0}' info.txt
    ## Output
    3 赵大 皇帝 售出 50000
    4 李二凤 皇帝 客户 47000
    5 增阿牛 侠客 售出 15000
    6 张三 盲流 售出 23000
    

    NR==3, NR==6 is a range pattern: it selects records starting at line 3 (NR==3) and ending at line 6 (NR==6), inclusive. {print NR,$0} is the action performed on each selected line, printing the current line number (NR) followed by the entire line ($0).
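
For a fixed span of line numbers, the same selection can also be written as a single condition. The sketch below compares the two forms on eight placeholder lines (row1 to row8), which are made up for illustration.

```shell
# Eight placeholder lines: row1 ... row8.
lines=$(printf 'row%s\n' 1 2 3 4 5 6 7 8)

# Range pattern: starts at the first match of NR==3, ends at NR==6.
range_out=$(printf '%s\n' "$lines" | awk 'NR==3, NR==6 {print NR, $0}')

# Equivalent single condition on the record number.
cond_out=$(printf '%s\n' "$lines" | awk 'NR>=3 && NR<=6 {print NR, $0}')

printf '%s\n' "$range_out"
```

Both forms print lines 3 to 6 here. The range form is more general, because its start and end can be any patterns, such as regular expressions.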

    More Awk Usage Examples #

    Print the Line Number and First Field Joined by - #

    $ awk '{print NR "-" $1}' info.txt
    1-朱八
    2-阿三
    3-赵大
    4-李二凤
    5-增阿牛
    6-张三
    7-三毛
    8-二狗
    

    Output Second Item (Column) Content #

    $ awk '{print $2}' info.txt
    皇帝
    异族
    皇帝
    皇帝
    侠客
    盲流
    狗
    猫
    

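    Print Non-Empty Lines, or the Line Numbers of Empty Lines (If Any Exist) #

    awk 'NF > 0' info.txt               # print every non-empty line
    awk 'NF == 0 {print NR}' info.txt   # print the line number of each empty line
    awk 'NF <= 0 {print NR}' info.txt   # same result, since NF is never negative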
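
Since info.txt contains no empty lines, the effect of NF is easier to see on other input. This sketch uses a hypothetical five-line input with one empty line and one line holding only spaces.

```shell
# Hypothetical input: line 2 is empty, line 4 contains only spaces.
input=$(printf '%s\n' 'first' '' 'third' '   ' 'fifth')

# NF is 0 for empty and whitespace-only lines under the default FS.
empty_line_numbers=$(printf '%s\n' "$input" | awk 'NF == 0 {print NR}')

# NF > 0 keeps only the lines that contain at least one field.
non_empty=$(printf '%s\n' "$input" | awk 'NF > 0')

printf '%s\n' "$empty_line_numbers"
printf '%s\n' "$non_empty"
```

The first command reports lines 2 and 4; the second prints first, third, and fifth.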
    

    Find the Length of the Longest Line in the File #

    $ awk '{ if (length($0) > max) max = length($0) } END { print max }' info.txt
    15
    

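    length($0) calls the built-in function length to get the length of the current line. The result of 15 is a character count, which assumes a multibyte-aware awk (such as gawk) running in a UTF-8 locale; an implementation that counts bytes reports a larger number for this file.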
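
A small variation, sketched here, remembers the longest line itself along with its length. The three ASCII input lines are hypothetical and were chosen so that the result does not depend on the locale.

```shell
# Hypothetical ASCII input, so character and byte counts agree.
out=$(printf '%s\n' 'short' 'the longest line' 'medium one' |
  awk 'length($0) > max { max = length($0); longest = $0 }
       END { print max ": " longest }')
printf '%s\n' "$out"
```

The condition is written as a pattern rather than an if statement; the variables max and longest start out empty, which awk treats as 0 in the numeric comparison.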

    Count the Number of Lines in the File #

    $ awk 'END { print NR }' info.txt
    8
    

    Print Lines with More Than 14 Characters #

    $ awk 'length($0) > 14' info.txt
    李二凤 皇帝 客户 47000
    增阿牛 侠客 售出 15000
    二狗 猫 购入  80000
    

    Filter Specific Data by Condition #

    $ awk '{ if($2 == "侠客") print $0;}' info.txt
    增阿牛 侠客 售出 15000
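
The if statement inside the action can also be written as a pattern in front of it, which is the more common awk idiom. The sketch below shows a string comparison and a numeric comparison in that form, using three records copied from info.txt; the 40000 threshold is arbitrary.

```shell
# Three records copied from the article's info.txt.
records='朱八 皇帝 客户 45000
增阿牛 侠客 售出 15000
赵大 皇帝 售出 50000'

# A pattern with no action prints the whole matching line.
by_title=$(printf '%s\n' "$records" | awk '$2 == "侠客"')

# Numeric comparison on the fourth field, printing selected fields.
above=$(printf '%s\n' "$records" | awk '$4 > 40000 {print $1, $4}')

printf '%s\n' "$by_title"
printf '%s\n' "$above"
```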
    

    Print Squares of Numbers from 1 to 9 #

    $ awk 'BEGIN { for(i=1;i<=9;i++) print i,"squared is",i*i; }'
    1 squared is 1
    2 squared is 4
    3 squared is 9
    4 squared is 16
    5 squared is 25
    6 squared is 36
    7 squared is 49
    8 squared is 64
    9 squared is 81
    

    Conclusion #

    Awk is a simple yet extremely useful utility for working with text files, logs, and command-line output. Whether you are a beginner or an experienced system administrator, awk can help you search, filter, and format data quickly.

    With awk, you do not have to write lengthy scripts. A single line of code can total a payroll column, filter log entries, or produce a quick report. Awk matches patterns, splits each line into fields, and lets you print, count, calculate, and format the results.
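
As one illustration of counting, calculating, and formatting in a single line, the sketch below totals and averages the fourth column of the article's info.txt data and formats the result with printf. The label names in the output are arbitrary.

```shell
# All eight records from the article's info.txt.
records='朱八 皇帝 客户 45000
阿三 异族 客户 25000
赵大 皇帝 售出 50000
李二凤 皇帝 客户 47000
增阿牛 侠客 售出 15000
张三 盲流 售出 23000
三毛 狗 售出 13000
二狗 猫 购入  80000'

# Accumulate the fourth field per record, then report once at the end.
summary=$(printf '%s\n' "$records" |
  awk '{ total += $4 }
       END { printf "records=%d total=%d average=%.1f\n", NR, total, total / NR }')
printf '%s\n' "$summary"
```

In the END block NR still holds the number of records read, so it doubles as the record count for the average.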

    Awk saves time and reduces manual errors when processing text on Linux.

    Original article link https://awkgrepsed.com/docs/awk/awk_usage_linux/
