Learning and Practicing
This article covers the main text-processing commands used in shell programming: grep, sed, and awk.
- grep:
  - <span>grep is a text search tool that finds lines in files matching one or more regular expressions.</span>
  - <span>It is fast, flexible, and the standard tool for text searching.</span>
  - <span>grep supports options such as case-insensitive search, recursive directory search, and extended regular expressions.</span>
  - <span>It is commonly used to quickly find lines in files that contain specific text.</span>
- sed:
  - <span>sed</span> (Stream Editor) filters and transforms text data.
  - It applies editing commands to the lines selected by an address or pattern: inserting, deleting, replacing, and transforming text.
  - <span>sed</span> scripts can be very concise, which suits simple text replacements and pattern matching.
  - It is commonly used for automating text editing tasks.
- awk:
  - <span>awk</span> is a powerful text processing tool, especially suited to structured data such as tables.
  - It reads files and performs pattern scanning and processing on the data.
  - <span>awk</span> supports conditional statements, loops, arrays, and other programming constructs.
  - It is commonly used for data extraction, sorting, and calculations.
<span><span>Introduction to the grep Command</span></span>
<span>grep</span> is a powerful text search tool on Linux systems that searches files for a specified string or pattern and prints the matching lines. The name <span>grep</span> stands for “Global Regular Expression Print”: it checks every line against a regular expression and prints the lines that match.
Syntax Structure
grep [OPTIONS] PATTERN [FILE...]
- OPTIONS: Optional command-line options.
- PATTERN: The pattern to search for, which can be a simple string or a complex regular expression.
- FILE: The file to search, or a directory when combined with -r. If no file is specified, grep reads standard input.
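As a small sketch of the standard-input default described above, the pipeline below passes text to grep without naming a file. The sample text and the variable name `matches` are made up for illustration.

```shell
# Hypothetical sample input. No FILE argument is given, so grep reads stdin.
matches=$(printf 'hello world\ngoodbye\nhello again\n' | grep "hello")

# Print the lines that matched.
printf '%s\n' "$matches"
```

This is the form grep usually takes in the middle of a pipeline, where the previous command supplies the text to filter.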
Common Options
- <span>-i</span>: Ignore case, making the search case-insensitive.
- <span>-v</span>: Invert the match and display the lines that do not match the pattern.
- <span>-n</span>: Prefix each matching line with its line number.
- <span>-c</span>: Display only the number of matching lines, not their content.
- <span>-r</span> or <span>-R</span>: Recursively search all files under the specified directory. In GNU grep, <span>-R</span> also follows all symbolic links.
- <span>-e</span>: Specify a pattern. Repeat the option to search for several patterns at once.
- <span>-l</span>: Display only the names of files that contain a match.
- <span>--color=auto</span>: Highlight matching text in color when the output goes to a terminal.
Examples
- Search for a string in a single file:
  grep "hello" file.txt
- Search for a string in multiple files:
  grep "hello" file1.txt file2.txt
- Search for multiple patterns simultaneously:
  grep -e "hello" -e "world" file.txt
- Display matching line numbers:
  grep -n "hello" file.txt
- Display non-matching lines:
  grep -v "hello" file.txt
- Recursively search for a string in a directory:
  grep -r "hello" directory/
- Only display the names of files containing matching text (-r is needed to search inside a directory):
  grep -rl "hello" directory/
- Count the number of matching lines:
  grep -c "hello" file.txt
- Search with an extended regular expression (replace "regexp" with the actual pattern; without -E, grep treats the pattern as a basic regular expression):
  grep -E "regexp" file.txt
- Highlight matching text:
  grep --color=auto "hello" file.txt
The <span>grep</span> command is flexible: by combining options and regular expressions, it can handle complex text search tasks.
Mastering <span>grep</span> is very helpful for processing and analyzing text data.
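To illustrate how the options above combine, here is a minimal sketch that runs several of them against a throwaway file. The file name `app.log`, its contents, and the variable names are hypothetical and exist only for this example.

```shell
# Create a throwaway sample file (hypothetical content) in a temp directory.
tmpdir=$(mktemp -d)
log="$tmpdir/app.log"
printf '%s\n' \
  'INFO  service started' \
  'error: disk almost full' \
  'ERROR: connection lost' \
  'INFO  retry 3 succeeded' > "$log"

# -i ignores case, -c counts matching lines instead of printing them.
error_count=$(grep -ic "error" "$log")

# -n prefixes each match with its line number; -E enables extended regex,
# here "retry" followed by one or more digits.
retry_lines=$(grep -nE "retry [0-9]+" "$log")

# -v inverts the match: keep only the lines that do NOT mention "info".
non_info=$(grep -iv "info" "$log")

echo "error lines: $error_count"
echo "retry match: $retry_lines"
echo "non-INFO lines:"
echo "$non_info"

# Clean up the temporary directory.
rm -r "$tmpdir"
```

Short options can be bundled (`-ic`, `-nE`), which keeps one-off searches compact.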
<span><span>Introduction to the sed Command</span></span>
<span>sed</span> (Stream Editor) performs basic text transformations on a stream of input. It reads text line by line from files or standard input, applies the specified commands to each line, and writes the result to standard output (or back to the file when in-place editing is requested).
Syntax Structure
sed [OPTIONS] 'COMMANDS' [FILE...]
- OPTIONS: Optional command-line options.
- COMMANDS: The sequence of <span>sed</span> commands to execute, usually enclosed in single quotes.
- FILE: The file or list of files to process. If not specified, <span>sed</span> reads from standard input.
Common Commands
- <span>s</span>: Substitute text that matches a pattern.
- <span>d</span>: Delete the selected lines.
- <span>i</span>: Insert text before the selected line.
- <span>c</span>: Replace (change) the whole selected line.
- <span>y</span>: Transliterate characters one for one.
- <span>p</span>: Print the line.
- <span>a</span>: Append text after the selected line.
- <span>q</span>: Quit <span>sed</span> without processing further input.
Options
- <span>-i</span>: Edit the file in place instead of writing to standard output.
- <span>-n</span>: Suppress automatic printing, so only lines printed explicitly (for example with <span>p</span>) appear in the output.
- <span>-e</span>: Add a <span>sed</span> command to the script. Repeat the option to run several commands.
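The effect of `-n` is easiest to see side by side. The sketch below runs the same `p` command with and without it; the three-line input and the variable names are invented for the example.

```shell
# Hypothetical three-line input used for both runs.
input=$(printf 'alpha\nbeta\ngamma\n')

# Without -n, sed prints every line automatically, and 'p' prints the
# matching line a second time.
without_n=$(printf '%s\n' "$input" | sed '/beta/p')

# With -n, automatic printing is off, so only the explicit 'p' produces output.
with_n=$(printf '%s\n' "$input" | sed -n '/beta/p')

printf 'without -n:\n%s\n' "$without_n"
printf 'with -n:\n%s\n' "$with_n"
```

The first run shows "beta" twice among all three lines, while the second run shows only the matching line.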
Examples
- Replace the first occurrence of a string on each line of a file:
  sed 's/hello/world/' file.txt
- Delete a specified line in a file:
  sed '3d' file.txt
- Insert a string before a specified line in a file (the one-line form shown here is a GNU sed extension):
  sed '2i\hello world' file.txt
- Replace the content of a specified line in a file:
  sed '3c\hello world' file.txt
- Append text after a specified line in a file:
  sed '2a\This is a new line added after line 2' file.txt
- Print matching lines in a file:
  sed -n '/world/p' file.txt
- Character transformation (lowercase to uppercase):
  sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/' file.txt
- Use regular expressions for replacement:
  sed 's/[0-9]/& /g' file.txt # Add a space after every digit
- Use pattern matching for deletion:
  sed '/^$/d' file.txt # Delete empty lines
- Modify the file directly:
  sed -i 's/old/new/' file.txt
- Execute multiple commands:
  sed -e 's/old/new/' -e 's/yes/no/' file.txt
- Print non-matching lines:
  sed -n '/old/!p' file.txt
<span>sed</span> handles simple replacements and deletions in a single command, and longer scripts can carry out more advanced text processing.
Mastering <span>sed</span> is very useful for automating text editing and data cleaning.
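As a rough sketch of the data-cleaning use mentioned above, the following combines deletion and substitution commands, previews the result, and then edits the file in place. The file `settings.conf` and its contents are hypothetical. The `-i.bak` form (suffix attached to the option) is used on the assumption that a backup copy is wanted; it is accepted by both GNU sed and BSD/macOS sed.

```shell
# Build a small sample file (hypothetical content) in a temp directory.
tmpdir=$(mktemp -d)
conf="$tmpdir/settings.conf"
printf '%s\n' \
  '# sample settings' \
  '' \
  'mode=old' \
  'debug=yes' \
  '' \
  'name=demo' > "$conf"

# Preview: drop empty lines and comment lines, then apply two substitutions.
# Nothing is written back to the file at this point.
preview=$(sed -e '/^$/d' -e '/^#/d' -e 's/old/new/' -e 's/yes/no/' "$conf")
printf '%s\n' "$preview"

# Apply the same edits in place, keeping the original as settings.conf.bak.
sed -i.bak -e '/^$/d' -e '/^#/d' -e 's/old/new/' -e 's/yes/no/' "$conf"

# Read back the edited file and count the lines in the untouched backup.
result=$(cat "$conf")
backup_lines=$(wc -l < "$conf.bak" | tr -d ' ')

# Clean up the temporary directory.
rm -r "$tmpdir"
```

Previewing on standard output before switching to in-place editing is a cheap way to catch a wrong pattern before it changes the file.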
<span><span>Introduction to the awk Command</span></span>
<span>awk</span> is a powerful text processing tool that can format and filter text data as well as perform calculations on it. <span>awk</span> is named after its authors: Alfred V. Aho, Peter J. Weinberger, and Brian W. Kernighan.
Syntax Structure
awk [OPTIONS] 'PROGRAM' [FILE...]
- OPTIONS: Optional command-line options.
- PROGRAM: The <span>awk</span> program, containing patterns and actions to execute, usually enclosed in single quotes.
- FILE: The file or list of files to process. If not specified, <span>awk</span> reads from standard input.
Common Options
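- <span>-F</span>: Set the input field separator.
- <span>-v</span>: Assign a variable before the program starts, which is the usual way to pass external values to the <span>awk</span> program.
- <span>--field-separator</span>: Long form of <span>-F</span> in GNU awk (gawk).
- Record separator: there is no dedicated option for it. Set the built-in variable <span>RS</span> instead, for example with <span>-v RS=";"</span> or in a BEGIN block.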
Basic Elements
- PATTERN: A regular expression, a comparison or other expression, or empty (in which case every record matches).
- ACTION: The code block, in braces, executed when the pattern matches. If the action is omitted, the matching record is printed.
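The pattern and action pairing can be sketched with a few rules in one program. The input data and the names `scores`, `report`, and `passed` are made up for this illustration; `BEGIN` and `END` are the standard special patterns that run before the first record and after the last one.

```shell
# Hypothetical input: one name and one score per line.
scores=$(printf 'alice 82\nbob 47\ncarol 91\n')

# Each rule is "pattern { action }". Every rule is tested against every record.
report=$(printf '%s\n' "$scores" | awk '
  BEGIN    { print "passing scores:" }      # runs before any input is read
  $2 >= 50 { print $1; passed++ }           # expression pattern on field 2
  /^bob/   { print "(skipping " $1 ")" }    # regex pattern on the whole record
  END      { print passed " passed" }       # runs after the last record
')
printf '%s\n' "$report"
```

Rules are evaluated in the order they are written, so a record that satisfies several patterns triggers each matching action in turn.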
Examples
- Print all lines in a file:
  awk '{print}' file.txt
- Print the content of the second column in a file:
  awk '{print $2}' file.txt
- Calculate the sum of the first column in a file:
  awk '{sum += $1} END {print sum}' file.txt
- Print lines containing a specified string in a file:
  awk '/hello/ {print}' file.txt
- Append a string to the third column of every line:
  awk '{$3 = $3 "hello"} {print}' file.txt
- Use field separators:
  awk -F":" '{print $1}' /etc/passwd
- Conditional statements:
  awk '{if ($1 > 100) print $0}' file.txt
- Loop structures (print every field on its own line):
  awk '{for (i = 1; i <= NF; i++) print $i}' file.txt
- Formatted output (two left-aligned columns):
  awk '{printf "%-15s %-10s\n", $1, $2}' file.txt
- Use external variables:
  awk -v var="value" '{print var}' file.txt
- Range pattern (print every line from a match of pattern1 through the next match of pattern2):
  awk '/pattern1/,/pattern2/' file.txt
- Custom functions (the name func is avoided because gawk reserves it as a synonym for function):
  awk 'function greet() {print "This is a custom function"} {greet(); print}' file.txt
- Process multiple files:
  awk '{...}' file1.txt file2.txt
<span>awk</span> is powerful due to its built-in variables, control flow statements, and functions, allowing it to handle complex text and data.
By writing <span>awk</span> scripts, users can easily achieve data extraction, transformation, and report generation.
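A small report of the kind described above might look like the following sketch, which combines a field separator, an external variable, an associative array, and `printf`. The record layout (`user:department:hours`), the threshold, and all variable names are assumptions made for the example.

```shell
# Hypothetical colon-separated records: user:department:hours
data=$(printf 'ann:ops:12\nben:dev:30\ncy:ops:8\ndee:dev:5\n')

# -F sets the field separator, -v passes a shell value into awk as "min".
# An associative array accumulates hours per department; the END block
# prints one formatted line per department that reaches the minimum.
summary=$(printf '%s\n' "$data" | awk -F":" -v min=25 '
  { total[$2] += $3 }
  END {
    for (dept in total)
      if (total[dept] >= min)
        printf "%-6s %5d\n", dept, total[dept]
  }
')
printf '%s\n' "$summary"
```

Note that `for (dept in total)` visits array keys in an unspecified order, so output with several departments would typically be piped through `sort`.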
Other text processing commands include uniq, sort, tr, cut, and tee.
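These smaller tools are usually chained in a pipeline. The sketch below, using made-up `name,fruit` records, finds the most frequent value in a column with cut, tr, sort, and uniq; tee is not shown here.

```shell
# Hypothetical comma-separated input: name,fruit
rows=$(printf 'ann,Apple\nben,pear\ncy,apple\ndee,PEAR\neve,apple\n')

# cut picks the second field, tr lowercases it, sort groups equal values so
# that uniq -c can count them, and the final sort puts the largest count first.
top=$(printf '%s\n' "$rows" \
  | cut -d',' -f2 \
  | tr '[:upper:]' '[:lower:]' \
  | sort \
  | uniq -c \
  | sort -rn \
  | head -n 1)
printf '%s\n' "$top"
```

The `sort` before `uniq -c` matters because uniq only merges adjacent identical lines.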
Previous review:
Introduction to Linux Shell Programming (Part 1)
Upcoming planned updates
- Regular Expressions: Understand the basic concepts of regular expressions, as they are very useful in text processing.
- Debugging Techniques: Learn how to use <span>set -x</span> for debugging and how to resolve issues by reviewing error messages.
- Script Optimization: Learn how to write efficient and readable scripts.
- Security Best Practices: Learn how to write secure scripts to avoid common security vulnerabilities.
- Debugging and Testing Shell Scripts: Learn how to test and debug your scripts to ensure they work as expected.
- Maintenance of Shell Scripts: Understand how to maintain and update your scripts to adapt to changes in the environment.