1. Introduction to the grep Command
grep is a powerful command used for text searching in Linux/Unix systems. It searches for matching content in files or input streams based on specified patterns (regular expressions or strings) and outputs the lines containing the matches.
2. Basic Syntax of the grep Command
grep [options] pattern [file...]
pattern: a regular expression or string (basic regular expressions are supported by default; use -E to enable extended regular expressions).
file: one or more files to search. If no file is given, grep reads from standard input, which is usually supplied through a pipe.
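As a minimal sketch of the two input modes described above, the snippet below runs the same search once with a file argument and once on standard input. The sample lines and the temporary file are made up for illustration.

```shell
# Hypothetical sample data, written to a temporary file.
sample_file=$(mktemp)
printf 'boot ok\ndisk error\nnet ok\n' > "$sample_file"

# File argument: grep opens the file itself.
from_file=$(grep "error" "$sample_file")

# No file argument: grep reads whatever arrives on standard input.
from_pipe=$(cat "$sample_file" | grep "error")

printf 'file: %s\npipe: %s\n' "$from_file" "$from_pipe"
rm -f "$sample_file"
```

Both forms should report the same matching line; only the source of the input differs.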
3. Commonly Used Parameters and Examples of the grep Command
(1) -i, ignore case
Example: Search for lines containing the word error in the test.txt file, regardless of case
grep -i "error" test.txt
(2) -v, invert the match, only output lines that do not contain the pattern
Example: Search for lines in the test.txt file that do not contain success
grep -v "success" test.txt
(3) -n, display the line numbers of matching lines
Example: Search for lines containing error in the test.txt file and display the line numbers
grep -n "error" test.txt
(4) -r, recursively search all files in the specified directory and its subdirectories
Example: Search for lines containing error in files under /tmp and all its subdirectories
grep -r "error" /tmp
(5) -l (lowercase, distinct from uppercase L), only output the names of files that contain matches (not the specific lines)
Example: Return the names of files in the current directory that contain error
grep -l "error" ./*
Each result is printed as the path passed on the command line, i.e., ./ followed by the filename (for example, ./test.txt).
(6) -w, match whole words, e.g., searching for int will not match integer
Example: Search for lines containing the word int in the test.txt file
grep -w "int" test.txt
(7) Classic combination of find command and grep
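Example: Find all .txt files in the current directory and its subdirectories that contain the text pass test
find . -name "*.txt" -print0 | xargs -0 grep -l "pass test"
Here -print0 and -0 (available in GNU and BSD find/xargs) separate filenames with a NUL byte, so names containing spaces reach grep intact.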
(8) -E, enable extended regular expressions; this option is covered by the examples in the next section
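The options above can be tried together in a scratch directory. The following is a rough sketch, not a definitive recipe: the file names and their contents are invented for the demonstration, and the directory comes from mktemp so nothing real is touched.

```shell
# Throwaway directory with two hypothetical sample files.
demo_dir=$(mktemp -d)
cat > "$demo_dir/test.txt" <<'EOF'
Error: disk full
int count = 0
integer overflow
backup success
EOF
printf 'no problems here\n' > "$demo_dir/clean.txt"

# -i: "error" also matches "Error".
ci=$(grep -i "error" "$demo_dir/test.txt")
# -v: every line that does NOT contain "success".
inv=$(grep -v "success" "$demo_dir/test.txt")
# -n: prefix each matching line with its line number.
num=$(grep -n "int" "$demo_dir/test.txt")
# -w: "int" only as a whole word, so "integer" is skipped.
word=$(grep -w "int" "$demo_dir/test.txt")
# -l (combined with -i): names of files that contain a match.
names=$(grep -il "error" "$demo_dir"/*)
# -r (combined with -l): search the directory recursively.
rec=$(grep -rl "overflow" "$demo_dir")

printf '%s\n' "-i:" "$ci" "-v:" "$inv" "-n:" "$num" "-w:" "$word"
printf '%s\n' "-l:" "$names" "-r:" "$rec"
rm -rf "$demo_dir"
```

Short options can be merged (as in -il and -rl here), which is how they usually appear in practice.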
4. Differences Between Basic and Extended Regular Expressions in grep
The main difference between basic and extended regular expressions lies in the handling of certain metacharacters: in basic regular expressions, some metacharacters need to be escaped to take effect, while in extended regular expressions, they can be used directly.
Metacharacters that can be used directly in basic regular expressions include
(1) ^, matches the start of a line, e.g., ^fine matches lines starting with “fine”
(2) $, matches the end of a line, e.g., fine$ matches lines ending with “fine”
(3) ., matches any single character (except newline)
(4) [], matches any single character within the brackets, e.g., [abc] matches a, b, or c
(5) [^], matches any character not in the brackets, e.g., [^0-9] matches non-digit characters
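A small sketch of these five metacharacters in default (basic) mode follows. The sample lines are invented; each search is stored in a variable so the results can be compared side by side.

```shell
# Hypothetical sample input, one entry per line.
sample=$(printf 'fine day\nall fine\ncat\ncot\nc9t\n')

# ^ anchors the match to the start of the line.
starts=$(printf '%s\n' "$sample" | grep '^fine')
# $ anchors the match to the end of the line.
ends=$(printf '%s\n' "$sample" | grep 'fine$')
# . stands for any single character between c and t.
any_char=$(printf '%s\n' "$sample" | grep 'c.t')
# [ao] accepts only a or o in that position.
set=$(printf '%s\n' "$sample" | grep 'c[ao]t')
# [^0-9] accepts any single character that is not a digit.
negated=$(printf '%s\n' "$sample" | grep 'c[^0-9]t')

printf '%s\n' "^fine:" "$starts" "fine\$:" "$ends" "c.t:" "$any_char"
printf '%s\n' "c[ao]t:" "$set" "c[^0-9]t:" "$negated"
```

Single quotes around the pattern keep the shell from interpreting $ and other special characters before grep sees them.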
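Metacharacters for repetition in basic regular expressions (all except * need to be escaped)
(1) *, matches the preceding character or subexpression 0 or more times; it is written without a backslash even in basic regular expressions (\* matches a literal asterisk)
e.g., b* matches empty, b, bb, bbb, etc.
(2) \+, matches the preceding character or subexpression 1 or more times (a GNU extension in basic regular expressions)
e.g., b\+ matches b, bb, etc.
(3) \?, matches the preceding character or subexpression 0 or 1 time (a GNU extension in basic regular expressions)
e.g., b\? matches empty, b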
(4) \{n\}, matches the preceding character or subexpression exactly n times
e.g., b\{3\} matches bbb
(5) \{n,\}, matches the preceding character or subexpression at least n times
e.g., b\{3,\} matches bbb, bbbb, etc.
(6) \{n,m\}, matches the preceding character or subexpression from n to m times
e.g., b\{2,3\} matches bb, bbb
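The repetition operators can be checked with a quick sketch like the one below. It assumes GNU grep, since \+ and \? are GNU extensions in basic mode, and it uses -x so that a pattern must match the whole line. Note that * is written bare in basic regular expressions, while \* matches a literal asterisk. The sample strings are invented.

```shell
# Hypothetical sample input: an "a" followed by zero to four "b"s.
samples=$(printf 'a\nab\nabb\nabbb\nabbbb\n')

# * : zero or more b (no backslash needed in basic mode).
star=$(printf '%s\n' "$samples" | grep -x 'ab*')
# \+ : one or more b (GNU extension in basic mode).
plus=$(printf '%s\n' "$samples" | grep -x 'ab\+')
# \? : zero or one b (GNU extension in basic mode).
opt=$(printf '%s\n' "$samples" | grep -x 'ab\?')
# \{3\} : exactly three b.
exact=$(printf '%s\n' "$samples" | grep -x 'ab\{3\}')
# \{3,\} : three or more b.
at_least=$(printf '%s\n' "$samples" | grep -x 'ab\{3,\}')
# \{2,3\} : two or three b.
range=$(printf '%s\n' "$samples" | grep -x 'ab\{2,3\}')
# \* : a literal asterisk, not a repetition operator.
literal=$(printf 'a*b\nab\n' | grep 'a\*b')

printf '%s\n' "star:" "$star" "plus:" "$plus" "opt:" "$opt"
printf '%s\n' "exact:" "$exact" "at_least:" "$at_least" "range:" "$range" "literal:" "$literal"
```

Without -x, a pattern such as ab\{3\} would also match abbbb, because grep only needs the pattern to occur somewhere in the line.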
Grouping and referencing in basic regular expressions that need to be escaped
(1) \(\), groups subexpressions
e.g., \(abc\)*, where the unescaped * means zero or more repetitions of the group, matches empty, abc, abcabc, etc.
(2) \n, a backreference that matches the same text the nth group matched, where n ranges from 1-9
e.g., \(abc\)\{2\}dc\1 matches abcabcdcabc
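Grouping and backreferences in basic mode can be sketched as follows. The inputs are made-up strings chosen so that each pattern has one matching and one non-matching line; the patterns use the escaped \( \) and \{ \} forms that basic regular expressions require.

```shell
# \(abc\)* with -x: the whole line must be zero or more copies of "abc".
whole_group=$(printf 'abcabc\nabcab\n' | grep -x '\(abc\)*')

# \(abc\)\{2\}dc\1: "abc" twice, then "dc", then whatever group 1 matched.
backref=$(printf 'abcabcdcabc\nabcabcdcabd\n' | grep '\(abc\)\{2\}dc\1')

# \(.\)\1: any character followed immediately by the same character.
doubled=$(printf 'letter\nwater\n' | grep '\(.\)\1')

printf '%s\n' "group:" "$whole_group" "backref:" "$backref" "doubled:" "$doubled"
```

The last pattern shows why a backreference differs from repeating the group: \(.\)\1 requires the same character twice, whereas \(.\)\{2\} would accept any two characters.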
Extended regular expressions allow metacharacters such as ^, $, ., [], [^], *, +, ?, {n}, {n,}, {n,m}, (), and | to be used directly without escaping. Here, | represents logical OR: it matches the expression on its left or the expression on its right.
Example: Search for lines containing either error or warning in the test.txt file
grep -E "error|warning" test.txt
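To make the difference between the two modes concrete, the sketch below runs the same alternation with and without -E. It assumes GNU grep (the \| form in basic mode is a GNU extension), and the log lines are invented for illustration.

```shell
# Hypothetical log lines.
log=$(printf 'error: disk\nwarning: temp\ninfo: ok\n')

# Extended mode: | means OR, so two lines match.
ere=$(printf '%s\n' "$log" | grep -E "error|warning")

# Basic mode: | is an ordinary character, so nothing matches here.
# "|| true" only hides grep's non-zero exit status for "no match".
bre_literal=$(printf '%s\n' "$log" | grep "error|warning" || true)

# Basic mode with the GNU \| extension behaves like the -E version.
bre_gnu=$(printf '%s\n' "$log" | grep 'error\|warning')

# Portable alternative: give each pattern its own -e option.
multi=$(printf '%s\n' "$log" | grep -e "error" -e "warning")

# Extended mode also takes {n,m} and () without backslashes.
ere_rep=$(printf 'ab\nabb\nabbbb\n' | grep -xE 'ab{2,3}')

printf '%s\n' "ERE:" "$ere" "BRE literal:" "[$bre_literal]" "BRE GNU:" "$bre_gnu"
printf '%s\n' "-e -e:" "$multi" "ERE {2,3}:" "$ere_rep"
```

When a script has to run on systems without GNU grep, the -E and repeated -e forms are the safer choices.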