Commonly Used grep Commands in Linux/Shell Scripts

1. Introduction to the grep Command

grep is a powerful command used for text searching in Linux/Unix systems. It searches for matching content in files or input streams based on specified patterns (regular expressions or strings) and outputs the lines containing the matches.

2. Basic Syntax of the grep Command

grep [options] pattern [file...]

pattern: a regular expression or string (basic regular expressions are supported by default; use -E to enable extended regular expressions).

file: the file or files to search. If none is given, grep reads from standard input, typically the output of another command sent through a pipe.
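As a minimal sketch of the two input modes described above, the following searches the same text once through a file argument and once through a pipe. The temporary directory and the sample contents are made up for illustration.

```shell
# Create a small sample file in a temporary directory (hypothetical data).
tmpdir=$(mktemp -d)
printf 'line one\nan error occurred\nline three\n' > "$tmpdir/test.txt"

# Mode 1: search a named file.
from_file=$(grep "error" "$tmpdir/test.txt")

# Mode 2: no file argument, so grep reads standard input from the pipe.
from_pipe=$(cat "$tmpdir/test.txt" | grep "error")

echo "from file: $from_file"
echo "from pipe: $from_pipe"

# Clean up the sample data.
rm -r "$tmpdir"
```

Both variables should hold the same matching line, since only the source of the input differs.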

3. Commonly Used Parameters and Examples of the grep Command

(1) -i, ignore case

Example: Search for lines containing the word error in the test.txt file, regardless of case

grep -i "error" test.txt

(2) -v, reverse match, only output lines that do not contain the pattern

Example: Search for lines in the test.txt file that do not contain success

grep -v "success" test.txt

(3) -n, display the line numbers of matching lines

Example: Search for lines containing error in the test.txt file and display the line numbers

grep -n "error" test.txt

(4) -r, recursively search all files in the specified directory and its subdirectories

Example: Search for lines containing error in files under /tmp and all its subdirectories

grep -r "error" /tmp

(5) -l (lowercase, distinct from uppercase L), only output the names of files that contain matches (not the specific lines)

Example: Return the names of files in the current directory that contain error

grep -l "error" ./*

Each matching file is printed with the path prefix used on the command line, for example ./test.txt

(6) -w, match whole words, e.g., searching for int will not match integer

Example: Search for lines containing the word int in the test.txt file

grep -w "int" test.txt
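The line-oriented options -i, -v, -n, and -w can be tried side by side on a throwaway file. This is only a sketch: the file contents below are invented so that each option selects a different set of lines.

```shell
# Build a throwaway sample file (hypothetical contents, for illustration only).
tmpdir=$(mktemp -d)
cat > "$tmpdir/test.txt" <<'EOF'
Error: disk full
build success
int count = 0
integer overflow error
EOF

# -i: "error" matches both "Error" and "error".
ignore_case=$(grep -i "error" "$tmpdir/test.txt")

# -v: keep only the lines that do NOT contain "success".
inverted=$(grep -v "success" "$tmpdir/test.txt")

# -n: prefix each matching line with its line number.
# Without -i, only the lowercase "error" on line 4 matches.
numbered=$(grep -n "error" "$tmpdir/test.txt")

# -w: "int" must stand alone as a word, so "integer" is skipped.
whole_word=$(grep -w "int" "$tmpdir/test.txt")

printf '%s\n' "-i:" "$ignore_case" "-v:" "$inverted" "-n:" "$numbered" "-w:" "$whole_word"

# Clean up the sample data.
rm -r "$tmpdir"
```

The options can also be combined in one call, for example grep -in "error" test.txt for a case-insensitive search with line numbers.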

(7) Classic combination of find command and grep

Example: Find all .txt files in the current directory and subdirectories that contain the text pass test

find . -name "*.txt" | xargs grep -l "pass test"

(8) -E, enables extended regular expressions; this option is explained with examples in the next section
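The file-oriented items above, (4), (5), and (7), can be sketched on a small directory tree. The directory layout and file names here are hypothetical. One assumption beyond the text: the find pipeline uses -print0 with xargs -0, which GNU and BSD tools support, so that file names containing spaces are passed intact. A plain find | xargs would split such names into separate arguments.

```shell
# Build a small directory tree with sample files (hypothetical layout).
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/logs/old dir"
printf 'error: timeout\n' > "$tmpdir/logs/a.log"
printf 'all good\n'       > "$tmpdir/logs/b.log"
printf 'pass test\n'      > "$tmpdir/logs/old dir/result one.txt"
printf 'fail test\n'      > "$tmpdir/logs/old dir/result two.txt"

# -r descends into subdirectories; -l prints only the names of matching files.
recursive_files=$(grep -rl "error" "$tmpdir/logs")

# find selects the .txt files; xargs hands them to grep -l.
# -print0 / -0 separate names with NUL bytes, so "result one.txt" stays one argument.
found=$(find "$tmpdir/logs" -name "*.txt" -print0 | xargs -0 grep -l "pass test")

echo "grep -rl: $recursive_files"
echo "find + xargs: $found"

# Clean up the sample data.
rm -r "$tmpdir"
```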

4. Differences Between Basic and Extended Regular Expressions in grep

The main difference between basic and extended regular expressions lies in the handling of certain metacharacters: in basic regular expressions, some metacharacters need to be escaped to take effect, while in extended regular expressions, they can be used directly.

Metacharacters that can be used directly in basic regular expressions include

(1) ^, matches the start of a line, e.g., ^fine matches lines starting with “fine”

(2) $, matches the end of a line, e.g., fine$ matches lines ending with “fine”

(3) ., matches any single character (except newline)

(4) [], matches any single character within the brackets, e.g., [abc] matches a, b, or c

(5) [^], matches any single character not in the brackets, e.g., [^0-9] matches a non-digit character
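The five metacharacters above can be checked against a short word list. This is a sketch with invented sample lines. The patterns are single-quoted so the shell passes $ and ^ to grep unchanged.

```shell
# Sample lines chosen so each pattern selects a different subset (hypothetical data).
tmpdir=$(mktemp -d)
cat > "$tmpdir/words.txt" <<'EOF'
fine weather
that is fine
fined 20 dollars
cat
cot
c9t
EOF

# ^ anchors the match at the start of the line.
starts=$(grep '^fine' "$tmpdir/words.txt")

# $ anchors the match at the end of the line.
ends=$(grep 'fine$' "$tmpdir/words.txt")

# . stands for any single character between c and t.
any_char=$(grep '^c.t$' "$tmpdir/words.txt")

# [a9] accepts exactly one character from the set: a or 9.
in_set=$(grep '^c[a9]t$' "$tmpdir/words.txt")

# [^0-9] accepts exactly one character that is not a digit.
not_digit=$(grep '^c[^0-9]t$' "$tmpdir/words.txt")

printf '%s\n' "^fine:" "$starts" "fine\$:" "$ends" "c.t:" "$any_char" \
  "c[a9]t:" "$in_set" "c[^0-9]t:" "$not_digit"

# Clean up the sample data.
rm -r "$tmpdir"
```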

Metacharacters for repetition in basic regular expressions. Except for *, they need to be escaped with a backslash

(1) *, matches the preceding character or subexpression 0 or more times. It is used without a backslash; \* matches a literal asterisk

e.g., b* matches empty, b, bb, bbb, etc.

(2) \+, matches the preceding character or subexpression 1 or more times (a GNU grep extension in basic regular expressions)

e.g., b\+ matches b, bb, etc.

(3) \?, matches the preceding character or subexpression 0 or 1 time (a GNU grep extension in basic regular expressions)

e.g., b\? matches empty or b

(4) \{n\}, matches the preceding character or subexpression exactly n times

e.g., b\{3\} matches bbb

(5) \{n,\}, matches the preceding character or subexpression at least n times

e.g., b\{3,\} matches bbb, bbbb, etc.

(6) \{n,m\}, matches the preceding character or subexpression from n to m times

e.g., b\{2,3\} matches bb, bbb
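The repetition operators can be compared on a file whose lines differ only in the number of b characters. This is a sketch with invented data. It relies on the POSIX rule that * is written without a backslash in basic regular expressions, and it uses -E for + and ? so the example does not depend on the GNU-specific \+ and \? forms.

```shell
# Lines with 0 to 4 trailing b characters (hypothetical data).
tmpdir=$(mktemp -d)
printf '%s\n' 'a' 'ab' 'abb' 'abbb' 'abbbb' > "$tmpdir/reps.txt"

# Basic regular expressions: * is bare, interval braces are escaped.
star=$(grep '^ab*$' "$tmpdir/reps.txt")            # zero or more b
exact=$(grep '^ab\{3\}$' "$tmpdir/reps.txt")       # exactly three b
at_least=$(grep '^ab\{3,\}$' "$tmpdir/reps.txt")   # three or more b
range=$(grep '^ab\{2,3\}$' "$tmpdir/reps.txt")     # two or three b

# Extended regular expressions: the same operators without backslashes.
plus_ere=$(grep -E '^ab+$' "$tmpdir/reps.txt")         # one or more b
optional_ere=$(grep -E '^ab?$' "$tmpdir/reps.txt")     # zero or one b
range_ere=$(grep -E '^ab{2,3}$' "$tmpdir/reps.txt")    # two or three b

printf '%s\n' "b{2,3} (BRE):" "$range" "b{2,3} (ERE):" "$range_ere"

# Clean up the sample data.
rm -r "$tmpdir"
```

The ^ and $ anchors are there so that each pattern has to cover the whole line; without them, b\{2,3\} would also match inside abbbb.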

Grouping and referencing in basic regular expressions that need to be escaped

(1) \(\), groups subexpressions

e.g., \(abc\)*, where * repeats the whole group 0 or more times, matches empty, abc, abcabc, etc.

(2) \n, references the text matched by the nth group, where n ranges from 1 to 9

e.g., \(abc\)\{2\}dc\1 matches abcabcdcabc
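Grouping and back-references can be sketched as follows, using basic regular expression syntax throughout. The sample lines are invented. The second pattern is an extra illustration not taken from the text: it uses \1 to find any doubled lowercase letter.

```shell
# Sample lines: only the first satisfies the grouped pattern (hypothetical data).
tmpdir=$(mktemp -d)
printf '%s\n' 'abcabcdcabc' 'abcabcdcxyz' 'abcdcabc' 'hello' 'world' > "$tmpdir/groups.txt"

# \(abc\)\{2\} requires "abc" twice, then "dc", then \1 repeats the text
# captured by group 1 ("abc").
grouped=$(grep '^\(abc\)\{2\}dc\1$' "$tmpdir/groups.txt")

# \([a-z]\)\1 captures one letter and requires the same letter right after it.
doubled=$(grep '\([a-z]\)\1' "$tmpdir/groups.txt")

echo "grouped: $grouped"
echo "doubled: $doubled"

# Clean up the sample data.
rm -r "$tmpdir"
```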

In extended regular expressions, the metacharacters +, ?, {n}, {n,}, {n,m}, (), and | are used directly without escaping, alongside ^, $, ., [], [^], and *, which work the same way as in basic regular expressions. Here, | represents logical OR: it matches the expression on either side.

Example: Search for lines containing either error or warning in the test.txt file

grep -E "error|warning" test.txt
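To see why -E matters for this example, the sketch below runs the same pattern with and without it on invented log lines. Without -E, the | is treated as an ordinary character, so grep looks for the literal text error|warning and finds nothing.

```shell
# Three sample log lines (hypothetical data).
tmpdir=$(mktemp -d)
printf '%s\n' 'error: disk full' 'warning: low memory' 'info: started' > "$tmpdir/test.txt"

# With -E, | means OR: lines containing either word are printed.
with_ere=$(grep -E 'error|warning' "$tmpdir/test.txt")

# Without -E, | is a literal character and no line contains "error|warning".
# grep exits with status 1 when nothing matches, hence the "|| true" guard.
without_ere=$(grep 'error|warning' "$tmpdir/test.txt" || true)

echo "with -E:"
echo "$with_ere"
echo "without -E: '$without_ere'"

# Clean up the sample data.
rm -r "$tmpdir"
```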
