In Linux, the grep command is used for text searching. Whether you are processing logs, filtering files, or finding specific strings in a code repository, grep is well suited to the task.
1. Basic Syntax
The basic format of the grep command is <span>grep [options] 'search pattern' [file]</span>. For example, to search for the string “linux” in the <span>run.log</span> file, run <span>grep 'linux' run.log</span>.
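As a minimal sketch of the basic form, the following creates a small sample file and searches it. The file contents and the temporary directory are assumptions made up for illustration; only the <span>grep 'linux' run.log</span> invocation comes from the text above.

```shell
# Create a small sample log (hypothetical content, for illustration only).
tmpdir=$(mktemp -d)
printf '%s\n' \
  'booting linux kernel' \
  'mounting filesystems' \
  'linux: network up' > "$tmpdir/run.log"

# Print every line of run.log that contains the string "linux".
matches=$(grep 'linux' "$tmpdir/run.log")
printf '%s\n' "$matches"
# booting linux kernel
# linux: network up

# Clean up the temporary files.
rm -r "$tmpdir"
```

Quoting the pattern in single quotes keeps the shell from interpreting any special characters before grep sees them.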
2. Common Options
-
-i: Ignore case: When searching, it does not distinguish between uppercase and lowercase letters, so any target string, whether uppercase, lowercase, or mixed case, can be matched.
-
-v: Invert match: Outputs the lines that do not match the specified search pattern, often used to filter out lines that contain unwanted content.
-
-n: Show line numbers: While outputting matching lines, it also shows the line number in the file, making it easy to quickly locate the target content.
-
-r: Recursive search: Used for recursively searching files in the specified directory and all its subdirectories, allowing traversal of the entire directory tree to find the target string.
-
-l: Show only filenames: Only outputs the filenames that contain matching content, without displaying the specific matching lines, suitable for quickly understanding which files contain the target information.
-
-w: Match whole words: Ensures that only complete words are matched, not parts of words, to avoid false matches.
-
-c: Count matching lines: Outputs the number of lines that contain a match instead of the lines themselves. It counts lines, not occurrences, so a line with several matches is counted once. Running <span>grep -c 'error' run.log</span> prints the number of lines in <span>run.log</span> that contain “error”, such as “25”, indicating that 25 lines contain that string.
-
-o: Show only matching parts: Outputs only the matched text rather than the entire line, with each match on its own line, making it easy to extract specific information. If the file contains “ID: 12345”, running <span>grep -o '[0-9]' run.log</span> outputs the digits “1”, “2”, “3”, “4”, “5” on separate lines, because the pattern matches one digit at a time.
-
-A num: Displays the matching line and the following (-A means after) num lines, which helps to view the context of the matching content.
-
-B num: Displays the matching line and the num lines before it (-B means before). It is the counterpart of <span>-A</span> and is likewise used to view the context of the matching content.
-
-C num: Displays the matching line together with num lines before and after it (-C means context). It combines the functions of <span>-A</span> and <span>-B</span>, providing more comprehensive context information.
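The options above can be tried out on a small sample. The sketch below assumes a GNU or BSD grep (the <span>-o</span>, <span>-r</span>, and <span>-C</span> options are not part of POSIX); the log contents and file names are hypothetical and exist only to make the results easy to check by hand.

```shell
# Hypothetical sample data, created in a temporary directory.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/sub"
printf '%s\n' \
  'INFO start' \
  'Error: disk full' \
  'INFO retry' \
  'error: timeout ID: 12345' \
  'INFO errors cleared' \
  'INFO done' > "$tmpdir/run.log"
printf '%s\n' 'no problems here' > "$tmpdir/sub/other.log"
log="$tmpdir/run.log"

# -i with -c: count lines matching "error" in any letter case (lines 2, 4, 5).
count_ci=$(grep -ci 'error' "$log")

# -v: keep only the lines that do NOT contain "INFO".
non_info=$(grep -v 'INFO' "$log")

# -n: prefix each matching line with its line number.
numbered=$(grep -n 'timeout' "$log")

# -c counts matching lines; adding -w requires "error" to be a whole word,
# so "errors" on line 5 no longer counts.
count_any=$(grep -c 'error' "$log")
count_word=$(grep -cw 'error' "$log")

# -o: print only the matched text; this pattern takes a whole run of digits.
id=$(grep -o '[0-9][0-9]*' "$log")

# -r with -l: list the files under a directory that contain a match.
files=$(grep -rl 'error' "$tmpdir")

# -C 1: show one line of context before and after the match.
context=$(grep -C 1 'timeout' "$log")

printf '%s\n' "$count_ci" "$count_any" "$count_word" "$numbered" "$id"
# 3
# 2
# 1
# 4:error: timeout ID: 12345
# 12345

rm -r "$tmpdir"
```

Short options can be combined, as in <span>-cw</span> and <span>-rl</span> above, which is equivalent to writing them separately.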
3. Enhanced by Regular Expressions
3.1 Basic Regular Expressions
-
^: Matches the start of a line
-
Explanation: Matches the beginning position of the line; only when the target string appears at the start of the line will it be matched.
-
Example: Running <span>grep '^begin' run.log</span> finds the lines in <span>run.log</span> that start with “begin”.
$: Matches the end of a line
-
Explanation: Matches the end position of the line; only when the target string appears at the end of the line will it be matched.
-
Example: Running <span>grep 'end$' run.log</span> finds the lines that end with “end”.
.: Matches any single character
-
Explanation: Can match any single character, including letters, numbers, symbols, etc.
-
Example: Running <span>grep 'data_in._vld' run.log</span> matches lines containing “data_in1_vld”, “data_int_vld”, “data_inp_vld”, etc.
*: Matches the previous character zero or more times
-
Explanation: Allows the previous character to appear any number of times, including zero.
-
Example: Running <span>grep 'go*gle' run.log</span> matches “ggle”, “gogle”, “goooogle”, etc. The <span>*</span> applies only to the “o”, so the “g” on either side of it is always required.
[ ]: Matches a set of characters
-
Explanation: Matches any one character within the brackets.
-
Example: Running <span>grep 'data_[abc]' run.log</span> matches the “data_a”, “data_b”, and “data_c” signals, but not “data_d”.
[^]: Matches characters not in the set
-
Explanation: The opposite of <span>[]</span>; matches any single character that is not in the character set within the brackets.
-
Example: Running <span>grep 'data_[^a-c]' run.log</span> finds lines in which “data_” is followed by a character other than “a”, “b”, or “c”, that is, “data_” signals other than “data_a”, “data_b”, and “data_c”.
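The basic patterns in this section can be checked against a small sample file. This is a sketch under assumptions: the file contents are invented for illustration, and <span>-o</span> (available in GNU and BSD grep) is used in two places only to make visible exactly which text each pattern matched.

```shell
# Hypothetical sample data for trying out basic regular expressions.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'begin simulation' \
  'data_in1_vld asserted' \
  'ggle gogle goooogle' \
  'data_a data_d' \
  'data_b' \
  'simulation end' > "$tmpdir/run.log"
log="$tmpdir/run.log"

starts=$(grep '^begin' "$log")         # lines that start with "begin"
ends=$(grep 'end$' "$log")             # lines that end with "end"
anychar=$(grep 'data_in._vld' "$log")  # "." stands for exactly one character

# "*" repeats only the preceding "o": zero o's still needs both g's ("ggle").
star=$(grep -o 'go*gle' "$log")

# Count lines where "data_" is followed by a, b or c (lines 4 and 5).
set_abc=$(grep -c 'data_[abc]' "$log")

# "data_" followed by one character outside a-c; note that "data_i" from
# "data_in1_vld" qualifies as well as "data_d".
not_abc=$(grep -o 'data_[^a-c]' "$log")

printf '%s\n' "$starts" "$ends" "$anychar" "$star" "$set_abc" "$not_abc"
# begin simulation
# simulation end
# data_in1_vld asserted
# ggle
# gogle
# goooogle
# 2
# data_i
# data_d

rm -r "$tmpdir"
```

The negated set still has to consume one character, which is why the bare “data_b” at the end of a line produces no match for <span>data_[^a-c]</span>.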
3.2 Extended Regular Expressions
The -E option tells grep to interpret the pattern as an extended regular expression.
-
+: Matches the previous character one or more times
-
Explanation: Requires the previous character to appear at least once. It is similar to <span>*</span> but excludes the case of zero occurrences.
-
Example: Running <span>grep -E 'go+gle' run.log</span> (the <span>-E</span> option is needed to enable extended regex) matches “gogle”, “gooogle”, etc., but not “ggle”.
?: Matches the previous character zero or one time
-
Explanation: The previous character either appears once or does not appear, i.e., the occurrence count is 0 or 1.
-
Example: Running <span>grep -E 'colou?r' run.log</span> matches both “color” and “colour”.
() : Grouping
-
Explanation: Treats the content within the parentheses as a whole, making it easier to apply the same operation or limitation to a group of characters.
-
Example: Running <span>grep -E '(red|blue) car' run.log</span> matches lines containing “red car” or “blue car”. Inside the group, <span>|</span> separates the alternatives.
{n,m}: Specify the range of occurrences
-
Explanation: Indicates that the previous character appears between n and m times (including n and m).
-
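Example: Running <span>grep -E 'a{2,4}' run.log</span> matches lines containing “aa”, “aaa”, or “aaaa”. Because grep looks for the pattern anywhere in the line, a line with a longer run of “a” also matches.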
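The extended operators can be exercised the same way. The sketch below again uses invented sample data, and combines <span>-E</span> with <span>-o</span> (assuming GNU or BSD grep) so that the output shows the matched text itself rather than whole lines.

```shell
# Hypothetical sample data for trying out extended regular expressions.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'ggle gogle gooogle' \
  'color colour' \
  'red car' \
  'blue car' \
  'green car' \
  'a aa aaaaa' > "$tmpdir/run.log"
log="$tmpdir/run.log"

# "+" needs at least one "o", so "ggle" is skipped.
plus=$(grep -Eo 'go+gle' "$log")

# "?" makes the "u" optional.
optional=$(grep -Eo 'colou?r' "$log")

# Grouping with alternation: "red car" or "blue car", but not "green car".
group=$(grep -E '(red|blue) car' "$log")

# {2,4}: runs of two to four "a"; the run of five yields its first four.
range=$(grep -Eo 'a{2,4}' "$log")

printf '%s\n' "$plus" "$optional" "$group" "$range"
# gogle
# gooogle
# color
# colour
# red car
# blue car
# aa
# aaaa

rm -r "$tmpdir"
```

To match a run of exactly two to four characters and nothing longer, the pattern has to be bounded on both sides, for example with <span>-w</span> or with anchors.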