Grep helps you locate any given pattern(s) within one or more files.
Very useful when parsing logs!
Keep in mind
Not all grep implementations are created equal. This post references the GNU implementation.
The basics
Grep commands have the following structure:
grep [OPTIONS] 'this_string' that_file
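This will output the full line(s) where this_string was found, highlighting the match itself if color output is enabled (for example with --color=auto, which many distributions alias by default).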
Context
There is a -n flag you can use to get the line Numbers of the matches.
You might find it useful to have some more Context around your grep results.
Use something like -C2 to tell grep to also print the two lines before and after each match.
Keep in mind that the number of context lines printed is limited by neighboring matches as well as by the beginning and end of the file, so you might not always get exactly as many lines as you asked for.
These two flags work well together, since grep separates the line number from the line itself using : for matching lines and - for context lines.
So given a grepme file like so:
grep -nC2 '3183_22' grepme
will output
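The sample file and its output did not survive in this copy of the post, so here is a hypothetical stand-in. The contents of grepme below are invented for illustration, chosen so that the counts quoted later in the post (2, 4, 4 and 2) still hold. The output in the comments assumes GNU grep.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
cd "$(mktemp -d)"

# Hypothetical stand-in for the grepme file (contents invented for illustration).
cat > grepme <<'EOF'
id=1001_10 status=ok
id=2042_11 status=ok
id=3183_22 retry_of=3183_22 status=fail
id=3183_07 status=ok
id=4410_22 status=ok
id=5120_13 status=ok
id=6001_14 status=ok
id=3183_22 retry_of=3183_22 status=ok
id=7777_15 status=ok
EOF

# Line numbers (-n) plus two lines of context (-C2) around each match.
grep -nC2 '3183_22' grepme
# 1-id=1001_10 status=ok
# 2-id=2042_11 status=ok
# 3:id=3183_22 retry_of=3183_22 status=fail
# 4-id=3183_07 status=ok
# 5-id=4410_22 status=ok
# 6-id=5120_13 status=ok
# 7-id=6001_14 status=ok
# 8:id=3183_22 retry_of=3183_22 status=ok
# 9-id=7777_15 status=ok
```

Matching lines (3 and 8) carry a colon after the line number, context lines a dash. The last match only gets one line of trailing context because the file ends there.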
Multiple files
You can use dir/* instead of a file name to tell grep to look in all files in dir/ (or simply * to look in all files under cwd).
If any of the matched names are directories, grep prints an error for each one, since it cannot search a directory without -r.
To Suppress these errors, use the -s flag.
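To see the difference -s makes, here is a small sketch. The directory layout and file names are made up for the example, and the behavior described in the comments is that of GNU grep.

```shell
# Scratch layout (hypothetical names): two log files and one subdirectory.
demo=$(mktemp -d)
mkdir "$demo/logs" "$demo/logs/archive"
printf 'needle in a\n' > "$demo/logs/a.log"
printf 'nothing here\n' > "$demo/logs/b.log"
printf 'needle in old\n' > "$demo/logs/archive/old.log"
cd "$demo"

# The glob expands to a.log, archive and b.log. grep complains on stderr
# about the directory and reports the problem through its exit status.
grep 'needle' logs/* || echo "exit status: $?"

# With -s the complaint is gone; the match is still printed, prefixed with
# the file name because more than one file was given.
grep -s 'needle' logs/* || true
# logs/a.log:needle in a
```

Note that logs/archive/old.log is never searched here: skipping the error message is not the same as recursing into the directory.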
Quality of life
Count
Often you need the number of matching lines rather than the lines themselves.
You might be tempted to pipe grep into wc -l, but there are better options.
grep 'hi there!' file | wc -l and grep -c 'hi there!' file produce the same output: they both Count the number of matching lines.
Or, use the pipe with the -o flag to get the number of Occurrences (which will differ from -c if there is more than one match per line).
So following the previous example:
grep -c '3183_22' grepme
➡️ 2
grep -o '3183_22' grepme | wc -l
➡️ 4
-o on its own will simply print the matches themselves, which doesn’t make much sense right now, but will once you add regular expressions to the mix.
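As a small preview of that, the sketch below pulls every ID-like token out of some text. The sample lines and the ID format are invented for the example; the pattern uses only basic regex syntax.

```shell
# Hypothetical log lines, invented for illustration.
sample_log='job 3183_22 retried as 3183_23
job 4410_07 finished
no ids on this line'

# -o prints each match on its own line instead of the whole matching line.
printf '%s\n' "$sample_log" | grep -o '[0-9][0-9]*_[0-9][0-9]*'
# 3183_22
# 3183_23
# 4410_07
```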
The classics
There are some combinations that are used so often you might as well create an alias for them.
The first command will output all lines NOT matching foo (-v), along with their line Numbers (-n). It will search recursively (-r) with case Insensitivity (-i).
The second one will output all files containing a match for bar (-l; -L would output only files NOT containing a match), recursively (-r).
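The two commands themselves are missing from this copy of the post. Going by the flags described, they probably looked something like the sketch below. The sample files, the search root (.) and the alias names are assumptions made for the example.

```shell
# Hypothetical files to search, invented for illustration.
demo=$(mktemp -d)
printf 'Foo one\nplain two\n' > "$demo/a.txt"
printf 'bar three\n' > "$demo/b.txt"
cd "$demo"

# First command: lines NOT matching foo (-v), ignoring case (-i),
# recursively (-r), with line numbers (-n). Sorted for a stable order.
grep -rniv 'foo' . | sort
# ./a.txt:2:plain two
# ./b.txt:1:bar three

# Second command: names of the files that contain bar, recursively.
grep -rl 'bar' .
# ./b.txt

# Possible aliases for the two flag combinations (names are made up).
alias grepv='grep -rniv'
alias grepl='grep -rl'
```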
The not so basics
Multiple searches
Just like sed, you can use -e to combine multiple searches in the same grep command.
With sed, this flag runs all the commands on each line. Similarly, grep prints every line that matches any of the expressions.
This might be surprising, since piping grep commands into each other does the opposite: you get only the lines that match all the expressions.
So again, using the example file from before:
grep -e '3183' -e '22' grepme | wc -l
➡️ 4
grep '3183' grepme | grep '22' | wc -l
➡️ 2
Here I use | wc -l instead of -c for clarity/symmetry.
Regex
Again, just like sed and find, grep uses basic regular expressions (BRE) by default, and the -E flag switches to extended regular expressions (ERE).
If instead you want to avoid regex altogether and look for a literal string with strange characters, use -F.
grep -F '[Hh]ello moto*' file
will literally match “[Hh]ello moto*”. Not “Hello moto”, not “hello moto”, and not “[Hh]ello moto, something else”.
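A quick side-by-side of the three modes, as a sketch. The sample lines are invented, and the -E pattern is just one way to express the same idea with extended syntax.

```shell
# Hypothetical input, invented for illustration.
sample='Hello moto
hello motooo
[Hh]ello moto* (literal)'

# Default (basic) regex: [Hh] is a bracket expression, * means "zero or more".
printf '%s\n' "$sample" | grep '[Hh]ello moto*'
# Hello moto
# hello motooo

# Extended regex: alternation with | and "one or more" with +.
printf '%s\n' "$sample" | grep -E '(H|h)ello moto+'
# Hello moto
# hello motooo

# Fixed string: every character is taken literally.
printf '%s\n' "$sample" | grep -F '[Hh]ello moto*'
# [Hh]ello moto* (literal)
```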
Exclude and include
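You can exclude and include files from the search by a given pattern, using --exclude and --include.
Even better, you can use both flags together to fine tune where you are searching exactly.
grep -s --exclude='*.py' --include='main.py' 'something' *
will exclude all Python files from the search, except for main.py.
Quote the patterns so the shell does not expand them first. With GNU grep, when --include and --exclude contradict each other, the last matching option wins, so the order of the two flags matters.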
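Here is a minimal sketch of that command in action, assuming a recent GNU grep. The three files are hypothetical and all contain the search term, so the output shows exactly which ones were searched; -l is added only to keep the output down to file names.

```shell
# Hypothetical project: two Python files and a text file, all containing the term.
demo=$(mktemp -d)
printf 'something\n' > "$demo/main.py"
printf 'something\n' > "$demo/util.py"
printf 'something\n' > "$demo/notes.txt"
cd "$demo"

# util.py is skipped by --exclude, main.py is rescued by the later --include,
# and notes.txt is searched because neither pattern applies to it.
grep -sl --exclude='*.py' --include='main.py' 'something' *
# main.py
# notes.txt
```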
Grep based on a file
Say you have a list of blacklisted words you want to ensure are not present in a project.
grep -f blacklist.words projectFile
will print out every line that matches any of the lines in blacklist.words, while also passing it the -l flag from before will print only the problematic filenames.
For this to work, blacklist.words has to contain one expression (or word) per line.
Another neat use case:
ls | grep -f blacklist.files
This will output all filenames in cwd listed in blacklist.files.
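Both uses can be tried out with a few throwaway files. Everything below (file names and contents) is invented for the sketch.

```shell
# Hypothetical word list and project files, invented for illustration.
demo=$(mktemp -d)
cd "$demo"
printf 'foo\nbar\n' > blacklist.words
printf 'all good here\n' > clean.txt
printf 'line one\nthis has a bar in it\n' > dirty.txt

# Lines that match any pattern from the list.
grep -f blacklist.words dirty.txt
# this has a bar in it

# With -l: only the names of the offending files.
grep -lf blacklist.words clean.txt dirty.txt
# dirty.txt

# Filtering a directory listing against a list of file names.
printf 'dirty.txt\n' > blacklist.files
ls | grep -f blacklist.files
# dirty.txt
```

Each line of the list is treated as a regular expression, so a dot in a file name matches any character. If that is too loose for your use case, adding -F (literal strings) and -x (whole-line matches) tightens the comparison.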