Overview

find, grep, awk, and sed are the four corners of Unix text processing. This card collects the expressions worth knowing, with the escaping and portability gotchas inline. For parameter expansion and pure-bash alternatives, see bash-one-liners.

find

Search the filesystem before you rename, delete, or pipe.

Expression                                            What it does
find . -type f -name '*.md'                           All Markdown files, recursive.
find . -type d -name __pycache__                      Directories named __pycache__.
find . -type f -size +10M                             Files larger than 10 MB.
find . -type f -mtime -7                              Files modified in the last 7 days.
find . -type f -newer ref.txt                         Files newer than ref.txt.
find . -type f -name '*.log' -delete                  Delete matched files in one pass.
find . -type f -name '*.tmp' -exec rm {} +            Delete via rm, batched into as few rm invocations as possible.
find . -not -path '*/node_modules/*'                  Hide a subtree from the results (find still walks it; -prune skips it).
find . -type f \( -name '*.ts' -o -name '*.tsx' \)    Multiple name patterns with OR.
find . -type f -name '*.md' -print0 | xargs -0 wc -l  Safe pipeline for filenames with spaces.
find . -mindepth 2 -maxdepth 3 -type f                Depth-bounded search.
find . -type d -empty -delete                         Remove empty directories.

Use -print0 | xargs -0 when filenames may contain spaces, newlines, or quotes.
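
To see why the null-delimited hand-off matters, the sketch below counts how many arguments xargs passes along in each mode. The scratch directory and file names are made up for illustration, and it assumes a find and xargs that support -print0 and -0 (GNU and BSD both do).

```shell
# Throwaway tree; the names are hypothetical and chosen to be awkward.
scratch=$(mktemp -d)
touch "$scratch/plain.md" "$scratch/has space.md" "$scratch/notes.txt"

# Null-delimited: each matching path arrives as exactly one argument.
null_args=$(cd "$scratch" && find . -type f -name '*.md' -print0 | xargs -0 sh -c 'echo "$#"' sh)

# Newline-delimited: xargs also splits on the space inside "has space.md".
naive_args=$(cd "$scratch" && find . -type f -name '*.md' | xargs sh -c 'echo "$#"' sh)

echo "null-delimited: $null_args, naive: $naive_args"   # null-delimited: 2, naive: 3
rm -rf "$scratch"
```

Two files match, but the naive pipeline hands over three arguments because xargs splits on the space in the second name.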

grep

Match patterns in files; use the right engine for the job.

Flag or expression             What it does
grep "pattern" file.txt        Basic search; print matching lines.
grep -r "pattern" .            Recursive search from current directory.
grep -rn "pattern" .           Recursive with line numbers.
grep -rI "pattern" .           Skip binary files.
grep -l "pattern" ./*.md       Print only filenames, not matching lines.
grep -L "pattern" ./*.md       Files that do NOT match.
grep -c "pattern" file         Count matching lines.
grep -v "pattern" file         Invert match; print non-matching lines.
grep -i "pattern" file         Case-insensitive.
grep -w "word" file            Whole-word match.
grep -A 3 -B 2 "pattern" file  3 lines after, 2 lines before each match.
grep -E "(foo|bar)" file       Extended regex; equivalent to egrep.
grep -P "\d{3}-\d{4}" file     PCRE (not portable; GNU grep only).
grep -o "pattern" file         Print only the matched portion.

Prefer rg (ripgrep) over grep -r for speed and .gitignore awareness.
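
A small scratch file makes the difference between counting lines (-c) and counting matches (-o) concrete. The data below is invented, and the pattern uses [0-9] with -E as a portable stand-in for the \d in the -P row.

```shell
# Hypothetical sample data in a scratch file.
sample=$(mktemp)
printf '%s\n' 'alice 555-0100' 'bob n/a' 'carol 555-0199 and 555-0142' > "$sample"

# -c counts matching lines, not matches: carol's line counts once.
matching_lines=$(grep -cE '[0-9]{3}-[0-9]{4}' "$sample")

# -o prints each match on its own line, so counting lines counts matches.
total_matches=$(grep -oE '[0-9]{3}-[0-9]{4}' "$sample" | wc -l)

# -v inverts the match: lines with no phone number at all.
without=$(grep -vE '[0-9]{3}-[0-9]{4}' "$sample")

echo "lines: $matching_lines, unmatched: $without"   # lines: 2, unmatched: bob n/a
rm -f "$sample"
```

Here total_matches comes out as 3, one more than the line count, because one line holds two numbers.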

awk

awk is a field-oriented language; use it for columnar data.

Expression                                    What it does
awk '{print $1}' file                         Print first field (default delimiter: whitespace).
awk '{print $NF}' file                        Print last field.
awk -F: '{print $1}' /etc/passwd              Set field delimiter to :.
awk 'NR==5' file                              Print line 5.
awk 'NR>=3 && NR<=7' file                     Print lines 3 to 7.
awk '/pattern/' file                          Print lines matching a regex.
awk '!/pattern/' file                         Print lines not matching.
awk '{sum += $2} END {print sum}' file        Sum column 2.
awk 'BEGIN {FS=","} {print $1, $3}' file.csv  CSV with explicit FS.
awk '{print $2, $1}' file                     Swap columns 1 and 2.
awk 'length($0) > 80' file                    Lines longer than 80 characters.
awk '!seen[$0]++' file                        Remove duplicate lines (preserving order).
awk '{gsub(/old/, "new"); print}' file        Global substitution in each line.

awk is ideal for log parsing and CSV reshaping when you need field access rather than full regex substitution.
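
The sum and dedup rows can be tried on a tiny invented log. The per-client total is a small extension of the sum recipe using an associative array; the file contents and variable names are assumptions for illustration only.

```shell
# Hypothetical request log: client address, bytes sent.
log=$(mktemp)
printf '%s\n' '10.0.0.1 200' '10.0.0.2 50' '10.0.0.1 300' '10.0.0.2 50' > "$log"

# Sum column 2 across all lines.
total=$(awk '{sum += $2} END {print sum}' "$log")

# Per-key totals: an array keyed on column 1.
# sort makes the order deterministic (awk's for-in order is unspecified).
per_client=$(awk '{bytes[$1] += $2} END {for (ip in bytes) print ip, bytes[ip]}' "$log" | sort)

# Drop repeated lines while keeping first-seen order, then count what is left.
unique=$(awk '!seen[$0]++' "$log" | wc -l)

echo "total bytes: $total"   # total bytes: 600
rm -f "$log"
```

per_client holds one line per address (500 and 100 bytes here), and unique is 3 because the fourth line repeats the second.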

sed

sed edits streams line by line; use it for substitution, deletion, and insertion.

Expression                      What it does
sed 's/old/new/' file           Replace first occurrence per line.
sed 's/old/new/g' file          Replace all occurrences per line.
sed -i 's/old/new/g' file       In-place edit (GNU sed; Linux).
sed -i '' 's/old/new/g' file    In-place edit (BSD sed; macOS).
sed -i.bak 's/old/new/g' file   In-place with backup; works with both GNU and BSD sed.
sed '/pattern/d' file           Delete lines matching a pattern.
sed -n '/start/,/end/p' file    Print lines between two patterns, inclusive.
sed -n '10,20p' file            Print lines 10 to 20.
sed '1d' file                   Delete the first line (header removal).
sed '$d' file                   Delete the last line.
sed 's/^/PREFIX: /' file        Prepend to every line.
sed 's/$/ SUFFIX/' file         Append to every line.
sed '/pattern/a\new line' file  Append a line after each match (GNU sed).
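
The -i.bak form can be exercised safely in a scratch directory. This is a minimal sketch with invented file names and contents; the cleanup step assumes a find that supports -delete (GNU and BSD do).

```shell
# Scratch file with hypothetical content.
workdir=$(mktemp -d)
printf '%s\n' 'old value' 'keep me' 'old old' > "$workdir/config.txt"

# Suffix attached to -i with no space: accepted by both GNU and BSD sed.
sed -i.bak 's/old/new/g' "$workdir/config.txt"

edited=$(cat "$workdir/config.txt")       # the rewritten file
backup=$(cat "$workdir/config.txt.bak")   # the untouched original

# Remove the backup once the edit has been checked.
find "$workdir" -name '*.bak' -delete
remaining=$(find "$workdir" -type f | wc -l)

printf '%s\n' "$edited" | head -n 1   # new value
rm -rf "$workdir"
```

Keeping the backup until the result has been inspected gives a cheap undo: copy the .bak file back over the edited one.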

Common pipelines

Combine the tools for real-world tasks.

Pipeline                                                                                                  What it does
find . -name '*.log' -print0 | xargs -0 grep -l "ERROR"                                                   Find log files that contain “ERROR”.
grep -rn "TODO" . | awk -F: '{print $1}' | sort -u                                                        Files containing TODOs, deduplicated.
find . -name '*.md' -exec grep -l "deprecated" {} +                                                       Markdown files with the word “deprecated”.
grep -rl --null --include='*.py' "old_function" . | xargs -0 sed -i 's/old_function/new_function/g'       Rename a function across Python files (GNU sed -i).
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -n 10                                      Top 10 IPs in an access log.
sed -n '/\[ERROR\]/,/\[INFO\]/p' app.log                                                                  Print each block from an [ERROR] line through the next [INFO] line.
find . -name '*.ts' -print0 | xargs -0 grep -l --null 'console\.log' | xargs -0 sed -i '/console\.log/d'  Remove console.log lines from TypeScript files (GNU sed -i).
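
The top-N counting pipeline is easy to check against a log small enough to count by hand. The log lines below are invented, and only the first field (the client address) is used.

```shell
# Hypothetical access log; only column 1 matters here.
access=$(mktemp)
printf '%s\n' \
  '10.0.0.1 GET /' \
  '10.0.0.2 GET /a' \
  '10.0.0.1 GET /b' \
  '10.0.0.3 GET /' \
  '10.0.0.1 GET /c' \
  '10.0.0.2 GET /d' > "$access"

# uniq -c only merges adjacent duplicates, so sort first;
# the second sort orders by the count that uniq -c prepends.
top=$(awk '{print $1}' "$access" | sort | uniq -c | sort -rn | head -n 2)

# Normalise the padding uniq -c adds so the result is easy to compare or parse.
top_clean=$(printf '%s\n' "$top" | awk '{print $1, $2}')
echo "$top_clean"
# 3 10.0.0.1
# 2 10.0.0.2
rm -f "$access"
```

Skipping the first sort is the usual mistake: uniq -c would then report the same address several times with partial counts.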

Common gotchas

  • sed -i is GNU; sed -i '' is BSD. Use sed -i.bak for portable in-place editing; then remove *.bak with find . -name '*.bak' -delete.
  • find . -name '*.md' -exec cmd {} \; runs cmd once per file. Use + instead of \; to batch files into as few invocations as possible. The ; must be escaped or quoted, otherwise the shell consumes it before find sees it.
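  • An unquoted glob in -name is expanded by the shell before find sees it whenever it matches files in the current directory, so find receives file names instead of the pattern. Always quote the pattern: -name '*.md', not -name *.md.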
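  • grep -r enters node_modules unless you pass --exclude-dir=node_modules, and grep -R also follows every symlink it meets (GNU grep -r follows only symlinks named on the command line). rg skips paths listed in .gitignore and does not follow symlinks by default.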
  • awk '{print $1}' uses whitespace as the delimiter, so any CSV field that contains a space is split incorrectly. -F"," fixes that but still breaks on commas inside quoted fields; handle quotes explicitly or use a proper CSV tool.
  • sed 's/./X/g' replaces every character; . is a regex metacharacter. Escape it as \. to match a literal dot.
  • grep -P (PCRE) is available in GNU grep but not in BSD grep. Use -E with ERE patterns to stay portable, writing [0-9] where PCRE would use \d.
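
Two of the gotchas above, the unescaped dot in sed and default whitespace splitting in awk, can be reproduced on literal strings. The inputs are invented for illustration.

```shell
# Unescaped dot matches any character; escaped dot matches only a literal dot.
any_char=$(printf 'a.b\n' | sed 's/./X/g')
literal=$(printf 'a.b\n' | sed 's/\./X/g')

# Default awk splitting breaks on the space inside the first CSV field;
# -F, splits on commas instead (still naive about commas inside quotes).
by_space=$(printf 'Ada Lovelace,1815\n' | awk '{print $1}')
by_comma=$(printf 'Ada Lovelace,1815\n' | awk -F, '{print $1}')

echo "$any_char $literal | $by_space | $by_comma"   # XXX aXb | Ada | Ada Lovelace
```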