Command-line regex: BRE, ERE, and PCRE
On the command line, the same pattern can work in one tool and fail in another, because grep, sed, and awk default to different regex dialects. Knowing the three (BRE, ERE, and PCRE) saves you from the maddening ‘my regex works in Python but not in grep’ problem.
You'll learn to
- Tell BRE, ERE, and PCRE apart
- Choose the right flag for each tool
- Write portable command-line patterns
The three dialects
BRE (Basic): grep, sed by default
( ) { } are LITERAL unless backslash-escaped: \( \) \{ \}
+ ? | are LITERAL; the escaped forms \+ \? \| are GNU extensions, not POSIX
ERE (Extended): grep -E, sed -E, awk
+ ? | ( ) { } work as metacharacters directly
PCRE (Perl): grep -P (where available)
full features: \d \w lookarounds, like Python/JS
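The big gotcha is BRE: in basic mode, +, ?, |, (), and {} are literal characters. Grouping and intervals get their special meaning when backslash-escaped as \( \) and \{ \}, which POSIX defines. The escaped forms \+, \?, and \| also work in GNU grep and GNU sed, but they are extensions that other implementations are not required to support. In ERE (-E), all of these work directly without backslashes. PCRE (-P) adds Perl features such as \d and lookarounds.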
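To make the escaping rule concrete, here is a minimal sketch that runs the same grouping-and-interval pattern in BRE and ERE, then shows what unescaped braces do in BRE. The sample lines and variable names are made up for illustration.

```shell
# Made-up sample input, one candidate per line.
sample=$(printf '%s\n' 'abab' 'ab' 'a{2}' 'aa')

# BRE: \( \) group and \{ \} set the repetition count.
bre_group=$(printf '%s\n' "$sample" | grep '^\(ab\)\{2\}$')

# ERE: the same pattern with no backslashes.
ere_group=$(printf '%s\n' "$sample" | grep -E '^(ab){2}$')

# BRE with unescaped braces: { and } are literal,
# so this matches the text "a{2}", not the line "aa".
bre_literal=$(printf '%s\n' "$sample" | grep '^a{2}$')

echo "BRE group:   $bre_group"
echo "ERE group:   $ere_group"
echo "BRE literal: $bre_literal"
```

Both grouped patterns select the line abab, while the unescaped BRE pattern selects only the literal line a{2}.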
The same pattern, three ways
# Match one or more digits:
grep '[0-9][0-9]*' file # BRE: no portable +, so repeat manually
grep -E '[0-9]+' file # ERE: + works
grep -P '\d+' file # PCRE: \d works
# Alternation:
grep -E 'cat|dog' file # ERE
grep 'cat\|dog' file # BRE: escape the pipe (\| is a GNU extension, not POSIX)
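The same split shows up in sed and awk. As a rough sketch, the commands below pull the first run of digits out of a line three ways: sed in its default BRE mode, sed -E, and awk, which always uses ERE. The sample line and variable names are illustrative only.

```shell
line='order 66 shipped'

# sed (BRE): \( \) capture, and [0-9][0-9]* stands in for "one or more digits".
sed_bre=$(printf '%s\n' "$line" | sed 's/[^0-9]*\([0-9][0-9]*\).*/\1/')

# sed -E (ERE): unescaped ( ) and + do the same job.
sed_ere=$(printf '%s\n' "$line" | sed -E 's/[^0-9]*([0-9]+).*/\1/')

# awk uses ERE, so + needs no flag and no escaping.
awk_ere=$(printf '%s\n' "$line" |
  awk 'match($0, /[0-9]+/) { print substr($0, RSTART, RLENGTH) }')

echo "$sed_bre $sed_ere $awk_ere"
```

All three extract 66 from the sample line; only the amount of backslashing differs.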
Checkpoint
Why does grep '[0-9]+' (without -E) often return nothing, while grep -E '[0-9]+' works?
Plain grep uses Basic Regular Expressions (BRE), where the plus sign is a literal character, not a quantifier. So grep '[0-9]+' looks for a digit followed by an actual '+' symbol, which rarely appears in the text, returning nothing. The -E flag switches to Extended Regular Expressions (ERE), where + means 'one or more' as expected, so grep -E '[0-9]+' correctly matches runs of digits. The dialect difference is the cause.
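You can watch this happen with a two-line input, one of which contains a real plus sign. This is a small sketch with made-up sample text:

```shell
sample=$(printf '%s\n' 'abc123' '5+3=8')

# BRE: + is literal, so only a digit followed by an actual "+" matches.
bre=$(printf '%s\n' "$sample" | grep '[0-9]+')

# ERE: + is a quantifier, so every line containing a digit matches.
ere_count=$(printf '%s\n' "$sample" | grep -E -c '[0-9]+')

echo "BRE matched: $bre"
echo "ERE matched $ere_count lines"
```

Plain grep returns only 5+3=8, because that is the only line where a digit sits directly before a plus sign. With -E, both lines match.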
Try it yourself
Write a pattern to match one or more digits three ways: in BRE (plain grep), in ERE (grep -E), and in PCRE (grep -P). Then write an alternation matching ‘cat’ or ‘dog’ in both BRE and ERE, noting where you must escape the pipe.
Key takeaways
- BRE (plain grep/sed): + ? | ( ) { } are literal; \( \) \{ \} are portable, while \+ \? \| are GNU extensions.
- ERE (grep -E, awk): those metacharacters work directly.
- PCRE (grep -P): full Perl features like \d and lookarounds.
- Default to -E; suspect the dialect when a pattern matches nothing.
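Because grep -P is not available in every grep build, a script that prefers PCRE may want a fallback. The sketch below is one possible approach, not a standard idiom: match_digits is a hypothetical helper name, and the probe simply checks whether the local grep accepts -P before using it.

```shell
# Hypothetical helper: print lines containing digits,
# using PCRE when this grep supports -P and ERE otherwise.
match_digits() {
  if printf '1\n' | grep -qP '\d' 2>/dev/null; then
    grep -P '\d+' "$@"
  else
    grep -E '[0-9]+' "$@"
  fi
}

out=$(printf '%s\n' 'no digits here' 'room 42' | match_digits)
echo "$out"
```

Either branch selects room 42 from the sample input. The ERE branch is the portable one, which is why -E is the safer default.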
Next, Go regex — the RE2 engine and its linear-time guarantee for security tooling.