What regex is, and the mindset that matters
A regular expression — regex — is a pattern that describes a set of strings. Instead of searching for one exact word, you describe the shape of what you’re looking for: “an AWS key looks like AKIA followed by sixteen uppercase letters or digits.” Regex is how you find secrets, endpoints, and tokens in a sea of text, and how you read the validation filters that try to stop you.
Here’s the framing to carry through this whole course: regex is a search, extraction, and control-analysis tool. You will read and weaponise far more patterns than you write. Keep that in mind and the priorities fall into place.
You'll learn to
- Explain what regex is and why it exists
- Tell literals apart from metacharacters
- Read and write your first real pattern
Literals vs metacharacters
Every character in a pattern is one of two things: a literal that matches itself, or a metacharacter that has special meaning.
cat    matches the literal text "cat"
c.t    . is a metacharacter meaning "any single character"
       → matches "cat", "cot", "cut", "c9t" ...
In the first pattern, every character is a literal — it matches exactly “cat.” In the second, the . is a metacharacter that means “any one character.” That single distinction is the entire foundation of regex. Learning regex is mostly learning what each metacharacter does.
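The difference between the two patterns can be checked directly. The sketch below uses Python's standard re module; the word list is made up for illustration, and re.fullmatch is chosen here so that the whole string has to fit the pattern.

```python
import re

# Illustrative sample strings (not from any real data set).
words = ["cat", "cot", "cut", "c9t", "ct", "dog"]

# "cat" is all literals: only the exact text "cat" fits.
literal_hits = [w for w in words if re.fullmatch(r"cat", w)]

# "c.t" contains the metacharacter ".", which stands for any one character.
dot_hits = [w for w in words if re.fullmatch(r"c.t", w)]

print(literal_hits)  # ['cat']
print(dot_hits)      # ['cat', 'cot', 'cut', 'c9t']
```

Note that "ct" is rejected by c.t: the dot must consume exactly one character, so it cannot match nothing.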
Character classes — your first real power
Square brackets [ ] define a character class: “match any one of these characters.”
[aeiou] any one vowel
[0-9] any one digit (the dash means a range)
[a-z] any one lowercase letter
[A-Z0-9] any one uppercase letter OR digit
[^0-9] any character that is NOT a digit (^ inside [] means 'not')
Now combine that with a quantifier that says how many times to match:
[0-9]+ one or more digits (+ means "one or more")
[0-9]* zero or more digits (* means "zero or more")
[0-9]{16} exactly 16 digits ({n} means "exactly n")
[0-9]{8,} 8 or more digits ({n,} means "n or more")
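As a quick way to check the classes and quantifiers listed above, here is a small Python sketch. The helper name matches is hypothetical, introduced only for this example; it wraps re.fullmatch so that each test asks whether the entire string fits the pattern.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True when the whole of `text` fits `pattern`."""
    return re.fullmatch(pattern, text) is not None

# Character classes match exactly one character from the set.
print(matches(r"[aeiou]", "e"))      # True
print(matches(r"[A-Z0-9]", "7"))     # True
print(matches(r"[^0-9]", "5"))       # False: ^ inside [] negates the set

# Quantifiers say how many times the class may repeat.
print(matches(r"[0-9]+", "2024"))    # True
print(matches(r"[0-9]+", ""))        # False: + needs at least one digit
print(matches(r"[0-9]*", ""))        # True: * accepts zero digits
print(matches(r"[0-9]{16}", "1234567890123456"))  # True: exactly 16
print(matches(r"[0-9]{16}", "123456789012345"))   # False: only 15
print(matches(r"[0-9]{8,}", "123456789"))         # True: 9 is 8 or more
```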
Put those together and you can already describe real things:
AKIA[0-9A-Z]{16}
That reads: the literal text AKIA, followed by exactly sixteen characters that are each a digit or uppercase letter. That’s the shape of an AWS access key — a genuinely useful pattern, built from ideas you just learned.
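To see the pattern doing extraction work, the sketch below searches a block of text for anything with that shape. The sample text is invented for illustration, and the key in it is the placeholder value AWS uses in its own documentation, not a live credential.

```python
import re

# The pattern from the lesson: literal AKIA, then 16 digits or uppercase letters.
AWS_KEY = re.compile(r"AKIA[0-9A-Z]{16}")

# Invented config-style text; the key is AWS's documented example value.
sample = (
    "region=us-east-1\n"
    "aws_access_key_id=AKIAIOSFODNN7EXAMPLE\n"
    "note=akiaiosfodnn7example is lowercase, so it is ignored\n"
)

# findall scans the whole text and returns every substring with that shape.
found = AWS_KEY.findall(sample)
print(found)  # ['AKIAIOSFODNN7EXAMPLE']
```

The lowercase string is skipped because both the literal AKIA and the class [0-9A-Z] are case-sensitive by default.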
Why “read more than you write” matters
Most of the regex you encounter in security work is already written — it’s the validation rule in an app’s source, the WAF signature blocking your payload, the detection rule in a SIEM. Your job is often to read that pattern and find what it doesn’t cover.
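Here is a small illustration of reading a rule for its gaps. The detection rule is hypothetical, made up for this example rather than taken from any real product: it flags 16 consecutive digits, as a naive card-number detector might. The number used is a widely published test value, not a real card.

```python
import re

# Hypothetical detection rule: 16 digits in a row.
rule = re.compile(r"[0-9]{16}")

samples = [
    "4111111111111111",     # digits only
    "4111 1111 1111 1111",  # same number, grouped with spaces
    "4111-1111-1111-1111",  # same number, grouped with dashes
]

# search looks for the pattern anywhere inside each string.
caught = [s for s in samples if rule.search(s)]
missed = [s for s in samples if not rule.search(s)]

print(caught)  # ['4111111111111111']
print(missed)  # ['4111 1111 1111 1111', '4111-1111-1111-1111']
```

Reading the pattern tells you why: [0-9] admits no spaces or dashes, so any separator breaks the run of sixteen and the rule never fires.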
Checkpoint
A pattern matches three uppercase letters followed by four digits. Write that pattern, and give an example string it matches.
The pattern is [A-Z]{3}[0-9]{4}: the class [A-Z] repeated exactly three times, then the class [0-9] repeated exactly four times. It matches strings like ABC1234 or XYZ0001. It's the kind of pattern you'd use for a structured ID or reference code.
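One way to check a checkpoint answer is to run it against strings that should and should not fit. The sketch below assumes the answer is written as [A-Z]{3}[0-9]{4}, using the {n} quantifier from this lesson; the test strings are made up.

```python
import re

# Three uppercase letters, then four digits.
ID_PATTERN = re.compile(r"[A-Z]{3}[0-9]{4}")

candidates = ["ABC1234", "XYZ0001", "AB1234", "abc1234", "ABC123"]

# fullmatch requires the entire string to fit the pattern.
valid = [c for c in candidates if ID_PATTERN.fullmatch(c)]
print(valid)  # ['ABC1234', 'XYZ0001']
```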
Try it yourself
Write a pattern that matches a US-style ZIP code: exactly five digits. Then extend it to optionally allow the “+4” form (five digits, a dash, four digits). Hint: you’ll need {5}, plus ? (which makes the item before it optional) applied to a parenthesised group for the dash-and-four-digits part. Test it mentally against “90210” and “90210-1234”.
Summary
Regex describes the shape of text rather than exact words. Every character is either a literal (matches itself) or a metacharacter (has special meaning). Character classes [ ] match one of a set; quantifiers (+, *, {n}) say how many times. Together they build real patterns like AKIA[0-9A-Z]{16}. Above all, you’ll read more patterns than you write — and reading a filter to find its gaps is the core offensive skill.
Key takeaways
- A literal matches itself; a metacharacter has special meaning.
- [ ] is a character class; [0-9], [a-z], [^...] are the common forms.
- Quantifiers: + (one or more), * (zero or more), {n} (exactly n).
- Reading a filter to find what it missed is the highest-value regex skill.
Next, anchors and boundaries — how you pin a pattern to the start or end of a line, which is what makes validation (and validation bypasses) work.