What regex is, and the mindset that matters

A regular expression — regex — is a pattern that describes a set of strings. Instead of searching for one exact word, you describe the shape of what you’re looking for: “an AWS key looks like AKIA followed by sixteen uppercase letters or digits.” Regex is how you find secrets, endpoints, and tokens in a sea of text, and how you read the validation filters that try to stop you.

Here’s the framing to carry through this whole course: regex is a search, extraction, and control-analysis tool. You will read and weaponise far more patterns than you write. Keep that in mind and the priorities fall into place.

You'll learn to

  • Explain what regex is and why it exists
  • Tell literals apart from metacharacters
  • Read and write your first real pattern

Literals vs metacharacters

Every character in a pattern is one of two things: a literal that matches itself, or a metacharacter that has special meaning.

cat        matches the literal text "cat"
c.t        . is a metacharacter meaning "any single character"
           → matches "cat", "cot", "cut", "c9t" ...

In the first pattern, every character is a literal, so it matches exactly “cat”. In the second, the . is a metacharacter that means “any one character” (in most engines, any character except a newline by default). That single distinction is the entire foundation of regex. Learning regex is mostly learning what each metacharacter does.
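To see the difference for yourself, here is a minimal sketch using Python's built-in re module. The word list is made up for illustration, and re.fullmatch is used so that the whole string has to fit the pattern. The last pattern goes one small step beyond the text above: putting a backslash in front of a metacharacter turns it back into a literal.

```python
import re

words = ["cat", "cot", "c9t", "ct", "coat"]

# "cat" is all literals: only the exact text matches.
literal_hits = [w for w in words if re.fullmatch(r"cat", w)]

# "c.t" uses the metacharacter "." to stand for exactly one character.
dot_hits = [w for w in words if re.fullmatch(r"c.t", w)]

# A backslash escapes the dot, so it matches a literal "." only.
escaped_hits = [w for w in ["c.t", "cat"] if re.fullmatch(r"c\.t", w)]

print(literal_hits)   # ['cat']
print(dot_hits)       # ['cat', 'cot', 'c9t']
print(escaped_hits)   # ['c.t']
```

Note that “ct” and “coat” fail against c.t because the dot stands for exactly one character, not zero and not two.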

Character classes — your first real power

Square brackets [ ] define a character class: “match any one of these characters.”

[aeiou]      any one vowel
[0-9]        any one digit (the dash means a range)
[a-z]        any one lowercase letter
[A-Z0-9]     any one uppercase letter OR digit
[^0-9]       any one character that is NOT a digit (^ right after [ means "not")
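A quick way to get a feel for these classes is to run each one over the same string and look at what comes back. This is a rough sketch using Python's re.findall, which returns every match in order; the sample text is invented for the example.

```python
import re

text = "Key7 zone-B"

# Each class matches ONE character at a time, so findall returns single characters.
vowels = re.findall(r"[aeiou]", text)
digits = re.findall(r"[0-9]", text)
lowercase = re.findall(r"[a-z]", text)
upper_or_digit = re.findall(r"[A-Z0-9]", text)
non_digits = re.findall(r"[^0-9]", text)

print(vowels)               # ['e', 'o', 'e']
print(digits)               # ['7']
print(lowercase)            # ['e', 'y', 'z', 'o', 'n', 'e']
print(upper_or_digit)       # ['K', '7', 'B']
print("".join(non_digits))  # Key zone-B
```

Notice that [^0-9] also picks up the space and the dash: “not a digit” means anything else at all, not just letters.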

Now combine that with a quantifier that says how many times to match:

[0-9]+       one or more digits      (+ means "one or more")
[0-9]*       zero or more digits     (* means "zero or more")
[0-9]{16}    exactly 16 digits       ({n} means "exactly n")
[0-9]{8,}    8 or more digits        ({n,} means "n or more")
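The sketch below runs those four quantifiers against a handful of made-up strings so you can compare them side by side. It assumes Python's re.fullmatch, which only succeeds when the entire string fits the pattern; the labels and sample strings are illustrative.

```python
import re

# Sample strings: empty, 1 digit, 7 digits, 8 digits, 16 digits.
samples = ["", "7", "1234567", "12345678", "1234567890123456"]

patterns = {
    "one or more": r"[0-9]+",
    "zero or more": r"[0-9]*",
    "exactly 16": r"[0-9]{16}",
    "8 or more": r"[0-9]{8,}",
}

# For each quantifier, keep the samples that match from start to end.
results = {
    label: [s for s in samples if re.fullmatch(pattern, s)]
    for label, pattern in patterns.items()
}

for label, matched in results.items():
    print(label, matched)
```

The one to watch is *: it is the only quantifier here that accepts the empty string, because “zero or more” includes zero.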

Put those together and you can already describe real things:

AKIA[0-9A-Z]{16}

That reads: the literal text AKIA, followed by exactly sixteen characters that are each a digit or an uppercase letter. That’s the shape of a long-term AWS access key ID: a genuinely useful pattern, built from ideas you just learned.
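Here is a small sketch of that pattern at work, assuming Python's re module. The sample text is invented, and the key in it is the placeholder value AWS uses in its own documentation, not a real credential.

```python
import re

AWS_KEY_ID = re.compile(r"AKIA[0-9A-Z]{16}")

# Invented config snippet; the key is AWS's documented example value.
sample = """
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
note: akiaiosfodnn7example is lowercase, so it should not match
short = AKIA1234
"""

found = AWS_KEY_ID.findall(sample)
print(found)  # ['AKIAIOSFODNN7EXAMPLE']
```

The lowercase copy fails because the pattern is case-sensitive, and AKIA1234 fails because {16} demands sixteen characters after the prefix, not four.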

Why “read more than you write” matters

Most of the regex you encounter in security work is already written — it’s the validation rule in an app’s source, the WAF signature blocking your payload, the detection rule in a SIEM. Your job is often to read that pattern and find what it doesn’t cover.
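To make “find what it doesn’t cover” concrete, here is a sketch of a hypothetical username filter. The function names and the filter itself are made up for illustration; the assumption is that its author meant “lowercase letters only” but checked the input with a search, which succeeds if the pattern matches anywhere in the value.

```python
import re

# Hypothetical filter: the author intended "lowercase letters only".
USERNAME_FILTER = re.compile(r"[a-z]+")

def naive_is_valid(value: str) -> bool:
    # search() succeeds if the pattern matches ANYWHERE in the value.
    return USERNAME_FILTER.search(value) is not None

def strict_is_valid(value: str) -> bool:
    # fullmatch() requires the WHOLE value to fit the pattern.
    return USERNAME_FILTER.fullmatch(value) is not None

for candidate in ["alice", "12345", "alice<b>", "ALICE!x"]:
    print(candidate, naive_is_valid(candidate), strict_is_valid(candidate))
```

The naive check accepts “alice<b>” and even “ALICE!x”, because a single lowercase letter somewhere in the input is enough to satisfy it. Reading the pattern together with how it is applied is what exposes the gap.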

Checkpoint

A pattern matches three uppercase letters followed by four digits. Write that pattern, and give an example string it matches.

Try it yourself

Write a pattern that matches a US-style ZIP code: exactly five digits. Then extend it to optionally allow the “+4” form (five digits, a dash, four digits). Hint: you’ll need {5}, plus ? (which makes the item right before it optional) and parentheses ( ) to group the dash and the four digits so that ? applies to all of them. Test it mentally against “90210” and “90210-1234”.
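Since ? is new here, the sketch below shows it on two unrelated, made-up patterns rather than on the ZIP exercise itself. It assumes Python's re.fullmatch, so the whole string has to fit.

```python
import re

# "?" makes the single item right before it optional (zero or one "u").
single = re.compile(r"colou?r")

# Parentheses group several items so "?" applies to the whole group.
grouped = re.compile(r"file(-[0-9]{2})?")

single_hits = [s for s in ["color", "colour", "colouur"] if single.fullmatch(s)]
grouped_hits = [s for s in ["file", "file-07", "file-7", "file07"] if grouped.fullmatch(s)]

print(single_hits)   # ['color', 'colour']
print(grouped_hits)  # ['file', 'file-07']
```

Once you have a candidate ZIP pattern, you can swap it into the same kind of check instead of testing it only in your head.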

Summary

Regex describes the shape of text rather than exact words. Every character is either a literal (matches itself) or a metacharacter (has special meaning). Character classes [ ] match one of a set; quantifiers (+, *, {n}) say how many times. Together they build real patterns like AKIA[0-9A-Z]{16}. Above all, you’ll read more patterns than you write — and reading a filter to find its gaps is the core offensive skill.

Key takeaways

  • A literal matches itself; a metacharacter has special meaning.
  • [ ] is a character class; [0-9], [a-z], [^...] are the common forms.
  • Quantifiers: + (one or more), * (zero or more), {n} (exactly n), {n,} (n or more).
  • Reading a filter to find what it missed is the highest-value regex skill.

Next, anchors and boundaries — how you pin a pattern to the start or end of a line, which is what makes validation (and validation bypasses) work.
