Regex › Regex DoS (ReDoS)

ReDoS and catastrophic backtracking

4 min read Advanced 3 sections

A regular expression can itself be a denial-of-service vulnerability. Certain patterns take exponential time on crafted inputs, freezing the thread that runs them. This lesson explains catastrophic backtracking, how to recognise the vulnerable shapes, and how to fix them — a genuine, reportable bug class.

You'll learn to

  • Explain backtracking and why it explodes
  • Recognise the vulnerable pattern shapes
  • Know the discovery and fix approaches

Why backtracking explodes

Most regex engines backtrack: when a match fails, they back up and try other ways to match. Usually fine — but certain patterns create an exponential number of ways to try, so on an input that almost matches, the engine tries them all before giving up.

Vulnerable:  (a+)+$        Input: "aaaaaaaaaaaaaaaaaaaaaa!"
The nested quantifier lets the engine split the a's in exponentially
many ways. With ~30 a's and a trailing non-match, it can run for
seconds; with more, effectively forever.

The trailing ! ensures the match ultimately fails, which forces the engine to exhaust every combination. The nested repeat — a quantified group that’s itself quantified — is what makes the number of combinations explode.

The vulnerable shapes

Watch for these patterns:
  (a+)+        (x+)*        (a|a)+        (a|ab)+      (.*a){n}
Common in the wild:
  ^(\w+\s?)*$            (validating words)
  ^(\d+)+$               (validating numbers)
  (email/URL regexes with nested quantifiers and alternation)

Checkpoint

Why does a pattern like (a+)+ run for an exponential amount of time on an input like many a's followed by a non-matching character?

Try it yourself

Identify which of these patterns are ReDoS-prone and why: a simple character class repeated once, versus a quantified group that is itself quantified, versus an alternation where both branches can match the same text. Then state two ways to make a vulnerable one safe.

Key takeaways

  • Backtracking engines can try exponentially many ways to match.
  • Nested quantifiers and ambiguous alternations cause catastrophic backtracking.
  • A short crafted input can hang the thread running the regex — a real DoS.
  • Fix by removing nesting, using atomic/possessive forms, or a non-backtracking engine (RE2).

Quick quiz

Next, Python regex for security automation — the re module applied across recon and log parsing.

Was this lesson helpful?