Regex › Regex DoS (ReDoS)
ReDoS and catastrophic backtracking
A regular expression can itself be a denial-of-service vulnerability. Certain patterns take exponential time on crafted inputs, freezing the thread that runs them. This lesson explains catastrophic backtracking, how to recognise the vulnerable shapes, and how to fix them — a genuine, reportable bug class.
You'll learn to
- Explain backtracking and why it explodes
- Recognise the vulnerable pattern shapes
- Know the discovery and fix approaches
Why backtracking explodes
Most regex engines backtrack: when a match fails, they back up and try other ways to match. Usually fine — but certain patterns create an exponential number of ways to try, so on an input that almost matches, the engine tries them all before giving up.
Vulnerable: (a+)+$ Input: "aaaaaaaaaaaaaaaaaaaaaa!"
The nested quantifier lets the engine split the a's in exponentially
many ways. With ~30 a's and a trailing non-match, it can run for
seconds; with more, effectively forever.
The trailing ! ensures the match ultimately fails, which forces the engine to exhaust every combination. The nested repeat — a quantified group that’s itself quantified — is what makes the number of combinations explode.
The vulnerable shapes
Watch for these patterns:
(a+)+ (x+)* (a|a)+ (a|ab)+ (.*a){n}
Common in the wild:
^(\w+\s?)*$ (validating words)
^(\d+)+$ (validating numbers)
(email/URL regexes with nested quantifiers and alternation)
Checkpoint
Why does a pattern like (a+)+ run for an exponential amount of time on an input like many a's followed by a non-matching character?
The nested quantifier — a repeat (the inner a+) inside another repeat (the outer +) — means the engine can divide the run of a's among the groups in exponentially many ways. Because the trailing non-matching character makes the overall match fail, the engine backtracks and tries every one of those combinations before concluding it can't match. As the number of a's grows, the number of combinations grows exponentially, so even a short input can take seconds or minutes — that's catastrophic backtracking, the basis of ReDoS.
Try it yourself
Identify which of these patterns are ReDoS-prone and why: a simple character class repeated once, versus a quantified group that is itself quantified, versus an alternation where both branches can match the same text. Then state two ways to make a vulnerable one safe.
Key takeaways
- Backtracking engines can try exponentially many ways to match.
- Nested quantifiers and ambiguous alternations cause catastrophic backtracking.
- A short crafted input can hang the thread running the regex — a real DoS.
- Fix by removing nesting, using atomic/possessive forms, or a non-backtracking engine (RE2).
Quick quiz
Next, Python regex for security automation — the re module applied across recon and log parsing.