Regex › Regex Engine Internals

Regex engine internals: NFA, DFA, and backtracking

4 min read Advanced 3 sections

To truly understand regex performance and ReDoS, you need to know how engines work underneath. There are two fundamental approaches — backtracking and automaton-based — and the choice explains everything from why some features exist to why some patterns explode. This lesson is the theory that makes the rest click.

You'll learn to

  • Distinguish backtracking from automaton engines
  • See why each has its trade-offs
  • Connect internals to ReDoS and RE2

Two kinds of engine

Backtracking (NFA-based):  PCRE, Python re, JS, Java, .NET, Perl
  - tries one path, backs up on failure, tries another
  - SUPPORTS backreferences and lookarounds
  - CAN explode exponentially (catastrophic backtracking / ReDoS)

Automaton (DFA-based):     RE2 (Go), Rust regex, grep (often)
  - tracks all possible matches simultaneously
  - LINEAR time always, no catastrophic backtracking
  - CANNOT do backreferences or lookarounds

The split is fundamental. Backtracking engines explore match paths one at a time and retreat when stuck — flexible (they support backreferences and lookarounds) but vulnerable to exponential blowup. Automaton engines track all possibilities at once — always linear-time, but unable to support features that need backtracking.

Why this explains everything

Checkpoint

Why can a backtracking engine suffer catastrophic backtracking while an automaton engine like RE2 cannot?

Try it yourself

Explain in your own words why backreferences exist only in backtracking engines and not in RE2, and why ReDoS is impossible in an automaton engine. Then state, for a server-side validator over untrusted input, which engine type you’d choose and what you’d give up.

Key takeaways

  • Backtracking (NFA) engines try paths and retreat — flexible but can explode.
  • Automaton (DFA) engines track all states at once — linear time, fewer features.
  • Backreferences and lookarounds need backtracking; RE2 lacks them by design.
  • Engine choice is a security decision: guaranteed performance vs full features.

Quick quiz

Finally, the expert offensive synthesis — how elite practitioners wield regex across every security role.

Was this lesson helpful?