Cross-language regex quirks

When you review source code in several languages, you run into differences between their regex engines, and some of those differences are security-relevant. This lesson covers the cross-language quirks that matter: dangerous functions, anchor behaviour, and the features that turn a regex into a vulnerability.

You'll learn to

Know the dangerous regex functions per language
Watch anchor and multiline differences
Spot regex-driven vulnerabilities in source

Dangerous regex functions

PHP:    preg_replace with the /e modifier (legacy, pre-PHP 7) = code execution
        preg_match without anchors = validation bypass
Perl:   (?{ ... }) embedded code in an attacker-influenced pattern = code execution
Java:   Pattern with catastrophic backtracking = ReDoS (very common)
Ruby:   ^ and $ match line boundaries, NOT the whole string -> bypass with newlines
.NET:   $ also matches before a final newline; RegexOptions.Multiline makes ^ and $ line-anchored
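The unanchored preg_match entry in the list above applies to any language whose match function searches for the pattern anywhere in the input. The following is a minimal sketch in Python rather than PHP: re.search stands in for an unanchored preg_match, and the validator names and pattern are hypothetical.

```python
import re

# Illustrative pattern only: "lowercase letters".
USERNAME = re.compile(r"[a-z]+")

def is_valid_unanchored(value: str) -> bool:
    # re.search succeeds if the pattern matches anywhere in the input,
    # which mirrors an unanchored preg_match call.
    return USERNAME.search(value) is not None

def is_valid_anchored(value: str) -> bool:
    # re.fullmatch requires the entire input to match the pattern.
    return USERNAME.fullmatch(value) is not None

payload = "abc; rm -rf /"
print(is_valid_unanchored(payload))  # True: "abc" alone satisfies the pattern
print(is_valid_anchored(payload))    # False: the rest of the input is checked too
```

In review, a validation call that has no string anchors and does not use a whole-string match function deserves a closer look.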

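Some languages have regex features that execute code. PHP's preg_replace /e modifier (deprecated in PHP 5.5 and removed in PHP 7.0, so it survives only in legacy code) evaluates the replacement string as PHP code, and Perl's (?{ ... }) construct runs embedded Perl code. Perl rejects that construct in patterns interpolated at run time unless use re 'eval' is in effect. Both are direct code-execution sinks when fed attacker input, and both are high-severity findings in source review.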
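The Java entry in the list above describes a different failure mode: catastrophic backtracking. The effect is not specific to Java; it can occur in any backtracking engine. The sketch below uses Python's re module as a stand-in, with a textbook nested-quantifier pattern chosen for illustration. Timings depend on the machine, so none are shown.

```python
import re
import time

# Nested quantifiers: a run of "a" can be split between the inner and
# outer "+" in many ways, and a failing match tries those splits.
EVIL = re.compile(r"^(a+)+$")

def time_failed_match(n: int) -> float:
    """Return the seconds spent failing to match n 'a's followed by '!'."""
    subject = "a" * n + "!"  # the trailing "!" forces the match to fail
    start = time.perf_counter()
    result = EVIL.match(subject)
    elapsed = time.perf_counter() - start
    assert result is None
    return elapsed

# Input sizes are kept small on purpose; the time grows quickly with n.
for n in (14, 16, 18, 20):
    print(n, f"{time_failed_match(n):.4f}s")
```

If attacker-controlled input reaches a pattern of this shape, a short string can be enough to tie up a worker thread.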

The Ruby/multiline anchor trap

In Ruby, ^ and $ anchor to LINE boundaries, not the whole string.
A validator:  /^[a-z]+$/  in Ruby
Bypass:       "safe\n<script>"  -> ^ and $ match the 'safe' line alone,
              so the second line is never checked and passes validation.
The string-anchoring equivalents are \A and \z (\Z also allows one trailing newline).
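The trap can be reproduced in Python as a rough sketch. This assumes that Python's re.MULTILINE flag is an adequate stand-in for Ruby's default line-anchored ^ and $. The payload is illustrative.

```python
import re

# re.MULTILINE makes ^ and $ match at line boundaries, which is how
# Ruby's ^ and $ behave by default. It is used here as a stand-in for Ruby.
LINE_ANCHORED = re.compile(r"^[a-z]+$", re.MULTILINE)

# \A and \Z are Python's string-boundary anchors. Python's \Z matches only
# at the very end of the string, like Ruby's \z.
STRING_ANCHORED = re.compile(r"\A[a-z]+\Z")

# Python's default ^ and $ anchor to the string, but $ still matches
# just before a single trailing newline.
DEFAULT = re.compile(r"^[a-z]+$")

payload = "safe\n<script>alert(1)</script>"

print(LINE_ANCHORED.search(payload) is not None)    # True: only "safe" is checked
print(STRING_ANCHORED.search(payload) is not None)  # False: whole string is checked
print(DEFAULT.search(payload) is not None)          # False
print(DEFAULT.search("safe\n") is not None)         # True: trailing newline tolerated
print(STRING_ANCHORED.search("safe\n") is not None) # False
```

The last two lines show why even string-anchored ^ and $ are a weaker check than \A with a strict end-of-string anchor.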

Checkpoint

Why is a /^[a-z]+$/ validator a potential bypass in Ruby specifically, and what's the fix?

Try it yourself

Explain why the same caret-dollar validation pattern is anchored to the string by default in Python (where $ still tolerates one trailing newline) but line-anchored in Ruby, and what a multiline bypass against the Ruby version looks like. Then name the two anchors Ruby uses for true string boundaries.

Key takeaways

Some languages’ regex features can execute code (PHP /e modifier, Perl embedded code).
Ruby’s ^ and $ anchor to lines, not the string, which creates a multiline bypass risk.
Use \A and \z for true string anchoring in Ruby.
Don’t assume a pattern that is safe in one language behaves the same in another.

Next: regex engine internals, and how NFA, DFA, and backtracking actually work under the hood.