Reading WAF rules to find the gaps
A Web Application Firewall is, at heart, a big collection of regex rules that block known attack patterns. Because they’re blocklists, they have gaps — and reading the rules (or inferring them by probing) reveals exactly where. This lesson is about thinking like the rule author to find what they forgot.
You'll learn to
- Understand how WAF regex rules work
- Infer rules by probing
- Find the gap a rule leaves open
What a WAF rule looks like
WAF rules (like ModSecurity’s) are regex patterns that match malicious input and block the request. For example, a SQLi rule might look for union-select sequences:
Simplified WAF rules (one pattern per line):
SQLi: (?i)union.*select
SQLi: (?i)or\s+1\s*=\s*1
XSS: (?i)<script
XSS: (?i)on\w+\s*=
LFI: \.\./
LFI: etc/passwd
Each rule blocks a specific shape of attack. And each rule’s exact wording is also its weakness: whatever the pattern doesn’t cover gets through.
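To make the idea concrete, here is a minimal sketch of a blocklist check built from the simplified rules above. It uses Python's re module as a stand-in for a WAF engine; the names RULES and matched_rules are hypothetical, and a real WAF adds decoding, transformations, and scoring that this sketch leaves out.

```python
import re

# Simplified, illustrative patterns taken from the lesson text.
# Real rule sets are far larger and more carefully written.
RULES = {
    "SQLi": [r"(?i)union.*select", r"(?i)or\s+1\s*=\s*1"],
    "XSS": [r"(?i)<script", r"(?i)on\w+\s*="],
    "LFI": [r"\.\./", r"etc/passwd"],
}

# Compile once up front, as an engine would when loading its rules.
COMPILED = {name: [re.compile(p) for p in pats] for name, pats in RULES.items()}


def matched_rules(value: str) -> list[str]:
    """Return the rule categories with at least one pattern found in value."""
    return [
        name
        for name, patterns in COMPILED.items()
        if any(p.search(value) for p in patterns)
    ]


samples = [
    "id=1 UNION SELECT password FROM users",
    "<img src=x onerror=alert(1)>",
    "file=../../etc/passwd",
    "q=hello world",
]
for sample in samples:
    print(sample, "->", matched_rules(sample) or "allowed")
```

Anything that returns an empty list is forwarded to the application, which is why the exact wording of each pattern matters so much.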
Finding the gap
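Rule: (?i)union.*select (blocks 'union select')
Gaps to probe:
union/**/select (comment instead of a space: .* still matches it, but a stricter union\s+select rule would miss it)
union%0aselect (newline: does . match it? often not!)
UNIunionON SELselectECT (nested keywords: only works if the filter strips the keywords in a single pass instead of blocking the request)
union all select (extra keyword in between: .* should still match, so expect a block, but test it)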
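The probes above can be checked offline before touching a live target. The following is a rough sketch that assumes the WAF URL-decodes the input once and then applies the rule with default flags; is_blocked and PROBES are hypothetical names, and Python's re module stands in for whatever engine the WAF really uses.

```python
import re
from urllib.parse import unquote

# The rule under test, exactly as written in the lesson.
RULE = re.compile(r"(?i)union.*select")

PROBES = [
    "union select",
    "union/**/select",
    "union%0aselect",
    "UNIunionON SELselectECT",
    "union all select",
]


def is_blocked(raw: str) -> bool:
    """Assumption: the WAF URL-decodes once, then searches for the rule."""
    return RULE.search(unquote(raw)) is not None


for probe in PROBES:
    print(f"{probe!r:30} blocked={is_blocked(probe)}")
```

Under these assumptions only the newline probe gets through: the comment, nested, and "all" variants still contain 'union', some characters, then 'select' on one line, so this particular rule catches them. A different rule, or a filter that rewrites input instead of blocking it, would give a different result, which is why each probe is worth testing.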
Checkpoint
Why can a WAF rule like union.*select often be bypassed by inserting a newline between the two keywords?
Because in most regex flavours the dot does not match a newline character by default. The rule's .* matches any characters except a line break, so when an attacker puts a newline between 'union' and 'select', the pattern can't span across it and fails to match — the malicious input passes through. The fix on the defensive side is to normalise input (collapse or strip newlines) before matching and to use flags that let the pattern cross line breaks.
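The two defensive fixes mentioned in the answer can be sketched as follows, again using Python's re module as a stand-in. The names NAIVE, DOTALL_RULE, and normalise are hypothetical, and the single-pass URL decoding is an assumption; a production WAF would apply its own transformation chain.

```python
import re
from urllib.parse import unquote

# Original rule: the dot stops at line breaks by default.
NAIVE = re.compile(r"(?i)union.*select")

# Same rule with the s (DOTALL) flag, so the dot also matches newlines.
DOTALL_RULE = re.compile(r"(?is)union.*select")


def normalise(raw: str) -> str:
    """URL-decode once, then collapse every whitespace run to one space."""
    decoded = unquote(raw)
    return re.sub(r"\s+", " ", decoded)


payload = "id=1%20union%0aselect%20password"
decoded = unquote(payload)

print("naive rule, decoded only:  ", bool(NAIVE.search(decoded)))
print("DOTALL rule, decoded only: ", bool(DOTALL_RULE.search(decoded)))
print("naive rule, normalised:    ", bool(NAIVE.search(normalise(payload))))
```

Either change on its own catches this payload; combining them is the safer choice, since normalisation also helps every other rule that runs on the same input.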
Try it yourself
Given a WAF rule that blocks the case-insensitive sequence of ‘union’ followed by anything followed by ‘select’, list four payload variations you’d try to bypass it — thinking about newlines, comments, encoding, and case. Then explain, for the defender, how you’d rewrite the rule to close those gaps.
Key takeaways
- WAFs are regex blocklists; each rule blocks one specific attack shape.
- A rule’s wording is its weakness — what it doesn’t match gets through.
- The dot usually doesn’t match newlines, a classic bypass for sequence rules.
- Defensively: normalise and decode input before matching; don’t rely on the WAF alone.
Next, the bypass techniques themselves — the filter-versus-interpreter gap that defeats validation.