Building a custom web scanner
A scanner sounds complex, but it’s a simple idea: send crafted requests and notice when a response differs from what you’d expect. The skill isn’t the loop — it’s choosing the difference that signals a bug. This lesson builds a focused scanner from the requests skills you already have.
You'll learn to
- Structure a scanner as a differential detector
- Spot reflection, error, and timing signals
- Keep it robust across many targets
A scanner is a differential detector
Every useful scanner asks: does this response differ from the baseline in a way that suggests a vulnerability? Three classic differences are reflection, error or behaviour changes, and timing. The function below checks the first two:
import requests

MARKER = "zq7x9k"  # unique, unlikely to occur naturally

def test_param(session, url, param):
    # 1) Reflection (precondition for XSS): does our marker come back?
    r = session.get(url, params={param: MARKER}, timeout=10)
    if MARKER in r.text:
        print(f"[reflect] {url} param={param} reflects input")

    # 2) Error/behaviour differential (injection hint):
    base = session.get(url, params={param: "1"}, timeout=10)
    quote = session.get(url, params={param: "1'"}, timeout=10)
    if (base.status_code != quote.status_code
            or abs(len(base.text) - len(quote.text)) > 100):
        print(f"[diff] {url} param={param} behaves differently with a quote")
The reflection check injects a unique marker and looks for it echoed back. The differential check compares a benign value against one with a quote — a status or large length change hints at injection-sensitive processing. These are signals to investigate, not proof.
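The timing signal is not covered by test_param. A minimal sketch of how it could look, assuming you have a payload that makes the suspected backend pause for a known number of seconds. The names check_timing and median_latency, the three samples, and the 0.8 threshold factor are illustrative choices, not part of the lesson's code.

```python
import statistics
import time

def median_latency(session, url, params, samples=3, timeout=15):
    """Median wall-clock seconds for a GET, taken over a few samples."""
    timings = []
    for _ in range(samples):
        start = time.monotonic()
        session.get(url, params=params, timeout=timeout)
        timings.append(time.monotonic() - start)
    return statistics.median(timings)

def check_timing(session, url, param, delay_payload, delay=5.0):
    """Return True if delay_payload slows the response by most of `delay` seconds."""
    base = median_latency(session, url, {param: "1"})
    slow = median_latency(session, url, {param: delay_payload})
    # Assumed threshold: 80% of the requested delay, to tolerate jitter
    hit = slow - base >= delay * 0.8
    if hit:
        print(f"[timing] {url} param={param} slowed by {slow - base:.1f}s")
    return hit
```

Taking the median of several samples keeps one slow response from looking like a hit. Keep the request timeout comfortably above the delay you ask for, otherwise a successful delay shows up as a timeout exception instead of a measurement.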
Robustness for real targets
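import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def make_session():
    s = requests.Session()
    s.headers.update({"User-Agent": "scanner/1.0"})
    # Retry transient failures (rate limiting and 5xx) with a short backoff
    retry = Retry(total=2, backoff_factor=0.3, status_forcelist=[429, 500, 502, 503, 504])
    adapter = HTTPAdapter(max_retries=retry)
    # Mount on both schemes so plain-HTTP targets get retries too
    s.mount("https://", adapter)
    s.mount("http://", adapter)
    return s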
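Build one robust session with retries on transient failures and a consistent User-Agent, and reuse it for every request. requests has no session-wide timeout setting, so keep passing timeout= on each call, as test_param does. To inspect the scanner's traffic, set s.proxies to point at Burp's proxy listener. For HTTPS targets, the session also has to trust Burp's CA certificate, or the requests will fail certificate verification.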
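As a sketch of the proxy setup, assuming Burp is listening on its default 127.0.0.1:8080: the helper name route_through_proxy and the ca_bundle parameter are hypothetical, and the function only sets the standard proxies and verify attributes of a requests session.

```python
def route_through_proxy(session, proxy_url="http://127.0.0.1:8080", ca_bundle=None):
    """Send the session's HTTP and HTTPS traffic through an intercepting proxy."""
    # Both schemes go to the same listener (assumed: Burp's default address)
    session.proxies.update({"http": proxy_url, "https": proxy_url})
    if ca_bundle is not None:
        # Path to the proxy's CA certificate in PEM format, so TLS still verifies
        session.verify = ca_bundle
    return session
```

A call such as route_through_proxy(make_session(), ca_bundle="burp-ca.pem") would route everything through Burp, where burp-ca.pem is a placeholder path for the certificate exported from Burp and converted to PEM.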
Checkpoint
What makes a custom scanner often more effective than a generic off-the-shelf one?
A custom scanner encodes your specific hypothesis about a particular target and tests it precisely across every input, then lets you triage the hits. A generic scanner tests many things shallowly and produces lots of noise. The custom one finds the bug you suspected because it was built to look for exactly that.
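Testing one hypothesis "across every input" can be sketched as a small driver. This is an illustration under stated assumptions, not the lesson's own code: scan and param_names are hypothetical names, and check is assumed to be a function that takes (session, url, param) and returns a truthy value on a hit (test_param above only prints, so it would need a return value to be used this way).

```python
from urllib.parse import parse_qsl, urlsplit

def param_names(url):
    """Query-string parameter names in a URL, in order, without duplicates."""
    seen = []
    for name, _ in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        if name not in seen:
            seen.append(name)
    return seen

def scan(session, urls, check):
    """Run check(session, base_url, param) for every parameter of every URL."""
    hits = []
    for url in urls:
        # Strip query and fragment; each check supplies its own parameter value
        base = urlsplit(url)._replace(query="", fragment="").geturl()
        for param in param_names(url):
            try:
                if check(session, base, param):
                    hits.append((base, param))
            except Exception as exc:
                # Broad catch keeps the sketch library-independent; one failing
                # target should not stop the run
                print(f"[skip] {base} param={param}: {exc}")
    return hits
```

With requests you would likely narrow the except clause to requests.RequestException. Dropping the other query parameters is a simplification; some endpoints only behave normally when all of their parameters are present.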
Try it yourself
Write a function that takes a session, a URL, and a parameter name, injects a unique marker via that parameter, and reports whether the marker is reflected in the response. Run it against an authorised test app across a few parameters.
Key takeaways
- A scanner detects differences from a baseline that signal bugs.
- Reflection, status/length differentials, and timing are the core signals.
- A match is a lead to verify, never a confirmed finding.
- Build one robust session (retries, timeout, proxy) and reuse it.
Next, automating source code review to find hardcoded secrets and dangerous functions across a whole codebase.