
Building a custom web scanner


A scanner sounds complex, but it’s a simple idea: send crafted requests and notice when a response differs from what you’d expect. The skill isn’t the loop — it’s choosing the difference that signals a bug. This lesson builds a focused scanner from the requests skills you already have.

You'll learn to

Structure a scanner as a differential detector
Spot reflection, error, and timing signals
Keep it robust across many targets

A scanner is a differential detector

Every useful scanner asks: does this response differ from the baseline in a way that suggests a vulnerability? Three classic differences are reflection, error or behaviour changes, and timing. The function below checks the first two:

import requests

MARKER = "zq7x9k"   # unique, unlikely to occur naturally

def test_param(session, url, param):
    """Print reflection and quote-differential signals for one parameter."""
    try:
        # 1) Reflection (precondition for XSS): does our marker come back?
        r = session.get(url, params={param: MARKER}, timeout=10)
        if MARKER in r.text:
            print(f"[reflect] {url} param={param} reflects input")

        # 2) Error/behaviour differential (injection hint):
        base = session.get(url, params={param: "1"}, timeout=10)
        quote = session.get(url, params={param: "1'"}, timeout=10)
        # The 100-character threshold is a rough heuristic; tune it per target.
        if (base.status_code != quote.status_code
                or abs(len(base.text) - len(quote.text)) > 100):
            print(f"[diff] {url} param={param} behaves differently with a quote")
    except requests.RequestException as exc:
        # One unreachable target should not stop the whole scan.
        print(f"[error] {url} param={param}: {exc}")

The reflection check injects a unique marker and looks for it echoed back. The differential check compares a benign value against one with a quote — a status or large length change hints at injection-sensitive processing. These are signals to investigate, not proof.
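
Timing is the third signal. As a minimal sketch, assuming a MySQL-style backend where `SLEEP()` is available, the check below compares a deliberately delayed request against a benign baseline. The name `check_timing`, the payload, and the 0.8 threshold are illustrative assumptions, not part of the scanner above; other databases need different delay syntax.

```python
import time


def timed_get(session, url, params, timeout):
    """Return the wall-clock seconds taken by one GET request."""
    start = time.perf_counter()
    session.get(url, params=params, timeout=timeout)
    return time.perf_counter() - start


def check_timing(session, url, param, delay=3, samples=3):
    """Return True if a sleep payload slows the response by roughly `delay` seconds."""
    # Assumed payload: MySQL-style time-based probe (hypothetical example).
    payload = f"1' AND SLEEP({delay})-- -"
    timeout = delay + 10  # must be longer than the delay we are asking for

    # Baseline: keep the slowest of several benign requests to absorb jitter.
    baseline = max(
        timed_get(session, url, {param: "1"}, timeout) for _ in range(samples)
    )
    delayed = timed_get(session, url, {param: payload}, timeout)

    # Flag only when the extra latency covers most of the requested delay.
    suspicious = (delayed - baseline) >= delay * 0.8
    if suspicious:
        print(f"[timing] {url} param={param} slowed by {delayed - baseline:.1f}s")
    return suspicious
```

Taking the slowest baseline sample rather than the average makes the check more conservative on a noisy network. As with the other signals, a hit is something to investigate by hand, not proof of injection.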

Robustness for real targets

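import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def make_session():
    s = requests.Session()
    s.headers.update({"User-Agent": "scanner/1.0"})
    # Retry transient failures with backoff. raise_on_status=False returns the
    # final response instead of raising, so a 500 caused by a payload is still visible.
    retry = Retry(total=2, backoff_factor=0.3, raise_on_status=False,
                  status_forcelist=[429, 500, 502, 503, 504])
    adapter = HTTPAdapter(max_retries=retry)
    # Mount for both schemes so plain-HTTP targets get the same retry policy.
    s.mount("https://", adapter)
    s.mount("http://", adapter)
    return s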

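Build one robust session with retries on transient failures and a consistent user-agent, then reuse it for every request. A requests session has no session-wide timeout setting, so keep passing timeout= on each call, as test_param does. Set s.proxies to point at Burp and your scanner’s traffic flows through Burp for inspection.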
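
Routing through Burp could look like the sketch below. It assumes Burp's proxy listener is at its default address, 127.0.0.1:8080. The helper name `route_through_burp` and its `ca_bundle` parameter are hypothetical, introduced only for illustration.

```python
BURP_PROXY = "http://127.0.0.1:8080"  # assumed: Burp's default listener address


def route_through_burp(session, proxy=BURP_PROXY, ca_bundle=None):
    """Point a requests.Session at an intercepting proxy and return it."""
    # requests picks the proxy by the target URL's scheme, so set both keys.
    session.proxies.update({"http": proxy, "https": proxy})
    if ca_bundle is not None:
        # Burp re-signs HTTPS traffic with its own CA. Pass the path to that
        # CA certificate in PEM format so verification still succeeds.
        session.verify = ca_bundle
    return session
```

With the session builder above, usage would be `s = route_through_burp(make_session())`. Without `ca_bundle`, HTTPS requests through Burp fail certificate verification unless Burp's CA is already trusted on the machine.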

Checkpoint

What makes a custom scanner often more effective than a generic off-the-shelf one?

Try it yourself

Write a function that takes a session, a URL, and a parameter name, injects a unique marker via that parameter, and reports whether the marker is reflected in the response. Run it against an authorised test app across a few parameters.
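
If you have no authorised test app to hand, a throwaway local target is one option. The sketch below uses only the standard library. The handler name `ReflectingHandler`, the helper `start_test_app`, and the parameter names `q` and `id` are all made up for this exercise. It deliberately echoes `q` unescaped and never echoes `id`, so your function should report one parameter and not the other.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


class ReflectingHandler(BaseHTTPRequestHandler):
    """Echoes the 'q' parameter unescaped; 'id' is accepted but never echoed."""

    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        q = query.get("q", [""])[0]
        body = f"<html><body>Results for: {q}</body></html>".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet while scanning


def start_test_app():
    """Start the app on a free localhost port; return (server, base_url)."""
    server = HTTPServer(("127.0.0.1", 0), ReflectingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_port}/"


if __name__ == "__main__":
    server, base_url = start_test_app()
    # Quick sanity check that the marker is echoed for 'q'.
    with urllib.request.urlopen(base_url + "?q=zq7x9k") as resp:
        print("zq7x9k" in resp.read().decode())
    server.shutdown()
```

The server binds to 127.0.0.1 only, so it is not reachable from other machines. Pass `base_url` to your own function with each parameter name in turn.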

Key takeaways

A scanner detects differences from a baseline that signal bugs.
Reflection, status/length differentials, and timing are the core signals.
A match is a lead to verify, never a confirmed finding.
Build one robust session (retries, timeout, proxy) and reuse it.


Next: automating source code review to find hardcoded secrets and dangerous functions across a whole codebase.
