Finding hardcoded secrets across languages
Whatever the language, developers hardcode secrets in one recognisable shape: an assignment whose name says ‘secret’ and whose value is a string. This lesson builds patterns that catch that shape in Python, JS, PHP, Java, Ruby, Go, and config files at once, which is the core of automated source review.
You'll learn to
- Match credential assignments generically
- Adapt to each language's syntax
- Cut false positives with value heuristics
The universal shape
Across languages, a hardcoded secret looks like: a key whose name suggests sensitivity, an assignment operator, and a quoted string value.
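(?i)(password|passwd|secret|api[_-]?key|token|auth|credential)\w*['"]?
\s*(?::=|=>|[:=])\s*
['"][^'"]{4,}['"]
Read the three parts as one pattern, left to right. First, a sensitive name (case-insensitive), optionally followed by the rest of the identifier and a closing quote, so that secret_key and the quoted JSON key "apiKey" both match. Second, an assignment operator: : or =, as used by config, JSON, and most languages, plus Go's := and Ruby's =>. Third, a quoted value of at least four characters. This one pattern flags quoted credential assignments in Python, JS, PHP, Java, Ruby, Go, and config files. It does not match unquoted values, which are common in .env files and YAML; those need the language-specific tightening below.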
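A minimal sketch of running the universal shape in Python. It writes the shape as a single expression that allows for trailing identifier characters (secret_key), a closing quote on a JSON key, and the := and => operators. The sample lines and the find_secrets helper are hypothetical, made up for illustration.

```python
import re

# The universal shape as one expression: sensitive name, assignment, quoted value.
SECRET_RE = re.compile(
    r"""(?i)(password|passwd|secret|api[_-]?key|token|auth|credential)\w*['"]?"""
    r"""\s*(?::=|=>|[:=])\s*"""
    r"""['"][^'"]{4,}['"]"""
)

# Hypothetical sample lines, one per syntax the lesson mentions.
SAMPLES = [
    'api_key = "sk_test_51Hxyz"',             # Python / JS
    '$apiKey = "a1b2c3d4e5";',                # PHP
    'String dbPassword = "hunter2hunter2";',  # Java
    '"apiKey": "a1b2c3d4e5"',                 # JSON
    'token := "abcd1234efgh"',                # Go
    'secret_key: "s3cr3t-value"',             # YAML with a quoted value
    'username = "alice"',                     # name is not sensitive
    'password = ""',                          # value is too short
]


def find_secrets(lines):
    """Return (line_number, line) for each line that matches SECRET_RE."""
    return [
        (n, line)
        for n, line in enumerate(lines, start=1)
        if SECRET_RE.search(line)
    ]


hits = find_secrets(SAMPLES)
for n, line in hits:
    print(n, line)
```

The first six samples are flagged and the last two are not: one has no sensitive name, the other has no value long enough to count.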
Language-specific tightening
Python/JS: api_key = "..." or name = "..."
PHP: $apiKey = "..."; or define('KEY', '...')
Java: String key = "...";
YAML/env: api_key: ... or API_KEY=...
JSON: "apiKey": "..."
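The shapes above can be turned into one pattern per language. What follows is a sketch, not a complete rule set: the NAME word list, the PATTERNS table, and the scan helper are illustrative assumptions, and each pattern covers only the shape listed for its language.

```python
import re

# Assumed word list; NAME matches a whole identifier containing one of the words.
NAME = r"\w*(?:password|passwd|secret|api[_-]?key|token|credential)\w*"
Q = r"""['"]"""                      # either quote character
QUOTED = r"""['"][^'"]{4,}['"]"""    # quoted value, at least four characters
BARE = r"""[^\s'"#]{4,}"""           # unquoted value, as in .env files and YAML

PATTERNS = {
    # Python / JS:  api_key = "..."
    "python_js": r"\b" + NAME + r"\s*=\s*" + QUOTED,
    # PHP:  $apiKey = "...";  or  define('KEY', '...')
    "php": (
        r"(?:\$" + NAME + r"\s*=\s*"
        r"|define\(\s*" + Q + NAME + Q + r"\s*,\s*)" + QUOTED
    ),
    # Java:  String key = "...";
    "java": r"\bString\s+" + NAME + r"\s*=\s*" + QUOTED,
    # YAML / env:  api_key: ...   API_KEY=...   (value may be unquoted)
    "yaml_env": (
        r"^[ \t]*" + NAME + r"[ \t]*[:=][ \t]*(?:" + QUOTED + "|" + BARE + ")"
    ),
    # JSON:  "apiKey": "..."
    "json": r'"' + NAME + r'"\s*:\s*"[^"]{4,}"',
}

COMPILED = {
    lang: re.compile(pattern, re.IGNORECASE | re.MULTILINE)
    for lang, pattern in PATTERNS.items()
}


def scan(lang, text):
    """Return the substrings matched by one language's pattern."""
    return [m.group(0) for m in COMPILED[lang].finditer(text)]


print(scan("php", "define('API_KEY', 'abcd1234');"))
print(scan("yaml_env", "API_KEY=abcd1234\nname: bob\n"))
print(scan("json", '{"apiKey": "abcd1234"}'))
```

Choosing the pattern by file extension keeps the looser unquoted-value rule confined to config files, where it is needed, and out of source files, where it would be noisy.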
Checkpoint
Why does a single regex keyed on names like password, secret, api_key, and token find hardcoded credentials across many different programming languages?
Because the naming convention for secrets is near-universal — developers in every language call them key, token, secret, password, or similar. The pattern matches that sensitive name, followed by an assignment operator (colon or equals, which covers config, JSON, and most languages), followed by a quoted string value. Since the *shape* (named assignment of a quoted value) is consistent across languages even though syntax differs slightly, one pattern catches the credential in Python, JS, PHP, Java, and more.
Try it yourself
Write a pattern that matches an assignment where the variable name contains ‘secret’ or ‘token’ (case-insensitive) and the value is a quoted string of at least eight characters. Test it mentally against a real-looking assignment, an empty value, and a placeholder like ‘changeme’. Note that the length requirement rejects the empty value but still lets the eight-character ‘changeme’ through, so length reduces false positives without eliminating them.
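One possible answer, sketched in Python. The pattern is only one of several reasonable solutions. The PLACEHOLDERS set and the is_plausible helper are hypothetical additions, and the letters-plus-digits rule is an assumed heuristic that would also reject a real all-letter passphrase.

```python
import re

# Name contains "secret" or "token"; value is a quoted string of 8+ characters.
# The backreference \1 requires the closing quote to match the opening one.
PATTERN = re.compile(
    r"""\b\w*(?:secret|token)\w*\s*[:=]\s*(['"])([^'"]{8,})\1""",
    re.IGNORECASE,
)

# Hypothetical denylist of values that are long enough but obviously not real.
PLACEHOLDERS = {"changeme", "password", "your_token_here", "xxxxxxxx"}


def is_plausible(value):
    """Reject known placeholders and values lacking a mix of letters and digits."""
    if value.lower() in PLACEHOLDERS:
        return False
    has_alpha = any(c.isalpha() for c in value)
    has_digit = any(c.isdigit() for c in value)
    return has_alpha and has_digit


def check(line):
    """Return 'no match', 'placeholder', or 'likely secret' for one line."""
    m = PATTERN.search(line)
    if m is None:
        return "no match"
    return "likely secret" if is_plausible(m.group(2)) else "placeholder"


for line in [
    'auth_token = "9f8e7d6c5b4a3f2e"',
    'client_secret = ""',
    'client_secret = "changeme"',
]:
    print(check(line), "<-", line)
```

The empty value never matches, because of the length requirement. The ‘changeme’ value does match and is only caught by the second check, which is the case for layering value heuristics on top of the pattern.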
Key takeaways
- Secrets share a shape: sensitive name + assignment + quoted value.
- One case-insensitive pattern catches quoted credentials across Python, JS, PHP, Java, Ruby, Go, and config files.
- Tighten per-language only when the generic pass is too noisy.
- Require realistic values (length, character mix, prefixes) to cut false positives; scan git history too.
Next, regex for API recon — extracting endpoints, parameters, and routes from text at scale.