Lists, sets, and dictionaries
Collections hold multiple values, and knowing which one to reach for is a real skill. There are four, but three do almost all the work in security scripting: the list (your targets), the set (your deduper), and the dictionary (every request’s headers, params, and JSON). Get these and you can write real tools.
You'll learn to
- Pick the right container for the job
- Use a set to dedupe results instantly
- Use a dictionary to build request data
The four types at a glance
| Type | Looks like | Ordered? | Duplicates? | Use it for |
|---|---|---|---|---|
| list | [1, 2, 3] | yes | yes | an ordered, changeable sequence |
| tuple | (1, 2, 3) | yes | yes | a fixed sequence that won’t change |
| set | {1, 2, 3} | no | no | unique items; dedup; fast membership |
| dict | {"k": "v"} | yes (insertion order) | keys unique | labelled key → value data |
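To make the table concrete, here is a minimal sketch that exercises each row. The sample values (ports, an IP address, a `status` dictionary) are invented for illustration only.

```python
# list: ordered, changeable, duplicates allowed
ports = [80, 443, 80]
ports.append(8080)

# tuple: ordered, but fixed once created
endpoint = ("10.0.0.5", 443)
try:
    endpoint[1] = 8443           # tuples do not allow item assignment
except TypeError as exc:
    tuple_error = str(exc)

# set: duplicates collapse, order is not guaranteed
unique_ports = set(ports)

# dict: each key appears once; assigning to an existing key replaces its value
status = {"host": "10.0.0.5", "open": True}
status["host"] = "10.0.0.6"

print(ports)
print(tuple_error)
print(unique_ports)
print(status)
```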
Lists — the default sequence
subdomains = ["www", "api", "admin"]
subdomains.append("dev") # add to the end
subdomains[0] # 'www' (first item, index 0)
subdomains[-1] # 'dev' (last item)
len(subdomains) # 4
for sub in subdomains: # loop over every item
    print(sub)
A list is your default “bunch of things in order” — a list of targets, ports, or payloads to work through.
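As a small sketch of working through a target list, the loop below turns the subdomain labels into full hostnames. The domain `example.com` and the names `domain` and `hostnames` are placeholders chosen for illustration.

```python
subdomains = ["www", "api", "admin"]
domain = "example.com"  # placeholder domain, not a real target

hostnames = []
for sub in subdomains:
    hostnames.append(sub + "." + domain)  # build one hostname per label

print(hostnames)      # ['www.example.com', 'api.example.com', 'admin.example.com']
print(hostnames[1:])  # slice: everything after the first item
```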
Sets — your dedup tool
found = ["a.com", "b.com", "a.com"]
unique = set(found) # {'a.com', 'b.com'} — duplicates gone
"a.com" in unique # True — membership check is very fast
Wrap any list in set() and duplicates vanish, though the original order is not kept. This is the recon workhorse: subdomain tools and URL harvesters produce tons of overlap, and a set cleans it in one call.
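If the order of your results matters, two common options are sorting the set or deduplicating with dict.fromkeys(), which keeps the first occurrence of each item (dictionaries preserve insertion order in Python 3.7 and later). A minimal sketch with made-up domains:

```python
found = ["b.com", "a.com", "b.com", "c.com", "a.com"]

# Option 1: dedupe, then sort alphabetically
alphabetical = sorted(set(found))

# Option 2: dedupe but keep the order in which items first appeared
first_seen = list(dict.fromkeys(found))

print(alphabetical)  # ['a.com', 'b.com', 'c.com']
print(first_seen)    # ['b.com', 'a.com', 'c.com']
```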
Dictionaries — labelled data
headers = {
"User-Agent": "recon/1.0",
"Authorization": "Bearer abc123",
}
headers["User-Agent"] # look up by key → 'recon/1.0'
headers["Accept"] = "application/json" # add or replace a key
for key, value in headers.items(): # loop over pairs
    print(key, "=", value)
The dictionary is the workhorse of HTTP and JSON work. Request headers, query parameters, JSON bodies, and parsed API responses are all dictionaries — labelled key: value data you look up by name. When you learn requests in the next module, you’ll pass dictionaries for headers and params constantly.
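The sketch below shows three dictionary habits that come up in this kind of work, using only the standard library: reading a key that may be missing with .get(), turning a dictionary into a query string, and round-tripping a dictionary through JSON. All keys and values here are invented for illustration.

```python
import json
from urllib.parse import urlencode

headers = {"User-Agent": "recon/1.0"}

# headers["Authorization"] would raise KeyError; .get() returns None instead
token = headers.get("Authorization")
accept = headers.get("Accept", "*/*")  # second argument is the fallback value

# A dictionary of query parameters becomes a URL query string
search = {"domain": "example.com", "limit": 10}
query = urlencode(search)

# A dictionary becomes a JSON body, and parsed JSON comes back as a dictionary
body = json.dumps({"target": "example.com", "ports": [80, 443]})
parsed = json.loads(body)

print(token)            # None
print(accept)           # */*
print(query)            # domain=example.com&limit=10
print(parsed["ports"])  # [80, 443]
```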
Checkpoint
You've collected 5000 URLs from three different tools, with lots of overlap. What's the simplest way to get a clean, deduplicated list?
Wrap them in a set: unique = set(all_urls). The set automatically removes duplicates. If you need a list back, use sorted(set(all_urls)) to also order them.
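A minimal sketch of that answer, assuming each tool's output is already loaded into its own list. The names `tool_a`, `tool_b`, and `tool_c` and the URLs are hypothetical.

```python
tool_a = ["https://a.com/login", "https://a.com/"]
tool_b = ["https://a.com/", "https://b.com/api"]
tool_c = ["https://b.com/api", "https://a.com/login"]

all_urls = tool_a + tool_b + tool_c  # one combined list, overlap included
unique = sorted(set(all_urls))       # dedupe, then sort into a list

print(len(all_urls))  # 6
print(unique)         # ['https://a.com/', 'https://a.com/login', 'https://b.com/api']
```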
Try it yourself
Build a dictionary called params with keys "q" (value "test") and "page" (value 2). Then add a third key "limit" with value 50. Print the dictionary, then loop over it with .items() printing each key and value.
Summary
Four containers, three you’ll use constantly. Lists [ ] are ordered, changeable sequences (your targets). Sets { } hold unique items and dedupe in one call (wrap a list in set()). Dictionaries {key: value} hold labelled data and are the shape of HTTP headers, params, JSON bodies, and API responses. Use .get() for keys that might be missing: it returns None (or a default you supply) instead of raising KeyError. Tuples are fixed sequences you’ll meet occasionally.
Key takeaways
- List = ordered sequence; set = unique items + fast membership; dict = labelled data.
- set(my_list) is the instant deduper.
- Dicts are how you build headers, params, and JSON, so learn them well.
- dict.get(key) avoids the KeyError crash on missing keys.
Next, conditions — how your script chooses what to do based on those True/False values.