Dict and set comprehensions carry the same philosophy as list comprehensions into mapping and membership territory. A dict comprehension is {key_expr: value_expr for x in it [if cond]}; a set comprehension has the same shape with a single element expression in place of the key: value pair, {expr for x in it [if cond]}. Both produce a fresh object from an iterable in a single expression, and both are the idiomatic choice when you want to transform or filter into a new dict or set.
Dict comprehensions shine whenever a mapping is derived from something else: a list of pairs, two parallel sequences, an existing dict you want to transform or filter. {n: n*n for n in range(5)} gives you {0: 0, 1: 1, 2: 4, 3: 9, 4: 16} in one line; {k: v.upper() for k, v in d.items()} returns a copy with uppercased values.
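As a rough sketch of what the comprehension stands for, the snippet below builds the same mapping with an explicit loop and with a comprehension, then transforms an existing dict. The names `squares_loop`, `shouted`, and the sample dict `d` are illustrative assumptions, not part of the lesson script.

```python
# Explicit loop: start empty and assign one key at a time.
squares_loop = {}
for n in range(5):
    squares_loop[n] = n * n

# Equivalent dict comprehension.
squares = {n: n * n for n in range(5)}
assert squares == squares_loop

# Transforming an existing dict (sample data, assumed for illustration).
d = {"host": "localhost", "mode": "debug"}
shouted = {k: v.upper() for k, v in d.items()}

print(squares)  # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
print(shouted)  # {'host': 'LOCALHOST', 'mode': 'DEBUG'}
print(d)        # {'host': 'localhost', 'mode': 'debug'} (original untouched)
```

The comprehension always builds a new dict, so the source mapping is left as it was.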
Set comprehensions are the natural partner when you also need deduplication. {word.lower() for word in text.split()} gives you a set of unique, lowercased tokens; the set takes care of duplicates implicitly. For problems where the final order doesn't matter, a set comprehension is cleaner than building a list and converting.
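To make the comparison concrete, here is a small sketch that deduplicates the same tokens both ways. The sample `text` and the name `via_list` are assumptions made for the example.

```python
text = "The cat and the hat and THE bat"  # sample text, assumed

# Build-then-convert: creates an intermediate list that is thrown away.
via_list = set([word.lower() for word in text.split()])

# Set comprehension: same result without the intermediate list.
unique = {word.lower() for word in text.split()}

assert unique == via_list
print(sorted(unique))  # ['and', 'bat', 'cat', 'hat', 'the']
print(len(text.split()), "tokens ->", len(unique), "unique")  # 8 tokens -> 5 unique
```

Sorting before printing is only there to get a reproducible display; the set itself has no defined order.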
The familiar restrictions apply. Dict keys and set elements must be hashable; if you try to build a set of lists you'll get TypeError. Comprehensions run eagerly and allocate memory for the whole result; if the input is huge and you only need to iterate once, use a generator expression instead.
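Both restrictions are easy to see in a few lines. The sketch below uses made-up sample data (`rows`) and shows one possible workaround for unhashable elements, converting each list to a tuple; it then uses a generator expression where only a single pass is needed.

```python
rows = [[1, 2], [3, 4], [1, 2]]  # sample data, assumed

# Lists are unhashable, so they cannot be set elements.
try:
    bad = {row for row in rows}
except TypeError as exc:
    error = str(exc)
    print("TypeError:", error)

# Converting each row to a tuple makes it hashable.
unique_rows = {tuple(row) for row in rows}
print(sorted(unique_rows))  # [(1, 2), (3, 4)]

# A generator expression yields items one at a time instead of building
# the whole collection first; sum() consumes it in a single pass.
total = sum(n * n for n in range(1_000_000))
print(total)
```

The generator version never holds a million squares in memory at once, which is the point of preferring it for one-shot iteration.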
From pairs and from other dicts
dict(zip(keys, values)) is often shorter than a dict comprehension when you already have two parallel sequences. Use the comprehension form when you need a transformation: {k: int(v) for k, v in pairs}. Inverting a dict is the classic one-liner: {v: k for k, v in d.items()}. Remember that if two keys share the same value, only the last one survives in the inverted dict, and no error is raised.
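The three forms side by side, as a minimal sketch with invented sample data (`keys`, `values`, `pairs`, and `grades` are all assumptions). The last part shows what happens when an inverted dict has colliding values.

```python
keys = ["host", "port"]
values = ["localhost", "5432"]  # sample data, assumed

# No transformation needed: dict(zip(...)) is the shortest form.
raw = dict(zip(keys, values))

# Transformation needed: switch to a comprehension.
pairs = [("port", "5432"), ("retries", "3")]
numeric = {k: int(v) for k, v in pairs}

# Inversion with duplicate values: the last key seen wins.
grades = {"ann": "A", "bob": "B", "cy": "A"}
inverted = {v: k for k, v in grades.items()}

print(raw)       # {'host': 'localhost', 'port': '5432'}
print(numeric)   # {'port': 5432, 'retries': 3}
print(inverted)  # {'A': 'cy', 'B': 'bob'}
```

Here "ann" disappears from `inverted` because "cy" maps to the same grade and is processed later.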
Filter values as they go in: {k: v for k, v in d.items() if v is not None}. Transform keys at the same time: {k.upper(): v for k, v in d.items()}. Both read naturally once you're used to the pattern.
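A short sketch of both patterns, using an assumed sample dict `d`. It also contrasts `if v is not None` with a plain truthiness test, since the two differ for values such as 0 or an empty string.

```python
d = {"host": "localhost", "port": None, "retries": 0}  # sample data, assumed

# Drop only the entries whose value is None; 0 is kept.
present = {k: v for k, v in d.items() if v is not None}

# A truthiness filter is stricter: it also drops 0, "", and empty containers.
truthy = {k: v for k, v in d.items() if v}

# Filter and transform keys in the same pass.
env = {k.upper(): v for k, v in d.items() if v is not None}

print(present)  # {'host': 'localhost', 'retries': 0}
print(truthy)   # {'host': 'localhost'}
print(env)      # {'HOST': 'localhost', 'RETRIES': 0}
```

Which filter is right depends on whether falsy values are meaningful in your data.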
Deduplication and grouping with sets
{token.lower() for token in tokens} is the textbook dedup. For grouping into sets of unique items, fall back to a loop: with groups = defaultdict(set), write groups[key].add(item); with a plain dict, the equivalent is groups.setdefault(key, set()).add(item). Set comprehensions don't handle grouping in a single expression, and that's fine.
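The two grouping idioms can be sketched as follows. The `pairs` data and the name `plain` are assumptions for illustration; both loops end up with the same mapping.

```python
from collections import defaultdict

# (category, item) pairs with one repeated entry; sample data, assumed.
pairs = [("fruit", "apple"), ("veg", "kale"), ("fruit", "pear"), ("fruit", "apple")]

# defaultdict(set): a missing key gets an empty set automatically.
groups = defaultdict(set)
for key, item in pairs:
    groups[key].add(item)

# Plain dict: setdefault inserts the empty set only when the key is missing.
plain = {}
for key, item in pairs:
    plain.setdefault(key, set()).add(item)

assert dict(groups) == plain
print({k: sorted(v) for k, v in groups.items()})
# {'fruit': ['apple', 'pear'], 'veg': ['kale']}
```

The final dict comprehension is only for display: it turns each set into a sorted list so the output is stable.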
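If you need a stable order after deduplication, a list comprehension with a "seen" set is the idiom: with seen = set() defined beforehand, [x for x in xs if x not in seen and not seen.add(x)] keeps the first occurrence of each item. It works because set.add() returns None, so not seen.add(x) is always true and is there only for its side effect. Slightly hacky, but well-known and efficient, since each membership test is O(1) on average.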
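Here is the idiom written out in full, with an assumed sample list `xs`. For comparison, the sketch also shows `dict.fromkeys`, an alternative not mentioned above that relies on dicts preserving insertion order (guaranteed since Python 3.7).

```python
xs = [3, 1, 3, 2, 1, 5]  # sample data, assumed

# "seen" set idiom: seen.add(x) returns None, so `not seen.add(x)` is True
# and records x as a side effect the first time it appears.
seen = set()
ordered = [x for x in xs if x not in seen and not seen.add(x)]

# Alternative (an assumption, not the lesson's idiom): dict keys are unique
# and keep insertion order.
via_dict = list(dict.fromkeys(xs))

assert ordered == via_dict
print(ordered)  # [3, 1, 2, 5]
```

Both versions require the items to be hashable, like any set- or dict-based approach.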
Dict and set comprehensions in context.
| Tool | Purpose |
|---|---|
| `{k: v for ...}` (syntax) | Dict comprehension. |
| `{e for ...}` (syntax) | Set comprehension. |
| `dict(zip(k, v))` (idiom) | Pairs two parallel iterables. |
| `d.items()` (method) | Pairs of (key, value) for iteration. |
| `defaultdict(set)` (class) | Auto-creates a set per missing key. |
| `frozenset(...)` (built-in) | Immutable, hashable set. |
| `itemgetter(i)` (factory) | Extracts position i; handy as a `key=` function when sorting or grouping pairs. |
| `Counter` (class) | Specialized dict for counting occurrences. |
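Of the tools in the table, `itemgetter` is the one the lesson script does not exercise, so here is a minimal sketch of it feeding a dict comprehension. The `settings` data and the names `by_value` and `ranked` are assumptions made for the example.

```python
from operator import itemgetter

settings = [("port", 5432), ("debug", 0), ("workers", 8)]  # sample data, assumed

# itemgetter(1) returns a callable that picks element 1 of each pair,
# which makes it a convenient key function for sorted().
by_value = sorted(settings, key=itemgetter(1))

# Dict comprehension over the sorted pairs: name -> rank (1 = smallest value).
ranked = {name: rank for rank, (name, _) in enumerate(by_value, start=1)}

print(by_value)  # [('debug', 0), ('workers', 8), ('port', 5432)]
print(ranked)    # {'debug': 1, 'workers': 2, 'port': 3}
```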
Creating Dictionaries and Sets Efficiently code example
The script derives dicts and sets from the same data source using comprehensions throughout.
# Lesson: Creating Dictionaries and Sets Efficiently
from collections import defaultdict
text = "The quick brown fox jumps over the lazy dog the quick fox"
tokens = text.lower().split()
# Set comprehension: unique tokens
unique = {t for t in tokens}
print("unique:", sorted(unique))
# Dict comprehension: token -> length
lengths = {t: len(t) for t in unique}
print("lengths:", dict(sorted(lengths.items())))
# Dict comprehension from pairs + filter
pairs = [("host", "localhost"), ("port", "5432"), ("debug", "")]
config = {k: v for k, v in pairs if v}
print("config:", config)
# Invert a dict (values must be unique to avoid collisions)
src = {"a": 1, "b": 2, "c": 3}
inv = {v: k for k, v in src.items()}
print("inverted:", inv)
# Group tokens by first letter (not a one-liner; defaultdict helps)
groups: defaultdict[str, set[str]] = defaultdict(set)
for t in tokens:
    groups[t[0]].add(t)
print("groups:", {k: sorted(v) for k, v in sorted(groups.items())})
# Frozenset as a dict key
seen: dict[frozenset, str] = {}
seen[frozenset({"py", "sql"})] = "tag-1"
print("frozen key:", seen[frozenset({"sql", "py"})])
Watch how each comprehension earns its keep:
1) Set comprehension replaces 'start with set() and add inside a loop'.
2) Dict comprehension from `.items()` is the standard copy-with-transform form.
3) Filtering in the same expression avoids a second pass.
4) Grouping needs `defaultdict(set)`; there's no one-liner equivalent and that's fine.
Practice with a frequency map and a whitelist filter.
from collections import Counter
words = "red blue red green blue red".split()
counts = Counter(words)
print(counts)
# Keep keys only if value is above a threshold
freq = {w: n for w, n in counts.items() if n >= 2}
print(freq)
# Whitelisted tokens
allowed = {"red", "green"}
only_allowed = {w for w in words if w in allowed}
print(sorted(only_allowed))
Core comprehension invariants.
assert {n: n*n for n in range(3)} == {0: 0, 1: 1, 2: 4}
assert {n % 2 for n in range(5)} == {0, 1}
assert dict(zip("ab", [1, 2])) == {"a": 1, "b": 2}
d = {"a": 1, "b": 2}
assert {v: k for k, v in d.items()} == {1: "a", 2: "b"}
Running the lesson script (the first example above) prints:
unique: ['brown', 'dog', 'fox', 'jumps', 'lazy', 'over', 'quick', 'the']
lengths: {'brown': 5, 'dog': 3, 'fox': 3, 'jumps': 5, 'lazy': 4, 'over': 4, 'quick': 5, 'the': 3}
config: {'host': 'localhost', 'port': '5432'}
inverted: {1: 'a', 2: 'b', 3: 'c'}
groups: {'b': ['brown'], 'd': ['dog'], 'f': ['fox'], 'j': ['jumps'], 'l': ['lazy'], 'o': ['over'], 'q': ['quick'], 't': ['the']}
frozen key: tag-1