A set is an unordered collection of unique, hashable items. You can think of it as a dict that only remembers its keys. Sets are the right tool any time you care about membership or uniqueness and not about order: removing duplicates from a list, checking whether something is in a whitelist, finding the common elements between two groups — all of these are one-liners with a set.
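The three tasks just mentioned can be sketched in a few lines. The data and variable names below are invented for illustration only.

```python
# Illustrative data; every name here is hypothetical.
raw_ids = [3, 1, 3, 2, 1]
allowed_hosts = {"example.com", "example.org"}
team_a = {"ana", "bo", "cy"}
team_b = {"bo", "cy", "di"}

# Remove duplicates (the original list order is not kept).
unique_ids = set(raw_ids)

# Membership check against a collection of allowed values.
is_allowed = "example.com" in allowed_hosts

# Elements common to both groups.
in_both = team_a & team_b

print(sorted(unique_ids))  # sorted() gives a stable display order
print(is_allowed)
print(sorted(in_both))
```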
Because sets rely on hashing, their two hallmark properties come essentially for free. First, x in s is O(1) on average, regardless of how many items the set contains. Second, duplicates are automatically discarded: set([1, 2, 2, 3]) is {1, 2, 3}. The trade-off is that sets do not preserve insertion order and cannot contain unhashable items like lists.
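When the order of first appearance matters, a set alone is not enough. One common workaround, sketched here with an invented word list, is `dict.fromkeys`, which relies on dicts keeping insertion order (guaranteed since Python 3.7).

```python
words = ["pear", "apple", "pear", "fig", "apple"]

# A set drops duplicates but makes no promise about order.
as_set = set(words)

# Dict keys are unique and keep insertion order, so this
# deduplicates while preserving the order of first appearance.
ordered_unique = list(dict.fromkeys(words))

print(len(as_set))
print(ordered_unique)
```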
Creating a set is slightly tricky because {} is an empty dict, not an empty set. Use set() for an empty set and {1, 2, 3} for non-empty ones. The set() constructor accepts any iterable and deduplicates as it goes, so set("hello") becomes {'h', 'e', 'l', 'o'}.
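A quick check of the `{}` pitfall, as a minimal sketch. The variable names are arbitrary.

```python
empty_dict = {}
empty_set = set()
print(type(empty_dict).__name__)
print(type(empty_set).__name__)

# Any iterable works as input; duplicates collapse on the way in.
from_text = set("hello")

# A set comprehension builds a set from an expression.
squares = {n * n for n in (-2, -1, 0, 1, 2)}

print(len(from_text), sorted(squares))
```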
Python also ships a frozenset for cases where you need a set that cannot be changed afterwards. A frozenset is hashable, which means it can itself be used as a dict key or put inside another set — handy for representing, for example, immutable tags attached to a record.
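As one illustration of putting a set inside another set, the sketch below deduplicates unordered pairs. The `links` data is hypothetical.

```python
# Hypothetical list of connections; direction does not matter.
links = [("a", "b"), ("b", "a"), ("b", "c")]

# frozenset({"a", "b"}) equals frozenset({"b", "a"}) and hashes the same,
# so the outer set keeps one entry per unordered pair.
unique_links = {frozenset(pair) for pair in links}

print(len(unique_links))
print(frozenset({"a", "b"}) in unique_links)
```

A plain `set` would not work as the inner type here, because mutable sets are unhashable.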
What goes in, what does not
Every item in a set must be hashable: numbers, strings, tuples of hashables, frozensets. Putting a list into a set raises TypeError: unhashable type. If your data is mutable, convert it first (for example, turn each row into a tuple) or use a different container.
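The row-to-tuple conversion might look like this. The `rows` data is made up for the example.

```python
rows = [[1, "ann"], [2, "bob"], [1, "ann"]]

try:
    set(rows)  # each row is a list, and lists are unhashable
except TypeError as err:
    print("cannot hash rows:", err)

# Tuples of hashable values are hashable, so this works.
unique_rows = {tuple(row) for row in rows}
print(sorted(unique_rows))
```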
Sets compare by content: {1, 2} == {2, 1} is True because order doesn't matter. len(s) gives the size; for x in s iterates in an arbitrary order that stays the same only while the set is unmodified within a single interpreter run, so use sorted(s) when you need a predictable order.
Mutable vs frozen
s.add(x), s.discard(x) and s.remove(x) modify a set in place. (discard is safe on missing items; remove raises.) You cannot use any of these on a frozenset; everything has to be passed to the constructor.
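A short sketch of what "frozen" means in practice. The permission names are invented.

```python
base = frozenset({"read", "write"})

# frozenset simply has no add/discard/remove methods.
print(hasattr(base, "add"))

# Operators return a new frozenset; the original is untouched.
extended = base | {"admin"}
reduced = base - {"write"}

print(sorted(base))
print(sorted(extended))
print(sorted(reduced))
```

To "change" a frozenset you build a new one, either through the constructor or through an operator such as `|` or `-`.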
Because both flavors implement the same read-only interface (membership tests, union, intersection, iteration), you can often accept either and benefit from the caller's choice. Annotate parameters as AbstractSet[T] when you need set operations, or as the more permissive Iterable[T] when you only loop over the items.
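A function that accepts either flavor could be annotated as follows. `shared` is a hypothetical helper written for this example, not a standard-library function.

```python
from typing import AbstractSet, FrozenSet, TypeVar

T = TypeVar("T")


def shared(left: AbstractSet[T], right: AbstractSet[T]) -> FrozenSet[T]:
    """Return the items present in both arguments (hypothetical helper)."""
    # & is part of the read-only set interface, so set and frozenset both work.
    return frozenset(left & right)


print(sorted(shared({1, 2, 3}, frozenset({2, 3, 4}))))
```

Returning a frozenset here is a design choice for the sketch: it signals that the result is a value, not something the caller is expected to mutate.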
The core tools for working with sets.
| Tool | Purpose |
|---|---|
| `set` (built-in type) | Mutable unordered collection of unique items. |
| `frozenset` (built-in type) | Immutable, hashable set. |
| `s.add(x)` (method) | Adds x if it is not already present. |
| `s.discard(x)` (method) | Removes x if present; no error otherwise. |
| `s.remove(x)` (method) | Removes x; raises KeyError if absent. |
| `x in s` (operator) | O(1) membership test on average. |
| `hash(obj)` (built-in) | Returns the hash of obj; raises TypeError for unhashable items. |
| `s.copy()` (method) | Returns a shallow copy of the set. |
Understanding Sets and Their Properties code example
The script below exercises creation, membership, uniqueness and the hashability rule that trips beginners up.
# Lesson: Understanding Sets and Their Properties
from typing import Iterable
def dedupe(items: Iterable) -> set:
    return set(items)
empty = set()
small = {1, 2, 3}
from_string = set("mississippi")
frozen = frozenset([1, 2, 3])
print("empty, len:", empty, len(empty))
print("small: ", small)
print("letters: ", from_string)
print("frozen: ", frozen)
small.add(4)
small.discard(10) # safe no-op
try:
    small.remove(10)
except KeyError as err:
    print("missing: ", err)
print("grown: ", sorted(small))
print("contains 2:", 2 in small)
print("equal? :", {1, 2, 3, 4} == small)
# Unhashable items fail loudly
try:
    bad = {[1, 2]}
except TypeError as err:
    print("unhashable:", err)
# Frozensets are hashable (can be dict keys)
tags = {frozenset({"py", "sql"}): "tag-1"}
print("tag key: ", tags[frozenset({"sql", "py"})])
As you read, look for four ideas:
1) `set()` is how you write an empty set; there is no empty-set literal, and `{}` is a dict.
2) Adding an existing item is a no-op; there is no such thing as a duplicate entry.
3) `discard` is safe on missing items; `remove` raises KeyError.
4) Frozensets are hashable and therefore suitable as dict keys.
Try two small exercises that practice uniqueness and hashability.
# Example A: deduplicate emails case-insensitively
emails_a = ["A@x.com", "b@x.com", "A@x.com"]
unique = {e.lower() for e in emails_a}
print(unique)
# Example B: tag a record with a frozenset
record = {"id": 1, "tags": frozenset(["python", "oop"])}
print("python" in record["tags"])
Each assertion targets a core set property.
assert len({1, 1, 2}) == 2
assert 2 in {1, 2, 3} and 4 not in {1, 2, 3}
assert set("aab") == {"a", "b"}
assert isinstance(frozenset([1, 2]), frozenset)
Running the script prints roughly the following (set display order may differ):
empty, len: set() 0
small: {1, 2, 3}
letters: {'m', 'i', 's', 'p'}
frozen: frozenset({1, 2, 3})
missing: 10
grown: [1, 2, 3, 4]
contains 2: True
equal? : True
unhashable: unhashable type: 'list'
tag key: tag-1