Introduction to Python Collections

A collection is any object that holds more than one value at the same time. Instead of keeping ten student names in ten separate variables, you keep them in a single list, tuple, set or dictionary and operate on the whole group at once. Collections are what turn a handful of variables into a program that can process real data: files of records, rows from an API, the items in a shopping cart, or the edges of a graph.

Python ships four built-in collections you will see in almost every program: list, tuple, dict and set. They differ along three axes — whether they keep a defined order, whether you can change them after creation, and whether they index by position or by key. Learning to pick the right one is the first step; the details of each come in the following lessons.

A second, equally important idea is that Python distinguishes between mutable and immutable collections. A mutable collection (list, dict, set) can be modified in place: you can add, remove or replace items without creating a new object. An immutable collection (tuple, frozenset, str) cannot be changed; any "edit" returns a new object. Mutability affects not only which methods are available but also whether the value can be used as a dictionary key or set element: keys and elements must be hashable, and Python's built-in mutable collections are not.
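A minimal sketch of the difference, using made-up values: the list keeps its identity when it changes, while "editing" a tuple or a string produces a new object.

```python
# Mutable: a list is changed in place, so it is still the same object.
names = ["ana", "ben"]
list_id_before = id(names)
names.append("cai")
list_same_object = id(names) == list_id_before

# Immutable: "editing" a tuple builds a new tuple and rebinds the name.
point = (1, 2)
original_point = point
point = point + (3,)
tuple_same_object = point is original_point

# Strings behave the same way: methods return new strings.
greeting = "hello"
shouted = greeting.upper()

print(names, list_same_object)                    # ['ana', 'ben', 'cai'] True
print(point, original_point, tuple_same_object)   # (1, 2, 3) (1, 2) False
print(greeting, shouted)                          # hello HELLO
```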

All four collections implement the common protocol of iteration: you can write for item in collection: and get every element once. This is the glue that lets Python's for loops, list comprehensions, and functions like sum, max and sorted work uniformly across every kind of container. When you learn one collection you have already learned half of the other three.
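To see the shared iteration protocol at work, the sketch below feeds the same three numbers (illustrative values) through sum, max, sorted and a list comprehension, once per container type.

```python
containers = {
    "list": [3, 1, 2],
    "tuple": (3, 1, 2),
    "set": {3, 1, 2},
    "dict": {3: "c", 1: "a", 2: "b"},  # iterating a dict yields its keys
}

summary = {}
for label, container in containers.items():
    # The same built-ins accept every container because each one is iterable.
    summary[label] = (sum(container), max(container), sorted(container))
    doubled = sorted(x * 2 for x in container)
    print(f"{label:5s}: {summary[label]} doubled={doubled}")
```

Every line reports the same totals, (6, 3, [1, 2, 3]), because the functions only ask the container for its elements one at a time.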

The four built-in containers at a glance

A list is an ordered, mutable sequence indexed by position (items[0]). Use it when order matters and you need to grow or shrink the collection over time. A tuple is an ordered, immutable sequence; use it for fixed records like coordinates (x, y) or for function return values that should not be mutated.
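As a small illustration (the names and the min_max helper are made up for this example), a list can grow, shrink and be edited by position, while a tuple works well as a fixed record or a multi-value return.

```python
# A list grows, shrinks and accepts replacement by position.
items = ["ana", "ben"]
items.append("cai")      # grow
items.remove("ben")      # shrink
items[0] = "ANA"         # replace in place

# A tuple is a fixed record: unpack it, but do not try to edit it.
point = (3, 4)
x, y = point
tuple_rejected = False
try:
    point[0] = 10        # tuples do not support item assignment
except TypeError:
    tuple_rejected = True

# Hypothetical helper: returning a tuple hands back several values at once.
def min_max(values):
    return min(values), max(values)

low, high = min_max([5, 2, 9])
print(items, (x, y), tuple_rejected, (low, high))
# ['ANA', 'cai'] (3, 4) True (2, 9)
```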

A dict maps keys to values and preserves insertion order since Python 3.7; look up a value with scores["ana"]. A set is an unordered collection of unique hashable items; it is perfect for deduplication and membership tests. Both dict and set rely on hashing, which is why dict keys and set elements must be hashable. In practice that means immutable values such as str, int, or a tuple whose items are themselves hashable; dict values, by contrast, can be any object.
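The hashing requirement can be checked directly. In this sketch, is_hashable is a hypothetical helper written for the example; it simply calls the built-in hash() and reports whether it succeeded.

```python
def is_hashable(value):
    """Return True if value can be used as a dict key or set element."""
    try:
        hash(value)
    except TypeError:
        return False
    return True

checks = {
    "str": is_hashable("ana"),
    "tuple of str": is_hashable(("ana", "ben")),
    "frozenset": is_hashable(frozenset({1, 2})),
    "list": is_hashable(["ana", "ben"]),
    "set": is_hashable({1, 2}),
    "tuple holding a list": is_hashable(("ana", [1, 2])),
}
for label, ok in checks.items():
    print(f"{label:22s} hashable? {ok}")

# Keys must be hashable, but values can be anything, including a list.
scores = {("ana", 2024): [91, 88]}
scores[("ana", 2024)].append(95)
print(scores)   # {('ana', 2024): [91, 88, 95]}
```

Note the last check: a tuple is only hashable when everything inside it is hashable too.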

Common operations that work everywhere

No matter which container you pick, len(c) gives the size, x in c tests membership, and for x in c: ... iterates over the elements. For dictionaries, iteration yields keys by default — use .values() or .items() for values or key/value pairs.
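The following sketch runs those three operations on sample data (the names and scores are illustrative) and shows the dict-specific detail that iteration and membership look at keys.

```python
names = ["ana", "ben", "ana"]
scores = {"ana": 91, "ben": 78}

# len() works on every container.
sizes = [len(c) for c in (names, tuple(names), set(names), scores)]
print(sizes)                      # [3, 3, 2, 2]

# `in` on a dict tests the keys, not the values.
print("ana" in scores)            # True
print(91 in scores)               # False
print(91 in scores.values())      # True

# Plain iteration over a dict yields keys; ask explicitly for the rest.
keys = [k for k in scores]
values = list(scores.values())
pairs = list(scores.items())
print(keys, values, pairs)
# ['ana', 'ben'] [91, 78] [('ana', 91), ('ben', 78)]
```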

You can convert between collections whenever their contents are compatible. list(my_tuple), set(my_list) and dict(pairs) are one-line conversions that come up constantly in data cleanup: deduplicate with a set, sort with a list, look up with a dict.
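Here is a short cleanup sketch with invented data that chains those conversions, plus one case where the contents are not compatible: dict() needs an iterable of key/value pairs, so a flat list of numbers is rejected.

```python
raw_ids = [3, 1, 3, 2, 1]

# Deduplicate with a set, then sort into a list.
unique_sorted = sorted(set(raw_ids))
print(unique_sorted)              # [1, 2, 3]

# Freeze the result as a tuple once it should no longer change.
frozen = tuple(unique_sorted)

# A sequence of pairs converts to a dict and back again.
pairs = (("ana", 91), ("ben", 78))
lookup = dict(pairs)
back_to_pairs = list(lookup.items())
print(lookup["ana"], back_to_pairs)   # 91 [('ana', 91), ('ben', 78)]

# Incompatible contents: plain integers are not key/value pairs.
conversion_failed = False
try:
    dict([1, 2, 3])
except TypeError:
    conversion_failed = True
print(conversion_failed)          # True
```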

These are the core tools for working with Python's built-in collections.

Tool          Kind                      Purpose
list          built-in type             Ordered, mutable sequence indexed by position.
tuple         built-in type             Ordered, immutable sequence for fixed records.
dict          built-in type             Key-to-value mapping preserving insertion order.
set           built-in type             Unordered collection of unique hashable items.
len()         built-in function         Returns the number of items in any container.
in            operator                  Tests whether a value is a member of a container.
sorted()      built-in function         Returns a new sorted list from any iterable.
collections   standard-library module   Specialized containers (deque, Counter, defaultdict, ...).
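The collections module is only named in passing here, so the following is a brief taste rather than a full tour. It applies Counter, defaultdict and deque to a small list of names chosen for illustration.

```python
from collections import Counter, defaultdict, deque

raw = ["ana", "ben", "ana", "cai", "ben", "dev"]

# Counter: tally occurrences in a single pass over the data.
counts = Counter(raw)
print(counts["ana"], counts["cai"])   # 2 1

# defaultdict: group items without checking whether the key exists yet.
by_initial = defaultdict(list)
for name in raw:
    by_initial[name[0]].append(name)
print(dict(by_initial))
# {'a': ['ana', 'ana'], 'b': ['ben', 'ben'], 'c': ['cai'], 'd': ['dev']}

# deque with maxlen: keep only the most recent three items.
recent = deque(maxlen=3)
for name in raw:
    recent.append(name)
print(list(recent))                   # ['cai', 'ben', 'dev']
```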

Introduction to Python Collections code example

The script below creates one of each built-in collection from the same raw data and shows how the same questions produce different answers depending on the container.

# Lesson: Introduction to Python Collections
# Same raw data, four containers, four perspectives.
raw = ["ana", "ben", "ana", "cai", "ben", "dev"]

as_list = list(raw)
as_tuple = tuple(raw)
as_set = set(raw)
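# sorted() fixes the key order, so the 'ana'/'ben' tie in max() resolves the same way on every run.
as_dict = {name: raw.count(name) for name in sorted(set(raw))}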

print(f"list  ({len(as_list):2d}):", as_list)
print(f"tuple ({len(as_tuple):2d}):", as_tuple)
print(f"set   ({len(as_set):2d}):", sorted(as_set))
print(f"dict  ({len(as_dict):2d}):", dict(sorted(as_dict.items())))

print("\n'ana' in each container?")
for name, container in ("list", as_list), ("tuple", as_tuple), ("set", as_set), ("dict", as_dict):
    print(f"  {name:5s}: {'ana' in container}")

print("\nunique names:", len(as_set))
print("most frequent:", max(as_dict, key=as_dict.get))

Keep these points in mind as you read the script:

1) `list(raw)` and `tuple(raw)` preserve order and duplicates; only the mutability differs.
2) `set(raw)` deduplicates and throws away order, which is why we `sorted()` it for display.
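3) The dict comprehension builds a name->count mapping over the unique names. Each `raw.count(name)` call rescans the whole list, so this is not a single pass; `collections.Counter(raw)` would count everything in one.
4) `in` works on every container but the cost differs: O(1) on average for set/dict, O(n) for list/tuple.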
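The cost difference in point 4 can be observed with timeit from the standard library. This is a rough sketch: the sizes and repeat count are arbitrary choices, and the absolute timings will vary from machine to machine.

```python
import timeit

n = 100_000
big_list = list(range(n))
big_set = set(big_list)
missing = -1   # absent value: the list has to be scanned to the end

# Time 50 membership tests against each container.
list_time = timeit.timeit(lambda: missing in big_list, number=50)
set_time = timeit.timeit(lambda: missing in big_set, number=50)

print(f"list: {list_time:.6f} s for 50 lookups")
print(f"set : {set_time:.6f} s for 50 lookups")
```

On typical hardware the set lookups finish far sooner, because a hash lookup does not depend on how many items the set holds.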

Try these snippets to feel how each container shapes the same data.

# Example A: turn a list of pairs into a dict
pairs = [("ana", 91), ("ben", 78), ("cai", 85)]
scores = dict(pairs)
print(scores["ben"])       # 78
print(list(scores.keys())) # ['ana', 'ben', 'cai']

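# Example B: deduplicate while keeping the first-seen order (reuses raw from the script above)
# seen.add(x) returns None, which is falsy, so a new x is recorded and kept in one step.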
seen = set()
ordered_unique = [x for x in raw if not (x in seen or seen.add(x))]
print(ordered_unique)       # ['ana', 'ben', 'cai', 'dev']

Run these to confirm the four containers really do what the text says.

assert len(as_list) == 6 and len(as_set) == 4
assert as_list[0] == "ana" and as_tuple[0] == "ana"
assert "ana" in as_dict and as_dict["ana"] == 2
assert type(as_set) is set and type(as_tuple) is tuple

Running the script prints:

list  ( 6): ['ana', 'ben', 'ana', 'cai', 'ben', 'dev']
tuple ( 6): ('ana', 'ben', 'ana', 'cai', 'ben', 'dev')
set   ( 4): ['ana', 'ben', 'cai', 'dev']
dict  ( 4): {'ana': 2, 'ben': 2, 'cai': 1, 'dev': 1}

'ana' in each container?
  list : True
  tuple: True
  set  : True
  dict : True

unique names: 4
most frequent: ana