Memory matters the moment data stops fitting comfortably in RAM. Two lines that look the same can behave very differently: sum([x*x for x in range(10**7)]) builds a 10-million-element list first, while sum(x*x for x in range(10**7)) streams through the same numbers using constant memory. Understanding when to pick each form is a genuinely useful skill.
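The difference is easy to see with `tracemalloc`'s peak counter. A minimal sketch (exact byte counts vary by platform and Python version):

```python
import tracemalloc

tracemalloc.start()

# Eager: the full 8 MB+ list exists before sum() sees its first element.
eager = sum([x * x for x in range(10**6)])
_, peak_eager = tracemalloc.get_traced_memory()

# Lazy: reset the peak, then stream the same values one at a time.
tracemalloc.reset_peak()
lazy = sum(x * x for x in range(10**6))
_, peak_lazy = tracemalloc.get_traced_memory()

tracemalloc.stop()
assert eager == lazy  # same answer either way
print(f"eager peak: {peak_eager:,} B, lazy peak: {peak_lazy:,} B")
```

The eager peak is dominated by the intermediate list; the lazy peak stays near zero because only one integer is alive at a time.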
Python exposes the tools you need to observe memory. sys.getsizeof(obj) returns the shallow byte size of an object (not its contents). tracemalloc can take before/after snapshots of the heap and tell you which line allocated the most. resource.getrusage reports process-level stats on Unix. Combined, they turn intuition into measurable numbers.
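A quick tour of two of these tools, as a sketch. Note that `sys.getsizeof` is shallow by design, and `resource` is Unix-only (`ru_maxrss` is reported in KiB on Linux but bytes on macOS):

```python
import resource
import sys

# Shallow size: the list header plus its pointer slots, not the int objects.
lst = list(range(10_000))
print(f"shallow list size: {sys.getsizeof(lst):,} bytes")

# Process-level view: peak resident set size for the whole interpreter.
peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(f"process peak RSS: {peak:,} (KiB on Linux)")
```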
The most common memory mistake is building a list you only need to iterate once: for row in list(query()) when for row in query() would do, or sum([x*x for x in it]) when sum(x*x for x in it) streams with constant memory. (One caveat: "".join materializes its argument into a sequence internally, so passing it a generator instead of a list comprehension saves little — the join win is over += loops, covered below.) List comprehensions are wonderful; they just aren't always the right container.
At larger scales you also run into the hidden cost of small objects. A Python integer is a full object with reference count and type pointer, so a list of a million ints is bigger than a million C ints. For truly huge numeric data, array.array, numpy, or pandas store values in compact typed buffers — often an order of magnitude smaller.
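The overhead is easy to measure on CPython (exact sizes vary by version and platform). A rough sketch comparing per-element cost:

```python
import array
import sys

n = 10_000
lst = list(range(n))
arr = array.array("i", range(n))

# The list's shallow size counts only 8-byte pointer slots...
per_slot = sys.getsizeof(lst) / n
# ...each pointed-at int object adds its own header on top.
per_int = sys.getsizeof(n)  # a typical non-cached small int, ~28 B on CPython
# The array stores raw 4-byte machine ints, no per-element object.
per_arr = sys.getsizeof(arr) / n

print(f"list slot: {per_slot:.1f} B + int object: {per_int} B "
      f"vs array slot: {per_arr:.1f} B")
```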
Observing memory
sys.getsizeof([1, 2, 3]) returns the bytes of the list object itself, not the integers it points at. To account for contents, walk the structure or use pympler.asizeof (third party).
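Walking the structure yourself can look like the sketch below. `deep_sizeof` is a hypothetical helper for illustration — it handles only a few container types, unlike pympler's full traversal:

```python
import sys

def deep_sizeof(obj, seen=None):
    """Rough total bytes of obj plus everything it references (illustrative)."""
    seen = set() if seen is None else seen
    if id(obj) in seen:        # count shared objects once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_sizeof(k, seen) + deep_sizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_sizeof(x, seen) for x in obj)
    return size

nested = [list(range(100)) for _ in range(10)]
print("shallow:", sys.getsizeof(nested), "deep:", deep_sizeof(nested))
```

The shallow number covers only the outer list's ten pointers; the deep number includes the inner lists and their contents.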
tracemalloc.start(), tracemalloc.take_snapshot() and snapshot.statistics("lineno") let you localize hotspots to specific lines — invaluable when optimizing a real script.
Choosing the right container
List of primitives: fine up to roughly 100k items. Generator expression: constant memory; use whenever you only iterate once. array.array: typed C buffer — half the shallow size of a list (4-byte slots vs 8-byte pointers), and far smaller still once you count the int objects the list points at. numpy.ndarray: typed, vectorized, dramatically faster for numeric work.
For string-heavy data, str.join is the canonical way to assemble output, not += in a loop (which allocates a new string each iteration).
Measurement and container tools.
| Tool | Purpose |
|---|---|
| `sys.getsizeof(obj)` (function) | Shallow byte size of one object. |
| `tracemalloc` (module) | Allocation snapshots by line number. |
| `resource.getrusage` (function) | Process-level memory stats (Unix). |
| `array.array` (class) | Compact typed numeric buffer. |
| `numpy.ndarray` (class) | Typed, vectorized numeric arrays. |
| `memoryview(buf)` (class) | Zero-copy views into buffers. |
| `(e for x in it)` (syntax) | Streaming alternative to a list comprehension. |
| `pandas.DataFrame` (class) | Columnar storage for tabular data. |
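The `memoryview` row deserves a quick demonstration: slicing a `bytearray` copies the bytes, while slicing a memoryview shares the underlying buffer. A minimal sketch:

```python
import sys

buf = bytearray(1_000_000)

copy_slice = buf[:500_000]               # new ~500 KB bytearray
view_slice = memoryview(buf)[:500_000]   # small view object, zero copy

print("copy:", sys.getsizeof(copy_slice), "B  view:",
      sys.getsizeof(view_slice), "B")

# Writes through the view mutate the original buffer.
view_slice[0] = 255
assert buf[0] == 255
```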
Code example: Comparing Memory Usage in Data Processing
The script compares list, generator, and array storage for the same data and reports their sizes.
```python
# Lesson: Comparing Memory Usage in Data Processing
import array
import sys
import tracemalloc

n = 1_000_000

# Size of a list vs a generator expression vs a typed array
as_list = list(range(n))
as_gen = (x for x in range(n))
as_arr = array.array("i", range(n))
print(f"list  of {n:,} ints: {sys.getsizeof(as_list):11,} bytes (shallow)")
print(f"gen   of {n:,} ints: {sys.getsizeof(as_gen):11,} bytes")
print(f"array of {n:,} ints: {sys.getsizeof(as_arr):11,} bytes")

# Tracemalloc: find the hotspot between two snapshots
tracemalloc.start()
before = tracemalloc.take_snapshot()
big = [x * x for x in range(500_000)]   # eager: materializes the list
s = sum(x * x for x in range(500_000))  # lazy: constant memory
after = tracemalloc.take_snapshot()
top_stats = after.compare_to(before, "lineno")[:3]
print("\ntop 3 allocations:")
for stat in top_stats:
    print("  ", stat)
tracemalloc.stop()

print("\nlen big:", len(big), "sum:", s)
```
Read the output keeping these facts in mind:
1) `getsizeof(list)` reports only the list header; the ints live elsewhere.
2) The generator's size is tiny regardless of the stream length.
3) `array('i', ...)` stores raw machine ints; far smaller than a list.
4) `tracemalloc` pinpoints the eager comprehension as the top allocator.
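Fact 2 is easy to check directly: a generator object's size does not depend on how long the stream it wraps would be, because nothing is materialized until iteration.

```python
import sys

g_small = (x for x in range(10))
g_big = (x for x in range(10**9))  # a billion values, never materialized

# Both generator objects are the same handful of bytes.
print(sys.getsizeof(g_small), sys.getsizeof(g_big))
```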
Convert eager code to lazy where appropriate.
```python
# Before: build a huge intermediate list
total = sum([x for x in range(1_000_000) if x % 2 == 0])
# After: no intermediate list
total2 = sum(x for x in range(1_000_000) if x % 2 == 0)
assert total == total2

# String concatenation: the right way
parts = [str(i) for i in range(1000)]
s_right = ",".join(parts)  # O(total length)

# Wrong way (quadratic, avoid!):
# s_wrong = ""
# for p in parts:
#     s_wrong += p + ","

print(len(s_right))
```
A few size-ratio checks.
```python
import array
import sys

# The array's 4-byte slots beat the list's 8-byte pointers outright.
arr = array.array("i", range(1000))
lst = list(range(1000))
assert sys.getsizeof(arr) < sys.getsizeof(lst)

# A generator object is far smaller than any materialized list.
gen = (x for x in range(1000))
assert sys.getsizeof(gen) < sys.getsizeof(lst)
```
Output varies by platform but looks like:
```
list  of 1,000,000 ints:   8,000,056 bytes (shallow)
gen   of 1,000,000 ints:         208 bytes
array of 1,000,000 ints:   4,000,064 bytes

top 3 allocations:

len big: 500000 sum: 41666541666750000
```