Data structures are the nouns and algorithms are the verbs of programming. Fluency with both is what lets you take a new problem and turn it into a correct solution in minutes instead of hours. This lesson focuses on the handful of structures and patterns you will reach for most often — arrays, hash maps, stacks, queues, heaps, trees, and graphs — and on the problem-solving process that wraps around them.
The first skill is recognizing structure. A problem mentions lookups by key? Hash map. Balanced parentheses or undo? Stack. Shortest path in an unweighted graph? BFS on a queue. Top-k items? Heap. Sorted ranges and nearest neighbors? Sorted list with bisect. Learn to match problem phrasings to these structures automatically.
The second skill is complexity intuition. Know the cost of each operation on each structure: dict/set lookup O(1), list in O(n), list append O(1), list insert(0) O(n), deque both-end O(1), heap push/pop O(log n). Ballpark the total complexity before you write anything. Most “too slow” problems have an O(n log n) solution waiting to be written.
The third skill is disciplined practice. Solve a problem daily. Time-box (30-45 minutes) and read the editorial if stuck. Re-implement from memory a few days later. Track which patterns you've covered in a journal. Consistency beats marathon sessions — 30 problems over a month beats 30 problems in one weekend.
Structures you will use constantly
Arrays and strings: slicing, two pointers, sliding windows. Hash maps: frequency counts, seen-before checks, grouping by key. Stacks: nested structures, expression parsing. Queues: BFS, buffered streams. Heaps: top-k, Dijkstra, streaming medians. Trees: recursion patterns. Graphs: BFS, DFS, topological sort.
Python wraps each of these cleanly: list/str, dict/set/Counter, list-as-stack, collections.deque, heapq, manual tree classes, dict[node, list[node]] for graphs.
Problem-solving checklist
Read twice. Write inputs/outputs. Consider brute force. Identify the structural feature. Pick a structure. Write pseudocode. Code carefully. Test with given examples plus a couple of edge cases.
Afterward: did you pick the right structure? Could any step be simpler? Write down the lesson so the next problem benefits.
The practice toolbox.
| Tool | Purpose |
|---|---|
Counterclass | Tallies by key. |
dequeclass | O(1) both-end queue. |
defaultdictclass | Dict with auto-default values. |
heapqmodule | Priority-queue primitives. |
bisectmodule | O(log n) sorted search. |
itertoolsmodule | Combinations, permutations, etc. |
BFSalgorithm | Shortest path in unweighted graphs. |
DFSalgorithm | Explore paths, detect cycles. |
Practicing Problem Solving and Data Structures code example
The script implements a stack-based bracket checker, a BFS over a graph, and a top-k with a heap — three canonical patterns in one file.
# Lesson: Practicing Problem Solving and Data Structures
import heapq
from collections import Counter, defaultdict, deque
# 1) Balanced brackets with a stack
def is_balanced(s: str) -> bool:
pairs = {")": "(", "]": "[", "}": "{"}
stack: list[str] = []
for ch in s:
if ch in "([{":
stack.append(ch)
elif ch in ")]}":
if not stack or stack[-1] != pairs[ch]:
return False
stack.pop()
return not stack
print("balanced 1:", is_balanced("([]{})"))
print("balanced 2:", is_balanced("([)]"))
# 2) BFS on an adjacency-dict graph: shortest path length
def bfs_shortest(graph: dict[str, list[str]], start: str, goal: str) -> int:
queue: deque[tuple[str, int]] = deque([(start, 0)])
seen = {start}
while queue:
node, dist = queue.popleft()
if node == goal:
return dist
for nb in graph[node]:
if nb not in seen:
seen.add(nb)
queue.append((nb, dist + 1))
return -1
graph = defaultdict(list, {
"A": ["B", "C"],
"B": ["A", "D"],
"C": ["A", "D", "E"],
"D": ["B", "C", "F"],
"E": ["C"],
"F": ["D"],
})
print("A->F distance:", bfs_shortest(graph, "A", "F"))
# 3) Top-k most common words with a heap
def top_k_words(text: str, k: int = 3) -> list[tuple[str, int]]:
counts = Counter(text.lower().split())
return heapq.nlargest(k, counts.items(), key=lambda p: p[1])
print("top 3:", top_k_words(
"one two two three three three four four four four five"
))
Each solution matches a pattern:
1) Stack-based parsing: the stack size equals unmatched opens.
2) BFS uses a deque as a FIFO queue; first reach = shortest distance.
3) `heapq.nlargest(k, ...)` is O(n log k), not O(n log n).
4) Domain structures (graph as adjacency dict) often drop out of the problem statement.
Group anagrams with a dict.
from collections import defaultdict
def group_anagrams(words):
buckets = defaultdict(list)
for w in words:
key = "".join(sorted(w))
buckets[key].append(w)
return list(buckets.values())
print(group_anagrams(["bat", "tab", "eat", "tea", "ate", "hi"]))
Sanity checks.
assert is_balanced("")
assert is_balanced("()")
assert not is_balanced("(")
assert bfs_shortest({"A": ["B"], "B": ["A"]}, "A", "B") == 1
Running prints:
balanced 1: True
balanced 2: False
A->F distance: 3
top 3: [('four', 4), ('three', 3), ('two', 2)]