Improving Code Structure

Refactoring is the practice of changing how code is written without changing what it does. You take a working program, split long functions, rename confusing variables, extract a class, remove duplication — and at each step the behavior stays identical. Done continuously, refactoring keeps a codebase shaped for change. Skipped, even good code slowly hardens into something nobody wants to touch.

The first requirement for safe refactoring is a test suite. Without tests, you can't tell whether a change preserved behavior. With even a handful of tests covering the main paths, refactoring becomes risk-free: run the tests before and after, and if both runs agree, you kept the contract. This is the real reason tests exist.

The second requirement is making small changes. Rename one variable at a time, not twenty. Extract one function before extracting another. Commit frequently. Each commit should either add a test or refactor, never both. That discipline makes it easy to find the bisecting commit if something breaks later.

The classic moves are well catalogued: Extract Function (pull a chunk out into its own named function), Inline Variable (replace a one-use temp with its expression), Replace Conditional with Polymorphism (turn type-checking if/else ladders into a small hierarchy), Introduce Parameter Object (group parameters that always travel together). Martin Fowler's Refactoring book is the reference.

Extract Function and rename

When a function grows past a screen or starts needing comments to explain sections, extract the sections into named helpers. The helpers' names become the comments. The original function stays short and reads like a narrative.

Rename variables and functions once their purpose becomes clearer than when you wrote them. Modern IDEs rename safely across a whole project in seconds.

Recognize patterns, resist big rewrites

Duplication is usually the signal that a refactor is possible. If two branches do the same work with slightly different parameters, introduce a helper. If two classes mirror each other's methods, factor out a shared base or a collaborator.

Avoid the “big rewrite” temptation. Continuous small refactors are safer, cheaper, and more effective than rewriting a module from scratch.

Refactoring aids and references.

ToolPurpose
Fowler, Refactoring
book
The canonical catalogue of moves.
pytest
test runner
Safety net for each refactor.
ruff
linter
Detects dead code, unused imports, complexity.
mypy
type checker
Catches contract breaks across modules.
radon
complexity tool
Reports cyclomatic complexity per function.
rope
refactor library
Automated rename/extract refactors.
notebook refactor
IDE feature
Built into modern IDEs (VS Code, PyCharm).
Refactoring ≠ rewriting
article
When refactoring stops and rewriting starts.

Improving Code Structure code example

The script takes a tangled function and refactors it in two steps, keeping a small test suite green.

# Lesson: Improving Code Structure
import unittest


# Step 0: the original, hard-to-follow implementation
def summarize_v0(rows: list[dict]) -> dict:
    total = 0
    n_paid = 0
    names = []
    for r in rows:
        if r.get("amount", 0) > 0:
            total += r["amount"]
            if r.get("paid"):
                n_paid += 1
                if r.get("name"):
                    names.append(r["name"])
    return {"total": total, "paid": n_paid, "names": sorted(names)}


# Step 1: extract a predicate, extract a paid-name collector
def _is_positive_amount(row: dict) -> bool:
    return row.get("amount", 0) > 0


def _paid_names(rows: list[dict]) -> list[str]:
    return sorted(r["name"] for r in rows if r.get("paid") and r.get("name"))


# Step 2: the refactored function reads like a summary itself
def summarize(rows: list[dict]) -> dict:
    active = [r for r in rows if _is_positive_amount(r)]
    return {
        "total": sum(r["amount"] for r in active),
        "paid": sum(1 for r in active if r.get("paid")),
        "names": _paid_names(active),
    }


class TestSummarize(unittest.TestCase):
    def setUp(self):
        self.rows = [
            {"amount": 10, "paid": True, "name": "ana"},
            {"amount": 0,  "paid": True, "name": "ben"},
            {"amount": 5,  "paid": False, "name": "cai"},
            {"amount": 20, "paid": True},  # no name
        ]

    def test_matches_original(self):
        self.assertEqual(summarize(self.rows), summarize_v0(self.rows))


unittest.TextTestRunner(verbosity=2).run(
    unittest.TestLoader().loadTestsFromTestCase(TestSummarize)
)
print("refactored result:", summarize([
    {"amount": 10, "paid": True, "name": "ana"},
    {"amount": 20, "paid": True},
]))

Observe each refactor move:

1) Extract Function: `_is_positive_amount` and `_paid_names` replace nested ifs.
2) The top-level `summarize` now reads like a summary of summaries.
3) The test compares new vs old output — behavior preservation is proven.
4) Small, green steps beat a big dramatic rewrite.

Refactor a function by extracting a predicate.

def keep_valid_emails(rows):
    out = []
    for r in rows:
        e = r.get("email", "")
        if "@" in e and len(e) < 100 and not e.startswith("."):
            out.append(e.lower())
    return sorted(set(out))

def _is_valid_email(e: str) -> bool:
    return "@" in e and len(e) < 100 and not e.startswith(".")

def keep_valid_emails_v2(rows):
    return sorted({r["email"].lower() for r in rows if _is_valid_email(r.get("email", ""))})

rows = [{"email": "A@x.com"}, {"email": ".bad"}, {"email": "a@x.com"}]
print(keep_valid_emails(rows) == keep_valid_emails_v2(rows))

Proof that the refactor preserves behavior.

def summarize_v0(rows):
    return [r for r in rows if r.get("x")]
def summarize_v1(rows):
    return [r for r in rows if r.get("x")]
rows = [{"x": 1}, {"x": 0}, {"x": 2}]
assert summarize_v0(rows) == summarize_v1(rows)

Running prints:

test_matches_original (__main__.TestSummarize.test_matches_original) ... ok

----------------------------------------------------------------------
Ran 1 test in 0.001s

OK
refactored result: {'total': 30, 'paid': 2, 'names': ['ana']}