Creating and Using Custom Data

Every class you define is a new data type as far as Python is concerned. You can create instances of it, check isinstance, pass it to type-hinted functions, store it in containers, and compare it to other values. Thinking about your classes as types — with a named role in the system — leads to cleaner APIs than thinking about them as containers for data alone.

Custom types earn their keep whenever the built-ins stop carrying enough meaning. A float is fine for a price; a Money class that keeps amount and currency together is better at a bank. A dict works for a user; a User class with validated fields and a few helper methods prevents a whole class of bugs. The tradeoff is roughly five lines of class scaffolding for much clearer error messages and richer autocomplete.

Python gives you three sensible starting points. A plain class with __init__ is the most flexible and the most verbose. A @dataclass adds __init__, __repr__ and __eq__ for you from a few field declarations, making it ideal for records. A typing.NamedTuple is the immutable cousin: everything @dataclass(frozen=True) offers, plus tuple-style unpacking and indexing.

Whichever form you pick, good custom types share the same habits: clear field names with type hints, a short docstring, a useful __repr__, and validation in __init__ if the class has invariants. Once that foundation is in place, new behaviors are cheap to add.

Three sensible starting points

class: maximum flexibility, verbose for simple data. @dataclass: perfect for records with identity based on fields. NamedTuple: record that is also a tuple, fully immutable, hashable.

@dataclass(frozen=True) gives you an immutable dataclass that also works as a dict key or set member — the middle ground between a NamedTuple and a full class.

Validation and invariants

Put validation in __init__. Raise ValueError (wrong value) or TypeError (wrong type) as appropriate, and include the offending value in the message. @dataclass offers __post_init__ for the same purpose after the auto-generated __init__ runs.

Invariants are rules that must hold for every instance: “amount ≥ 0”, “currency is a 3-letter ISO code”. Enforcing them at construction time means every downstream function can assume them for free.

Tools for building custom data types.

Tool	Purpose
`class Name:` statement	Plain class definition.
`@dataclass` decorator	Generates __init__, __repr__, __eq__.
`@dataclass(frozen=True)` decorator	Immutable dataclass, hashable.
`typing.NamedTuple` base class	Typed immutable record / tuple.
`field(default_factory=list)` function	Safe mutable defaults for dataclass fields.
`__post_init__` method	Hook for validation after dataclass init.
`enum.Enum` class	Named constants as a type.
`typing.NewType` helper	Creates distinct name for an existing type (static only).

Creating and Using Custom Data Types code example

The script compares the three common styles by defining the same concept (Money) three ways.

# Lesson: Creating and Using Custom Data Types
from dataclasses import dataclass, field
from typing import NamedTuple


# Style 1: plain class
class MoneyClass:
    def __init__(self, amount: float, currency: str = "EUR") -> None:
        if amount < 0:
            raise ValueError(f"amount must be >= 0: {amount}")
        if len(currency) != 3:
            raise ValueError(f"currency must be 3 letters: {currency!r}")
        self.amount = amount
        self.currency = currency

    def __repr__(self) -> str:
        return f"MoneyClass({self.amount}, {self.currency!r})"


# Style 2: dataclass with validation hook
@dataclass
class Money:
    amount: float
    currency: str = "EUR"
    tags: list[str] = field(default_factory=list)

    def __post_init__(self):
        if self.amount < 0:
            raise ValueError("amount must be non-negative")


# Style 3: immutable NamedTuple
class Price(NamedTuple):
    amount: float
    currency: str = "EUR"


a = MoneyClass(10, "USD")
b = Money(10, "USD")
c = Price(10, "USD")

print(a); print(b); print(c)

# dataclass-generated __eq__ compares by fields
print("b == Money(10, 'USD'):", b == Money(10, "USD"))

# NamedTuples are hashable and iterable
print("hash:", hash(c), "| tuple:", tuple(c))

# Validation
try:
    Money(-1)
except ValueError as err:
    print("caught:", err)

Compare the three styles:

1) Plain class: maximum flexibility, you write everything explicitly.
2) `@dataclass`: auto-generated `__init__`/`__repr__`/`__eq__`; validation in `__post_init__`.
3) `NamedTuple`: immutable, hashable, iterable, smallest memory footprint.
4) Pick the least powerful option that solves the problem.

Write a small dataclass with validation.

from dataclasses import dataclass

@dataclass
class Rectangle:
    width: float
    height: float
    def __post_init__(self):
        if self.width <= 0 or self.height <= 0:
            raise ValueError("width and height must be positive")
    @property
    def area(self) -> float:
        return self.width * self.height

r = Rectangle(3, 4)
print(r, "area=", r.area)
try:
    Rectangle(0, 1)
except ValueError as err:
    print("invalid:", err)

Verify the auto-generated equality semantics.

from dataclasses import dataclass
@dataclass
class X:
    a: int
    b: int
assert X(1, 2) == X(1, 2)
assert X(1, 2) != X(1, 3)
from typing import NamedTuple
class Y(NamedTuple):
    a: int
assert hash(Y(1)) == hash(Y(1))

Running prints:

MoneyClass(10, 'USD')
Money(amount=10, currency='USD', tags=[])
Price(amount=10, currency='USD')
b == Money(10, 'USD'): True
hash: 6793110 | tuple: (10, 'USD')
caught: amount must be non-negative

Three sensible starting points

Validation and invariants

Creating and Using Custom Data Types code example

Related Resources