Building a Type-Safe Trade Data Library with Python's typing Module

Dima, Jun 24, 2026

Introduction

Python has always been dynamically typed — a variable can hold any value, a function can return anything, and the interpreter never complains until something breaks at runtime. For a ten-line script, this is fine. For a shared library that five engineers call from fifteen modules, it becomes a serious maintenance problem. A function named calculate_fee(trade, rate) tells you nothing about what trade is supposed to be, whether rate is a float or a decimal string, or what the return value represents. The only documentation is the runtime error you get when you pass the wrong thing.

Python's typing module, available since Python 3.5 and extended in most releases since, adds a layer of machine-readable documentation to function signatures and variable declarations. Type annotations do not change runtime behavior: the interpreter evaluates and stores them but never enforces them. Instead, they feed static analysis tools like mypy and pyright, which catch type mistakes before the code runs. They also power IDE autocompletion and inline documentation, which make a library usable without reading its source.

This tutorial builds a trade data library for a financial analytics service, starting from completely unannotated code and progressively adding the type vocabulary that makes the library self-documenting and statically verifiable. Every section introduces one concept — basic annotations, Optional and Union, generic containers, TypeVar, Protocol, TypedDict — and shows both how the annotation is written and what class of bug it prevents.


Background

Type annotations in Python are syntactic sugar: def f(x: int) -> str: attaches annotation objects to the function but does not enforce anything at runtime. The enforcement comes from external tools:

  • mypy: the canonical static type checker for Python; run with python3 -m mypy filename.py
  • pyright: Microsoft's type checker, used by Pylance in VS Code
  • IDE support: annotations enable autocompletion, inline documentation, and "go to definition" for the types a function returns
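
To make the "not enforced at runtime" point concrete, here is a minimal sketch. The function name compute_fee is hypothetical and used only for illustration; it is not part of the library built below.

```python
def compute_fee(notional: float, rate: float) -> float:
    # Hypothetical helper: the annotations promise floats in, float out.
    return notional * rate

# The annotations are stored on the function object as plain data...
print(compute_fee.__annotations__)

# ...but the interpreter never checks them: a str and an int are accepted,
# and the "fee" silently becomes a repeated string.
print(compute_fee("ab", 2))
```

The second call is exactly the kind of mistake a static checker reports and the interpreter does not.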

Key vocabulary used throughout this tutorial:

  • Optional[T]: shorthand for Union[T, None]; indicates a value that might be absent
  • Union[A, B]: a value that can be either type A or type B; written A | B in Python 3.10+
  • TypeVar: a placeholder for a type that the caller provides; used to write generic functions
  • Protocol: defines an interface structurally — any class with the required methods satisfies it, without inheriting from it
  • TypedDict: a dict with a fixed set of keys and typed values; used when a function must accept or return a dictionary with a known shape
  • TYPE_CHECKING: a flag that is False at runtime and True only during static analysis; used to avoid circular imports in type annotations
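
TYPE_CHECKING is not used in the sections that follow, so here is a short sketch of the pattern. It assumes a hypothetical function fee_for and uses decimal.Decimal as a stand-in for a type that lives in a module you would rather not import at runtime (for example, one that imports this module back).

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only static type checkers follow this import; the interpreter skips it.
    from decimal import Decimal

# The annotations are quoted so they are never evaluated at runtime,
# where the name Decimal has not been imported.
def fee_for(notional: "Decimal", rate: "Decimal") -> "Decimal":
    return notional * rate

print(TYPE_CHECKING)                         # False when the script runs
print(fee_for.__annotations__["notional"])   # the string 'Decimal'
```

A type checker still resolves "Decimal" to the real class, so call sites are verified even though the import never executes.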

Practical Scenario

A quantitative analytics team maintains a Python library used by three services: a risk engine, a reporting dashboard, and a real-time pricing feed. The library processes trade records — instrument tickers, quantities, prices, and computed notionals — and provides functions to filter, aggregate, and transform them.

Six engineers contribute to the library. Without type annotations, every function call is a guessing game: does get_trades_by_ticker return a list or a generator? Does compute_notional accept a dict or a Trade object? Does aggregate_by_currency return a dict keyed by string or by some enum? The answers are buried in the implementation or, worse, in each engineer's memory. When a new engineer joins and passes a float where an int is expected for quantity, the error appears as a wrong calculation in a production report two weeks later.

The team needs a library where a type checker can verify every call site against the function signature before the code is deployed, IDE autocompletion shows exactly what fields a return value has, and adding a new field to a trade record triggers type errors at every caller that does not handle it — rather than silent runtime surprises.


The Problem

Create the initial library:

touch trades.py

Add the following code to trades.py, then run it using:

python3 trades.py

def compute_notional(trade):
    return trade["quantity"] * trade["price"]

def get_trades_above(trades, threshold):
    return [t for t in trades if compute_notional(t) >= threshold]

def summarize(trades):
    total = 0
    for t in trades:
        total += compute_notional(t)
    return {"count": len(trades), "total_notional": total}

trades = [
    {"ticker": "AAPL", "quantity": 100, "price": 182.30},
    {"ticker": "MSFT", "quantity": 50,  "price": 415.10},
    {"ticker": "TSLA", "quantity": 200, "price": 172.80},
]

# A caller passes a string quantity — no error until compute_notional
bad_trade = {"ticker": "NVDA", "quantity": "200", "price": 875.50}
try:
    print(compute_notional(bad_trade))
except TypeError as e:
    print(f"Runtime error: {e}")

result = summarize(get_trades_above(trades, 18000))
print(f"Summary: {result}")


Runtime error: can't multiply sequence by non-int of type 'float'
Summary: {'count': 3, 'total_notional': 73545.0}


The "200" string in bad_trade causes a TypeError at runtime. A static type checker would have caught this immediately — but only if compute_notional had a signature declaring that trade is a dict with a numeric quantity field, not an arbitrary object. Without annotations, the type checker has nothing to check.


Basic Function Annotations

Function parameters and return types are annotated with : Type and -> Type. The annotation applies to the function's interface — what callers must provide and what they can rely on receiving back.

Replace the entire content of trades.py with the following:

def compute_notional(quantity: int, price: float) -> float:
    return quantity * price

def format_trade(ticker: str, quantity: int, price: float) -> str:
    notional = compute_notional(quantity, price)
    return f"{ticker}: qty={quantity} px={price:.2f} notional={notional:,.2f}"

# Correct calls
print(format_trade("AAPL", 100, 182.30))
print(format_trade("MSFT", 50, 415.10))

# A static type checker flags this call: quantity should be int, not str.
# At runtime Python still executes it and raises a TypeError.
try:
    print(compute_notional("100", 182.30))   # type: ignore  # suppressed for demonstration
except TypeError as e:
    print(f"Runtime TypeError: {e}")


AAPL: qty=100 px=182.30 notional=18,230.00
MSFT: qty=50 px=415.10 notional=20,755.00
Runtime TypeError: can't multiply sequence by non-int of type 'float'


The annotations quantity: int and price: float are declarations of intent. With the # type: ignore comment removed, running python3 -m mypy trades.py reports error: Argument 1 to "compute_notional" has incompatible type "str"; expected "int" for that call. The runtime behavior is unchanged, but the error is now catchable before deployment.

Function signatures become machine-readable contracts. An IDE shows (quantity: int, price: float) -> float in the autocompletion tooltip — the caller knows exactly what to pass without reading the body. A type checker verifies every call site automatically. The # type: ignore comment above is the escape hatch for the rare case where you deliberately bypass checking; its presence is itself a signal that something unusual is happening.
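
The tooltip an IDE shows is built from data that is also available at runtime. As a rough illustration, the standard inspect module can read the same contract back from the function object:

```python
import inspect

def compute_notional(quantity: int, price: float) -> float:
    return quantity * price

sig = inspect.signature(compute_notional)
print(sig)   # (quantity: int, price: float) -> float

# Each parameter carries its annotation as a real object, not as text.
for name, param in sig.parameters.items():
    print(f"{name}: {param.annotation.__name__}")
```

Nothing here validates arguments; it only shows that the signature is structured data that tools can consume.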


Optional and Union for Values That May Be Absent

A trade might not have a sector classification yet — the enrichment service has not processed it. A function that returns the sector either returns a str or returns None. Without the annotation Optional[str], callers cannot know whether they need to guard against None, and a type checker cannot detect the case where they forget to.

Replace the entire content of trades.py with the following:

from typing import Optional, Union

SECTOR_MAP: dict[str, str] = {
    "AAPL": "Technology",
    "MSFT": "Technology",
    "JPM":  "Financials",
}

def get_sector(ticker: str) -> Optional[str]:
    return SECTOR_MAP.get(ticker)

def describe_trade(ticker: str, quantity: int, price: float) -> str:
    sector: Optional[str] = get_sector(ticker)
    if sector is None:
        sector_label = "Unclassified"
    else:
        sector_label = sector
    return f"{ticker} [{sector_label}]: qty={quantity} @ {price:.2f}"

# Union: a field that can come in as int or float depending on the data source
def normalize_quantity(raw: Union[int, float]) -> int:
    return int(raw)

print(describe_trade("AAPL", 100, 182.30))
print(describe_trade("TSLA", 200, 172.80))   # TSLA not in SECTOR_MAP
print(f"Normalized: {normalize_quantity(150.0)}")
print(f"Normalized: {normalize_quantity(200)}")


AAPL [Technology]: qty=100 @ 182.30
TSLA [Unclassified]: qty=200 @ 172.80
Normalized: 150
Normalized: 200


Optional[str] is equivalent to Union[str, None]. A type checker enforces that callers of get_sector cannot use the return value as a str without a None check — accessing .upper() on the result directly would be a type error. In Python 3.10+, write str | None and int | float instead of Optional[str] and Union[int, float].
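
The equivalence between the spellings can be checked directly. This is a small sketch; the version guard is there only because the | operator on types needs Python 3.10 or later.

```python
import sys
from typing import Optional, Union, get_args

# Optional[T] is only a shorter spelling of Union[T, None].
print(Optional[str] == Union[str, None])   # True
print(get_args(Optional[str]))             # (<class 'str'>, <class 'NoneType'>)

if sys.version_info >= (3, 10):
    # The PEP 604 spelling compares equal to the typing-module spelling.
    print((str | None) == Optional[str])   # True
```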

Optional[T] makes the possibility of absence explicit in the signature. Without it, every caller guesses whether to guard against None, and some forget. With it, a type checker enforces the guard at every call site without any runtime overhead.

Note: Like every annotation, Optional[str] is not enforced at runtime; it only signals intent and enables static checking. On critical paths where a missing value must not propagate, use assert sector is not None or raise an exception explicitly.
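
One way to combine the static guard with a runtime guard is a wrapper that converts None into an exception. The sketch below is an assumption about how that could look; require_sector and the SECTORS table are hypothetical names, not part of the library above.

```python
from typing import Optional

SECTORS: dict[str, str] = {"AAPL": "Technology"}   # hypothetical lookup table

def get_sector(ticker: str) -> Optional[str]:
    return SECTORS.get(ticker)

def require_sector(ticker: str) -> str:
    """Return the sector, or fail loudly when the ticker is unclassified."""
    sector = get_sector(ticker)
    if sector is None:
        raise ValueError(f"no sector classification for {ticker}")
    # After the None check, a type checker narrows sector from Optional[str] to str.
    return sector

print(require_sector("AAPL"))
try:
    require_sector("TSLA")
except ValueError as e:
    print(f"Rejected: {e}")
```

Callers of require_sector get a plain str in the signature, so the None handling lives in one place instead of at every call site.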


Generic Containers: list, dict, tuple

Functions that operate on collections should declare not just that they accept list but what type the list contains. A list[dict] is still untyped at the element level; a list[Trade] tells both the caller and the type checker exactly what the list holds.

Replace the entire content of trades.py with the following:

from dataclasses import dataclass

@dataclass
class Trade:
    ticker:   str
    quantity: int
    price:    float

    @property
    def notional(self) -> float:
        return self.quantity * self.price

def filter_above(trades: list[Trade], threshold: float) -> list[Trade]:
    return [t for t in trades if t.notional >= threshold]

def group_by_ticker(trades: list[Trade]) -> dict[str, list[Trade]]:
    groups: dict[str, list[Trade]] = {}
    for t in trades:
        groups.setdefault(t.ticker, []).append(t)
    return groups

def trade_summary(trade: Trade) -> tuple[str, int, float]:
    return (trade.ticker, trade.quantity, trade.notional)

trades = [
    Trade("AAPL", 100, 182.30),
    Trade("MSFT", 50,  415.10),
    Trade("AAPL", 200, 182.30),
    Trade("TSLA", 30,  172.80),
]

large = filter_above(trades, 18000.0)
print(f"Large trades: {[t.ticker for t in large]}")

grouped = group_by_ticker(trades)
for ticker, group in grouped.items():
    total = sum(t.notional for t in group)
    print(f"{ticker}: {len(group)} trades, total notional {total:,.2f}")

summary = trade_summary(trades[0])
ticker, qty, notional = summary
print(f"Summary tuple: {ticker=} {qty=} {notional=:.2f}")


Large trades: ['AAPL', 'MSFT', 'AAPL']
AAPL: 2 trades, total notional 54,690.00
MSFT: 1 trades, total notional 20,755.00
TSLA: 1 trades, total notional 5,184.00
Summary tuple: ticker='AAPL' qty=100 notional=18230.00


list[Trade] (Python 3.9+) requires no import and is equivalent to the older List[Trade] from typing. dict[str, list[Trade]] specifies both the key type and the nested value type. tuple[str, int, float] describes a fixed-length tuple with specific types at each position — distinct from tuple[str, ...] which is a variable-length tuple of strings.

A type checker now knows that filter_above returns list[Trade], so accessing .ticker on each element is valid, and passing the result to group_by_ticker is valid. Passing a list[str] to filter_above is a type error caught before runtime. Without the annotation, the function signature communicates nothing about element types and the type checker cannot verify callers.
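
The difference between a fixed-length tuple and tuple[str, ...] is easy to miss, so here is a brief sketch. The Quote alias and the tickers_of function are hypothetical names introduced for the example.

```python
# Fixed-length: exactly two positions, (ticker, price), each with its own type.
Quote = tuple[str, float]

def tickers_of(quotes: list[Quote]) -> tuple[str, ...]:
    """Return the tickers as a variable-length tuple of strings."""
    return tuple(ticker for ticker, _price in quotes)

quotes: list[Quote] = [("AAPL", 182.30), ("MSFT", 415.10), ("TSLA", 172.80)]

ticker, price = quotes[0]      # unpacking is safe: the length is part of the type
print(ticker, price)
print(tickers_of(quotes))      # length depends on the input, hence the "..."
```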


TypeVar for Generic Functions

Some functions operate on collections without caring what they contain — they filter, sort, or batch items using only a key function or a count. A function that returns the first n items of any list should not be restricted to list[Trade]; it should work for list[str], list[int], or any list. TypeVar expresses this: the function is generic in T, and whatever T the caller provides is what it returns.

Replace the entire content of trades.py with the following:

from dataclasses import dataclass
from typing import TypeVar, Callable

T = TypeVar("T")

@dataclass
class Trade:
    ticker:   str
    quantity: int
    price:    float

    @property
    def notional(self) -> float:
        return self.quantity * self.price

def first_n(items: list[T], n: int) -> list[T]:
    return items[:n]

def top_by(items: list[T], key: Callable[[T], float], n: int = 3) -> list[T]:
    return sorted(items, key=key, reverse=True)[:n]

trades = [
    Trade("AAPL", 100, 182.30),
    Trade("MSFT", 50,  415.10),
    Trade("NVDA", 300, 875.50),
    Trade("TSLA", 30,  172.80),
    Trade("AMZN", 80,  185.20),
]

# Works for list[Trade]
top_trades = top_by(trades, key=lambda t: t.notional, n=2)
print("Top 2 by notional:")
for t in top_trades:
    print(f"  {t.ticker}: {t.notional:,.2f}")

# Works identically for list[str]
names = ["Zebra", "Apple", "Mango", "Banana"]
top_names = top_by(names, key=lambda s: len(s), n=2)
print(f"\nLongest 2 names: {top_names}")

# first_n preserves the element type
first_two: list[Trade] = first_n(trades, 2)
print(f"\nFirst 2 tickers: {[t.ticker for t in first_two]}")


Top 2 by notional:
  NVDA: 262,650.00
  MSFT: 20,755.00

Longest 2 names: ['Banana', 'Zebra']

First 2 tickers: ['AAPL', 'MSFT']


T = TypeVar("T") declares a type variable. When a type checker sees first_n(trades, 2) where trades: list[Trade], it infers T = Trade and knows the return type is list[Trade]. When it sees first_n(names, 2) where names: list[str], it infers T = str and the return type is list[str]. The same function handles both cases without any Union or overloading.

TypeVar expresses "the output type is the same as the input type" — a relationship that Union cannot express. Without it, a generic function would return list[Any], losing all type information about its result. With TypeVar, callers get full type inference on the return value.
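
The same idea extends to the batching case mentioned at the start of this section. The function below is a hypothetical addition, sketched to show T appearing inside a nested container type:

```python
from typing import TypeVar

T = TypeVar("T")

def batched(items: list[T], size: int) -> list[list[T]]:
    """Split items into consecutive chunks of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]

# T is inferred as str here, so the result is list[list[str]].
print(batched(["AAPL", "MSFT", "NVDA", "TSLA", "AMZN"], 2))

# T is inferred as int here, so the result is list[list[int]].
print(batched([1, 2, 3], 5))
```

A checker infers the element type of each inner list from the argument, so code that consumes the batches keeps full type information.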


Protocol for Structural Typing

A Protocol defines an interface by the methods a type must have, not by inheritance. Any class that implements the required methods satisfies the protocol, without declaring it anywhere. This is structural typing — the same duck typing Python has always supported, but with static verification.

Replace the entire content of trades.py with the following:

from typing import Protocol, runtime_checkable

@runtime_checkable
class Priceable(Protocol):
    @property
    def notional(self) -> float: ...
    ticker: str

class Trade:
    def __init__(self, ticker: str, quantity: int, price: float) -> None:
        self.ticker   = ticker
        self.quantity = quantity
        self.price    = price

    @property
    def notional(self) -> float:
        return self.quantity * self.price

class Bond:
    def __init__(self, ticker: str, face_value: float, coupon_rate: float) -> None:
        self.ticker      = ticker
        self.face_value  = face_value
        self.coupon_rate = coupon_rate

    @property
    def notional(self) -> float:
        return self.face_value

def total_notional(instruments: list[Priceable]) -> float:
    return sum(i.notional for i in instruments)

def largest(instruments: list[Priceable]) -> Priceable:
    return max(instruments, key=lambda i: i.notional)

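# Annotated as list[Priceable] on purpose: list is invariant, so a checker
# rejects a list[Trade] where list[Priceable] is expected.
trades: list[Priceable] = [Trade("AAPL", 100, 182.30), Trade("MSFT", 50, 415.10)]
bonds:  list[Priceable] = [Bond("US10Y", 100_000, 0.045), Bond("DE5Y", 50_000, 0.032)]
mixed:  list[Priceable] = trades + bonds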

print(f"Trade total notional: {total_notional(trades):,.2f}")
print(f"Bond total notional:  {total_notional(bonds):,.2f}")
print(f"Largest instrument:   {largest(mixed).ticker}")

# runtime_checkable enables isinstance checks
print(f"Trade is Priceable: {isinstance(trades[0], Priceable)}")
print(f"Bond is Priceable:  {isinstance(bonds[0], Priceable)}")


Trade total notional: 38,985.00
Bond total notional:  150,000.00
Largest instrument:   US10Y
Trade is Priceable: True
Bond is Priceable:  True


Trade and Bond never mention Priceable. They satisfy the protocol because they have a notional property and a ticker attribute — that is all the protocol requires. A type checker accepts both as Priceable without any explicit declaration.

Protocol decouples the interface definition from the implementing classes. The analytics library defines Priceable without any knowledge of Trade, Bond, or future instrument types. Third-party classes that happen to have a notional property work automatically — no modification required. Compared to an abstract base class, Protocol requires no change to existing code.

Note: @runtime_checkable enables isinstance checks against the Protocol. Without it, isinstance(obj, Priceable) raises TypeError. The check only verifies that the required methods exist as attributes — it does not verify their signatures.
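
The limitation in the note can be demonstrated with two deliberately wrong classes. Both class names are hypothetical and exist only for this sketch.

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class Priceable(Protocol):
    ticker: str

    @property
    def notional(self) -> float: ...

class Mislabeled:
    # Right attribute names, wrong types: a static checker rejects this class.
    ticker = 42
    notional = "not a number"

class NoNotional:
    ticker = "XYZ"

# isinstance only looks for the attribute names, so the wrong types slip through.
print(isinstance(Mislabeled(), Priceable))   # True
print(isinstance(NoNotional(), Priceable))   # False: notional is missing
```

For this reason, isinstance against a Protocol is best treated as a coarse presence check, with the static checker doing the real verification.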


TypedDict for Dictionary Shapes

Some APIs return dictionaries with a fixed, known shape — a JSON response, a database row, a configuration file. TypedDict attaches type information to a specific dict shape, giving type checkers and IDEs full visibility into which keys exist and what types their values have.

Replace the entire content of trades.py with the following:

# NotRequired is in typing from Python 3.11; on older versions import it from typing_extensions
from typing import TypedDict, NotRequired

class TradeRecord(TypedDict):
    ticker:   str
    quantity: int
    price:    float
    currency: str

class EnrichedRecord(TypedDict):
    ticker:        str
    quantity:      int
    price:         float
    currency:      str
    notional:      float
    notional_usd:  float
    sector:        NotRequired[str]   # field may be absent

FX_RATES: dict[str, float] = {"USD": 1.0, "EUR": 1.08, "GBP": 1.26}

def enrich(record: TradeRecord) -> EnrichedRecord:
    notional = record["quantity"] * record["price"]
    rate = FX_RATES.get(record["currency"], 1.0)
    enriched: EnrichedRecord = {
        "ticker":       record["ticker"],
        "quantity":     record["quantity"],
        "price":        record["price"],
        "currency":     record["currency"],
        "notional":     round(notional, 2),
        "notional_usd": round(notional * rate, 2),
    }
    return enriched

def format_enriched(record: EnrichedRecord) -> str:
    sector = record.get("sector", "Unclassified")
    return (f"{record['ticker']} [{sector}] "
            f"qty={record['quantity']} "
            f"notional_usd={record['notional_usd']:,.2f}")

raw: TradeRecord = {"ticker": "AAPL", "quantity": 100, "price": 182.30, "currency": "USD"}
enriched = enrich(raw)
print(format_enriched(enriched))

enriched["sector"] = "Technology"
print(format_enriched(enriched))


AAPL [Unclassified] qty=100 notional_usd=18,230.00
AAPL [Technology] qty=100 notional_usd=18,230.00


TypedDict makes the dict's shape visible to the type checker. Accessing record["tickr"] (typo) is a type error. Adding record["extra_field"] = "value" to a TradeRecord is a type error. NotRequired[str] marks sector as an optional key that may or may not be present. Checkers differ in how strictly they treat such keys: some (pyright, for example) report direct indexing of a key that has not been checked for presence, while others may accept it, so use .get() or an in check regardless of the tool.

TypedDict closes the gap between "this dict has a known shape" and "the type checker can verify it." Without TypedDict, a function returning dict tells callers nothing — every key access is an Any lookup. With TypedDict, IDE autocompletion shows exactly which keys exist, and accessing a non-existent key is caught at analysis time rather than raising KeyError at runtime.
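
At runtime a TypedDict value is an ordinary dict, and the shape is stored as metadata on the class. The sketch below uses the total=False inheritance form, which is the older way to declare optional keys and an alternative to NotRequired; TaggedRecord is a hypothetical name.

```python
from typing import TypedDict

class TradeRecord(TypedDict):
    ticker:   str
    quantity: int

class TaggedRecord(TradeRecord, total=False):
    # Keys declared in a total=False class are optional.
    sector: str

record: TaggedRecord = {"ticker": "AAPL", "quantity": 100}

print(type(record) is dict)                    # True: no wrapper object at runtime
print(sorted(TaggedRecord.__required_keys__))  # ['quantity', 'ticker']
print(sorted(TaggedRecord.__optional_keys__))  # ['sector']

# An "in" check guards access to the optional key.
label = record["sector"] if "sector" in record else "Unclassified"
print(label)
```

Because no validation happens at runtime, data arriving from JSON or a database still needs an explicit check before it is trusted to match the declared shape.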


Summary

The trade data library built in this tutorial covers the core vocabulary of Python's type annotation system, applied to a realistic financial data processing context:

  • Function parameter and return annotations are machine-readable contracts that feed static type checkers and IDE autocompletion without changing runtime behavior; python3 -m mypy validates them before deployment
  • Optional[T] (equivalently T | None) makes the possibility of absent values explicit in the signature, forcing callers to handle the None case rather than guessing whether a guard is needed
  • list[T], dict[K, V], and tuple[T1, T2, ...] specify element types for generic containers; a list[Trade] conveys far more than a bare list and enables the type checker to verify element-level operations
  • TypeVar expresses that the output type of a generic function matches its input type — something Union cannot express — enabling full type inference on return values of functions like first_n and top_by
  • Protocol defines structural interfaces that any class with the required methods satisfies without inheritance, preserving duck typing while adding static verification; @runtime_checkable enables isinstance checks at the cost of attribute-only verification
  • TypedDict attaches type information to dictionary shapes for APIs, JSON responses, and database rows; NotRequired[T] marks fields that may be absent, which callers should read with .get() or guard with a presence check
