What is Memoization in Python and how to use it correctly


Introduction

Memoization is a performance technique that shows up in real systems more often than people realize. Any time a system repeatedly computes the same expensive result—risk models, simulations, scoring engines, route planners—memoization can turn a slow, CPU-bound bottleneck into a predictable, fast operation.

In Python, memoization is easy to get wrong. It’s trivial to cache something; it’s much harder to do it safely, without corrupting results, leaking memory, or hiding stale data. This tutorial walks through memoization techniques that are appropriate for advanced, long-running systems, not just short scripts.


Background

Memoization means remembering the result of a function call so the next time the same inputs appear, the result is reused instead of recomputed.

For this to be correct, two conditions must be true:

  • The function must be deterministic: same inputs, same output.
  • The inputs used as cache keys must be immutable and fully describe the computation.

A helpful mental model is to think of memoization as slowly building a lookup table at runtime. Every unique input combination adds a new row. The core design questions are how big that table can grow, when entries should be removed, and when cached values are no longer valid.
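To make the lookup-table model concrete, here is a minimal sketch (with a deliberately trivial square function, not the risk engine itself):

lookup_table = {}

def square(n: int) -> int:
    if n not in lookup_table:
        lookup_table[n] = n * n  # each new input adds a row to the table
    return lookup_table[n]

square(3)
square(4)
square(3)  # repeated input: no new row, no recomputation
print(lookup_table)  # {3: 9, 4: 16}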


Practical Scenario

Consider a backend service used by a trading platform to compute portfolio risk. The risk calculation depends on a portfolio identifier and a snapshot of market data. The calculation itself is expensive, involving repeated numerical transformations.

During trading hours, the same portfolios are queried repeatedly by different strategies. If the system recomputes risk every time, CPU usage spikes, latency increases, and the service becomes unstable under load. If caching is done incorrectly, the system may return outdated or incorrect risk values, which is even worse.


The Problem

This first version does no caching at all. It always recomputes risk from scratch.

Create a new file:

touch risk_engine.py

Add the following code:

import time

def calculate_portfolio_risk(portfolio_id: str, market_snapshot: tuple) -> float:
    total_risk = 0.0

    for price, volatility in market_snapshot:
        total_risk += price * volatility

    # Simulated heavy numerical work
    for _ in range(1_000_000):
        total_risk *= 1.0000001

    return total_risk

if __name__ == "__main__":
    snapshot = (
        (101.5, 0.02),
        (87.3, 0.03),
        (245.8, 0.015),
    )
    start = time.time()
    print(calculate_portfolio_risk("PORT-42", snapshot))
    print(calculate_portfolio_risk("PORT-42", snapshot))
    end = time.time()
    print("Execution time : ", end-start)


Run it using:

python3 risk_engine.py

When the script runs, both calls return the same numeric value, yet each performs the full computation; the second call gains nothing from the first.
The results should look something like this:

9.212704727552312
9.212704727552312
Execution time :  0.1227576732635498


In isolation this looks harmless. In a service handling thousands of identical requests, it becomes a performance sink that burns CPU for no reason. We deliberately timed the execution so that we can compare this baseline against later runs as we update the code.


Manual Memoization with an Explicit Cache

The simplest improvement is to store results in a dictionary keyed by the function inputs.

Replace the existing calculate_portfolio_risk function with the code below:

_risk_cache = {}

def calculate_portfolio_risk(portfolio_id: str, market_snapshot: tuple) -> float:
    # The key must capture every input that affects the result
    cache_key = (portfolio_id, market_snapshot)

    if cache_key in _risk_cache:
        return _risk_cache[cache_key]

    total_risk = 0.0
    for price, volatility in market_snapshot:
        total_risk += price * volatility

    # Simulated heavy numerical work
    for _ in range(1_000_000):
        total_risk *= 1.0000001

    _risk_cache[cache_key] = total_risk
    return total_risk


With these changes the result will be similar to the one below:

9.212704727552312
9.212704727552312
Execution time :  0.07341837882995605


The cache key is a tuple of immutable values. This matters because mutable keys can change after insertion, corrupting the cache. The cache lives at module level, so results persist across calls.
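Python enforces this directly: a mutable snapshot such as a list is unhashable and cannot be part of a dictionary key. A quick illustration (the names are hypothetical):

cache = {}
snapshot_list = [(101.5, 0.02), (87.3, 0.03)]

try:
    cache[("PORT-42", snapshot_list)] = 1.0
except TypeError as exc:
    print(exc)  # unhashable type: 'list'

# Normalizing to a tuple of tuples produces an immutable, hashable key
cache[("PORT-42", tuple(snapshot_list))] = 1.0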

Why this approach is better

Repeated calls with identical inputs become fast dictionary lookups. The expensive computation runs only once, making this run roughly 1.7x faster than the baseline: 0.1227576732635498 initially versus 0.07341837882995605 with simple memoization.

Caching only by portfolio_id would silently return incorrect results if market data changes. Memoization only works when the cache key fully represents the computation.
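To make that failure mode concrete, here is a deliberately buggy sketch (hypothetical, for illustration only) whose cache key omits the snapshot:

_bad_cache = {}

def risky_calculate(portfolio_id: str, market_snapshot: tuple) -> float:
    # BUG: the key ignores market_snapshot entirely
    if portfolio_id in _bad_cache:
        return _bad_cache[portfolio_id]

    result = sum(price * volatility for price, volatility in market_snapshot)
    _bad_cache[portfolio_id] = result
    return result

old_snapshot = ((101.5, 0.02),)
new_snapshot = ((150.0, 0.09),)  # the market has moved

print(risky_calculate("PORT-42", old_snapshot))  # computed: 2.03
print(risky_calculate("PORT-42", new_snapshot))  # stale: still 2.03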

Execution is unchanged from the original example; the file location and command remain the same.


Using a Decorator to Isolate Memoization Logic

Embedding cache logic directly in functions does not scale well. A decorator keeps caching reusable and contained.

from functools import wraps

def memoize(function):
    cache = {}

    @wraps(function)  # preserve the wrapped function's name and docstring
    def wrapper(*args):
        if args in cache:
            return cache[args]

        result = function(*args)
        cache[args] = result
        return result

    return wrapper

@memoize
def calculate_portfolio_risk(portfolio_id: str, market_snapshot: tuple) -> float:
    total_risk = 0.0

    for price, volatility in market_snapshot:
        total_risk += price * volatility

    for _ in range(1_000_000):
        total_risk *= 1.0000001

    return total_risk


Each decorated function owns its own cache. Arguments are used directly as keys, relying on their immutability.
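One limitation to be aware of: wrapper(*args) accepts only positional arguments, so this minimal version does not support keyword arguments. A quick check (the function name is hypothetical):

@memoize
def risk_score(portfolio_id: str, horizon_days: int) -> float:
    return len(portfolio_id) * horizon_days * 0.01

print(risk_score("PORT-42", 10))  # computed, cached under ("PORT-42", 10)
print(risk_score("PORT-42", 10))  # served from the cache

# risk_score("PORT-42", horizon_days=10) would raise TypeError,
# because the wrapper does not accept keyword arguments.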

Why this approach is better

Business logic stays readable. Memoization becomes a deliberate choice rather than tangled logic inside the function.

Note: This approach is still dangerous in one respect: the cache grows forever. In long-running processes with many unique inputs, this causes unbounded memory usage.

Execution remains the same as before.


Bounded Memoization with a Simple LRU Strategy

To keep memory usage predictable, the cache must evict old entries.

def lru_memoize(max_entries: int):
    def decorator(function):
        cache = {}
        usage_order = []  # least recently used at the front, most recent at the back

        def wrapper(*args):
            if args in cache:
                # Mark this key as the most recently used
                usage_order.remove(args)
                usage_order.append(args)
                return cache[args]

            result = function(*args)

            if len(cache) >= max_entries:
                # Evict the least recently used entry
                oldest = usage_order.pop(0)
                del cache[oldest]

            cache[args] = result
            usage_order.append(args)
            return result

        return wrapper

    return decorator

@lru_memoize(max_entries=128)
def calculate_portfolio_risk(portfolio_id: str, market_snapshot: tuple) -> float:
    total_risk = 0.0

    for price, volatility in market_snapshot:
        total_risk += price * volatility

    for _ in range(1_000_000):
        total_risk *= 1.0000001

    return total_risk


The cache size is capped. When it fills up, the least recently used entry is removed. This mirrors real access patterns where recently used portfolios are more likely to be reused.

Why this is better

Memory usage is bounded and predictable, which is essential for services that run continuously.

Note: Naive eviction strategies are error-prone: random or time-only eviction often removes hot entries and keeps cold ones, reducing cache effectiveness and causing performance instability.
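In production code you rarely need to hand-roll this: the standard library's functools.lru_cache implements the same bounded LRU strategy, with hit and miss statistics included. A minimal sketch, reusing the same function body as above:

from functools import lru_cache

@lru_cache(maxsize=128)
def calculate_portfolio_risk(portfolio_id: str, market_snapshot: tuple) -> float:
    total_risk = sum(price * volatility for price, volatility in market_snapshot)

    for _ in range(1_000_000):
        total_risk *= 1.0000001

    return total_risk

# cache_info() reports hits, misses, and the current cache size
print(calculate_portfolio_risk.cache_info())
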

Memoization with Explicit Invalidation

Some systems must drop cached data when external state changes, such as a new market data feed.

def memoize_with_invalidation(function):
    cache = {}

    def wrapper(*args):
        if args in cache:
            return cache[args]

        result = function(*args)
        cache[args] = result
        return result

    def invalidate():
        # Drop every cached result; subsequent calls recompute from scratch
        cache.clear()

    # Attach the invalidation hook as an attribute of the wrapper
    wrapper.invalidate = invalidate
    return wrapper

@memoize_with_invalidation
def calculate_portfolio_risk(portfolio_id: str, market_snapshot: tuple) -> float:
    total_risk = 0.0

    for price, volatility in market_snapshot:
        total_risk += price * volatility

    for _ in range(1_000_000):
        total_risk *= 1.0000001

    return total_risk


The decorated function exposes an invalidate attribute. When new market data arrives, the cache can be explicitly cleared.
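For example, a market-data handler (a hypothetical sketch) could clear the cache whenever a fresh feed arrives:

def on_new_market_feed(feed):
    # Cached risk values are based on old prices; drop them all
    calculate_portfolio_risk.invalidate()
    # ... process the new feed; later calls recompute and repopulate ...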

Why this is better

Correctness always outweighs performance. Explicit invalidation makes data freshness a conscious decision instead of an assumption. Relying on inputs to "eventually change" often leads to stale data persisting longer than intended, producing incorrect results with no obvious error.


Summary

Memoization is not about clever shortcuts. It is about controlling repetition safely.

In real Python systems, effective memoization requires:

  • correct and complete cache keys
  • bounded memory usage
  • explicit invalidation when external state changes

When these constraints are respected, memoization turns expensive computations into reliable, low-latency operations without sacrificing correctness.

