
Python Exception Handling Tutorial: Production Pipeline Patterns

Dima May 13, 2026

Introduction

Every program that processes real data encounters failures. Files go missing, records arrive malformed, external services time out, and type conversions fail on values nobody anticipated. How a program responds to those failures determines whether it is a debugging nightmare or a system you can trust in production.

Python's exception handling machinery is rich enough to cover every case — but it is easy to misuse in ways that silently discard errors, obscure root causes, and make problems harder to diagnose than if the program had simply crashed. This tutorial walks through a realistic data pipeline and shows, step by step, how to move from code that swallows every failure to code that handles errors precisely, preserves context, logs correctly, and cleans up reliably.


Background

Python exceptions form a class hierarchy. Every exception is an instance of BaseException. The subtree you almost always care about is Exception and its descendants — ValueError, FileNotFoundError, KeyError, and so on. A few critical exceptions — KeyboardInterrupt, SystemExit, GeneratorExit — live directly under BaseException and are explicitly excluded from except Exception for good reason: they signal conditions that almost never represent recoverable errors.

Two rules govern almost every exception handling decision:

  • Catch the most specific exception that covers your case. except ValueError beats except Exception, which beats bare except.
  • Only catch what you know how to handle. If you cannot do something useful with an exception — log it, convert it, retry the operation — let it propagate to a layer that can.
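
Both the hierarchy and the carve-out are easy to confirm with issubclass:

```python
# Exception and its descendants are what `except Exception` catches
assert issubclass(ValueError, Exception)
assert issubclass(FileNotFoundError, OSError)   # OSError is itself an Exception
assert issubclass(KeyError, LookupError)        # LookupError is the base for lookup failures

# Critical signals live under BaseException but outside Exception
assert not issubclass(KeyboardInterrupt, Exception)
assert issubclass(KeyboardInterrupt, BaseException)
assert not issubclass(SystemExit, Exception)
```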

Practical Scenario

A financial data team runs a nightly pipeline that reads transaction records from a CSV export, validates each record, enriches it with a currency conversion, and writes the results for downstream reporting. The pipeline processes thousands of records. Some records are always malformed. Some currencies are unrecognised. Occasionally the input file itself is missing or unreadable.

The team needs the pipeline to process every valid record without stopping, report exactly which records failed and why, and leave no ambiguity about whether a failure was a data problem or a code bug.


The Problem

Create a new file:

touch transaction_processor.py

Run it using:

python3 transaction_processor.py

Start with this first version, which leans on bare except clauses:

import csv

EXCHANGE_RATES = {"USD": 1.0, "EUR": 1.08, "GBP": 1.27}

with open("transactions.csv", "w") as f:
    f.write(
        "id,date,amount,currency,category\n"
        "T001,2024-01-15,250.00,USD,electronics\n"
        "T002,2024-01-16,not_a_number,EUR,clothing\n"
        "T003,2024-01-17,100.00,GBP,food\n"
        "T004,2024-01-18,,USD,travel\n"
        "T005,2024-01-19,312.00,XYZ,electronics\n"
        "T006,2024-01-20,200.00,EUR,clothing\n"
    )

def process_record(row):
    amount = float(row["amount"])
    rate   = EXCHANGE_RATES[row["currency"]]
    return {**row, "usd_amount": round(amount * rate, 2)}

def run_pipeline(path):
    try:
        f = open(path)
        reader = csv.DictReader(f)
        for row in reader:
            try:
                result = process_record(row)
                print(f"  processed {result['id']}: ${result['usd_amount']}")
            except:
                print(f"  [{row.get('id', '?')}] something went wrong")
        f.close()
    except:
        print("pipeline could not run")

run_pipeline("transactions.csv")
run_pipeline("missing.csv")


  processed T001: $250.0
  [T002] something went wrong
  processed T003: $127.0
  [T004] something went wrong
  [T005] something went wrong
  processed T006: $216.0
pipeline could not run


The pipeline runs and produces numbers for three of the six records. The other three, and the missing-file call, each produce a one-line message that tells you nothing: not which rule was violated, not which field was bad, not whether the failure was a data problem or a code bug. If process_record contained a programming error — a mistyped key, a missing import, a wrong argument type — the bare except would swallow it identically, and you would never know.

Bare except also catches KeyboardInterrupt and SystemExit. A user pressing Ctrl+C to stop the pipeline would see "something went wrong" instead of the process actually stopping.
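
The distinction is easy to verify in a few lines: except Exception lets KeyboardInterrupt propagate, where a bare except would have trapped it.

```python
# except Exception does NOT catch KeyboardInterrupt; it propagates past the handler
escaped = False
try:
    try:
        raise KeyboardInterrupt  # simulates the user pressing Ctrl+C
    except Exception:
        pass  # never runs: KeyboardInterrupt is not an Exception subclass
except KeyboardInterrupt:
    escaped = True  # a bare except in the inner block WOULD have trapped it
print(escaped)  # True
```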


Specific Exception Types

The first fix is naming exactly what you expect to go wrong. Replace both functions:

def process_record(row):
    try:
        amount = float(row["amount"])
    except ValueError:
        raise ValueError(f"invalid amount {row['amount']!r} — must be a number")

    if row["currency"] not in EXCHANGE_RATES:
        raise LookupError(f"unrecognised currency {row['currency']!r}")

    rate = EXCHANGE_RATES[row["currency"]]
    return {**row, "usd_amount": round(amount * rate, 2)}

def run_pipeline(path):
    try:
        f = open(path)
    except FileNotFoundError:
        print(f"file not found: {path}")
        return
    except PermissionError:
        print(f"no read permission: {path}")
        return

    reader = csv.DictReader(f)
    for row in reader:
        try:
            result = process_record(row)
            print(f"  processed {result['id']}: ${result['usd_amount']}")
        except ValueError as e:
            print(f"  [{row['id']}] validation error: {e}")
        except LookupError as e:
            print(f"  [{row['id']}] enrichment error: {e}")
    f.close()

run_pipeline("transactions.csv")
run_pipeline("missing.csv")


  processed T001: $250.0
  [T002] validation error: invalid amount 'not_a_number' — must be a number
  processed T003: $127.0
  [T004] validation error: invalid amount '' — must be a number
  [T005] enrichment error: unrecognised currency 'XYZ'
  processed T006: $216.0
file not found: missing.csv


Every failure now says what went wrong, where, and why. Programming bugs — a NameError, a TypeError from a wrong argument, a missing key that is not amount or currency — are no longer caught here, so they surface immediately as a traceback rather than being disguised as a data problem.

Why this is better

Naming the exception type is a commitment: you are saying "I know this specific failure can happen here, and I know what to do about it." Any other exception — one you did not anticipate — propagates up and asks for attention, which is exactly the right behaviour.

Note: FileNotFoundError and PermissionError are both subclasses of OSError. If you want to handle all OS-level file errors the same way, except OSError covers both. If the response to each is different — as it usually is — separate clauses are clearer.
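
As a sketch of the umbrella form, applied to a hypothetical optional config file where every OS-level read failure gets the same fallback:

```python
def read_optional_config(path):
    # OSError is the umbrella: FileNotFoundError, PermissionError,
    # IsADirectoryError and friends are all subclasses of it
    try:
        with open(path) as f:
            return f.read()
    except OSError as e:
        print(f"using defaults, could not read {path}: {e}")
        return None
```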


else and finally

The try statement has two optional clauses that most Python developers underuse. else runs only when no exception was raised in the try block. finally runs always — whether an exception occurred, whether return was hit, or whether the block completed normally.

Replace run_pipeline with the following:

def run_pipeline(path):
    processed = 0
    failed    = 0

    try:
        f = open(path)
    except FileNotFoundError:
        print(f"file not found: {path}")
        return

    try:
        reader = csv.DictReader(f)
        for row in reader:
            try:
                result = process_record(row)
            except ValueError as e:
                print(f"  [{row['id']}] validation error: {e}")
                failed += 1
            except LookupError as e:
                print(f"  [{row['id']}] enrichment error: {e}")
                failed += 1
            else:
                print(f"  processed {result['id']}: ${result['usd_amount']}")
                processed += 1
    finally:
        f.close()
        print(f"  summary: {processed} processed, {failed} failed")

run_pipeline("transactions.csv")


  processed T001: $250.0
  [T002] validation error: invalid amount 'not_a_number' — must be a number
  processed T003: $127.0
  [T004] validation error: invalid amount '' — must be a number
  [T005] enrichment error: unrecognised currency 'XYZ'
  processed T006: $216.0
  summary: 3 processed, 3 failed


Why this is better

The else clause separates the success path — incrementing processed and printing the result — from the error paths. Without else, you would increment processed inside the try block, which means a partially-completed process_record that raises midway through could increment the counter before failing. The else clause only executes if the try body ran to completion without any exception at all.

The finally clause ensures f.close() is called whether the loop finishes normally, hits an unexpected exception, or encounters a return or break. Without it, an unexpected exception partway through the file would leave the file handle open.

Note: Using with open(path) as f: is the idiomatic alternative to the manual open / finally / close pattern shown above. Context managers call __exit__ — which closes the file — in the equivalent of a finally block. The explicit finally here is shown to make the guarantee visible before introducing the context manager form.
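
The guarantee is easy to observe in isolation: the handle is closed the moment the with block exits, whether the body completed or raised.

```python
import os
import tempfile

# a throwaway file just for the demonstration
fd, path = tempfile.mkstemp()
os.close(fd)

# normal exit: __exit__ closes the handle
with open(path) as f:
    f.read()
assert f.closed

# exit via exception: the handle is closed before the exception propagates
try:
    with open(path) as f:
        raise RuntimeError("simulated failure mid-read")
except RuntimeError:
    pass
assert f.closed

os.remove(path)
```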


Custom Exception Hierarchy

Using built-in exceptions like ValueError and LookupError works, but it conflates your domain errors with Python's own signals. A ValueError from float() and a ValueError you raised yourself look identical to callers. Custom exceptions give your pipeline its own vocabulary.

Add the following class definitions above process_record. The hierarchy is important: a single base class lets callers catch all pipeline errors with one clause while still allowing specific handling when needed:

class PipelineError(Exception):
    pass

class ValidationError(PipelineError):
    def __init__(self, record_id, field, value, reason):
        self.record_id = record_id
        self.field     = field
        super().__init__(f"[{record_id}] {field}={value!r} — {reason}")

class EnrichmentError(PipelineError):
    def __init__(self, record_id, detail):
        self.record_id = record_id
        super().__init__(f"[{record_id}] enrichment failed — {detail}")

Now replace process_record to raise these instead of built-ins:

def process_record(row):
    record_id = row["id"]

    try:
        amount = float(row["amount"])
    except ValueError:
        raise ValidationError(record_id, "amount", row["amount"], "must be a number")

    if row["currency"] not in EXCHANGE_RATES:
        raise EnrichmentError(record_id, f"unrecognised currency {row['currency']!r}")

    rate = EXCHANGE_RATES[row["currency"]]
    return {**row, "usd_amount": round(amount * rate, 2)}

And replace the inner try/except in run_pipeline to match:

            try:
                result = process_record(row)
            except ValidationError as e:
                print(f"  skipped: {e}")
                failed += 1
            except EnrichmentError as e:
                print(f"  failed:  {e}")
                failed += 1


  processed T001: $250.0
  skipped: [T002] amount='not_a_number' — must be a number
  processed T003: $127.0
  skipped: [T004] amount='' — must be a number
  failed:  [T005] enrichment failed — unrecognised currency 'XYZ'
  processed T006: $216.0
  summary: 3 processed, 3 failed


Why this is better

A caller that handles PipelineError gets all pipeline failures in one clause. A caller that needs to distinguish validation failures from enrichment failures catches them separately. No caller will ever accidentally suppress a ValueError that originated in an unrelated library — it would have to explicitly name ValidationError or PipelineError. The custom exceptions also carry structured fields like record_id and field that a generic ValueError cannot.
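
That caller-side flexibility can be sketched in isolation (the classes are repeated here so the snippet runs on its own):

```python
class PipelineError(Exception):
    pass

class ValidationError(PipelineError):
    def __init__(self, record_id, field, value, reason):
        self.record_id = record_id
        self.field = field
        super().__init__(f"[{record_id}] {field}={value!r} — {reason}")

try:
    raise ValidationError("T002", "amount", "not_a_number", "must be a number")
except PipelineError as e:        # the base class catches every subclass
    assert e.record_id == "T002"  # structured fields survive the catch
    assert e.field == "amount"
```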


Exception Chaining

In the previous section, process_record raises a ValidationError when float() raises a ValueError. The original ValueError — including the message Python generated and the exact location in the call stack — is silently discarded. A developer debugging a production failure now has half the information they need.

Exception chaining with raise X from Y preserves the original exception as the cause of the new one, attaching it to the traceback:

Replace the try/except block inside process_record:

    try:
        amount = float(row["amount"])
    except ValueError as original:
        raise ValidationError(
            record_id, "amount", row["amount"], "must be a number"
        ) from original

To see the difference, temporarily add a call that triggers the error and let it propagate uncaught. Replace the run_pipeline("transactions.csv") call at the bottom with:

process_record({"id": "T002", "amount": "not_a_number", "currency": "USD"})


Traceback (most recent call last):
  File "transaction_processor.py", line 14, in process_record
    amount = float(row["amount"])
ValueError: could not convert string to float: 'not_a_number'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "transaction_processor.py", line 42, in <module>
    process_record({"id": "T002", "amount": "not_a_number", "currency": "USD"})
  File "transaction_processor.py", line 16, in process_record
    raise ValidationError(...) from original
ValidationError: [T002] amount='not_a_number' — must be a number


The traceback shows the original ValueError from Python's own float() and the ValidationError your code raised, linked by "The above exception was the direct cause of the following exception." Both the domain context and the low-level root cause are preserved.
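
The chain is also available programmatically: the new exception carries the original on its __cause__ attribute, which error-reporting tools can walk. A generic illustration:

```python
caught = None
try:
    try:
        float("not_a_number")
    except ValueError as original:
        raise RuntimeError("validation failed") from original
except RuntimeError as e:
    caught = e

# the root cause is attached to the new exception, not discarded
assert isinstance(caught.__cause__, ValueError)
assert "not_a_number" in str(caught.__cause__)
```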

Restore the bottom of the file:

run_pipeline("transactions.csv")

Why this is better

raise X from Y is the difference between a traceback that says "validation failed on amount" and one that says "validation failed on amount because Python's float() received the string 'not_a_number' at line 14." When a production failure wakes someone up at midnight, every saved diagnostic step matters. Note that plain raise X inside an except block still records the original as implicit context ("During handling of the above exception, another exception occurred"); from Y upgrades that to an explicit, intentional cause. Suppressing the chain entirely is only correct in the rare case where the original exception genuinely contains no useful information.

Note: To explicitly signal that the new exception is unrelated to the caught one, use raise X from None. This suppresses the "the above exception was the direct cause" message entirely. Use it only when showing the original exception would mislead.
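
A sketch, using a hypothetical ConfigError, of the rare case where from None earns its keep: the low-level int() message adds nothing the new message does not already say.

```python
class ConfigError(Exception):
    pass

def parse_port(value):
    try:
        return int(value)
    except ValueError:
        # the int() detail would only repeat what this message says
        raise ConfigError(f"port must be an integer, got {value!r}") from None

try:
    parse_port("eighty")
except ConfigError as e:
    assert e.__cause__ is None             # no "direct cause" line in the traceback
    assert e.__suppress_context__ is True  # the implicit context is hidden too
```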


Logging Exceptions

print statements disappear into the void in production. They go to stdout, mix with program output, and carry no timestamp, severity level, or module context. Python's logging module solves all of this — and has a method that most developers overlook: logger.exception().

Add the following imports and configuration at the top of the file, below the import csv line:

import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s  %(levelname)-8s  %(message)s",
    datefmt="%H:%M:%S",
)
logger = logging.getLogger(__name__)

Now replace every print call in run_pipeline with the appropriate logging call:

def run_pipeline(path):
    processed = 0
    failed    = 0

    try:
        f = open(path)
    except FileNotFoundError:
        logger.error("file not found: %s", path)
        return

    try:
        reader = csv.DictReader(f)
        for row in reader:
            try:
                result = process_record(row)
            except ValidationError as e:
                logger.warning("skipped: %s", e)
                failed += 1
            except EnrichmentError as e:
                logger.error("enrichment failed: %s", e)
                failed += 1
            except PipelineError:
                logger.exception("unexpected pipeline error on row: %s", row.get("id"))
                failed += 1
            else:
                logger.info("processed %s: $%.2f", result["id"], result["usd_amount"])
                processed += 1
    finally:
        f.close()
        logger.info("summary: %d processed, %d failed", processed, failed)

run_pipeline("transactions.csv")


10:42:17  INFO      processed T001: $250.00
10:42:17  WARNING   skipped: [T002] amount='not_a_number' — must be a number
10:42:17  INFO      processed T003: $127.00
10:42:17  WARNING   skipped: [T004] amount='' — must be a number
10:42:17  ERROR     enrichment failed: [T005] enrichment failed — unrecognised currency 'XYZ'
10:42:17  INFO      processed T006: $216.00
10:42:17  INFO      summary: 3 processed, 3 failed


Why this is better

logger.exception() does something no other log method does: it automatically appends the full current exception traceback to the log message, without you passing the exception object explicitly. This means that if an unanticipated PipelineError subclass appears at runtime — one added by a future developer — its complete traceback is captured automatically. logger.warning() and logger.error() log the message only; use them for expected failures where the message already contains everything you need. Use logger.exception() for unexpected failures where you need the stack.

Note: logger.exception() should only be called inside an except block. Outside one, it still appends traceback information, but there is no active exception, so it will append NoneType: None — which is noisy and misleading.
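
The difference is easy to demonstrate by pointing a logger at an in-memory stream (logger.error(msg, exc_info=True) is the explicit spelling of the same behaviour):

```python
import io
import logging

buf = io.StringIO()
demo_logger = logging.getLogger("exception_demo")
demo_logger.addHandler(logging.StreamHandler(buf))

try:
    1 / 0
except ZeroDivisionError:
    demo_logger.exception("division failed")  # message plus full traceback

output = buf.getvalue()
assert "division failed" in output
assert "Traceback (most recent call last):" in output
assert "ZeroDivisionError" in output
```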


contextlib: Suppress and Context Managers

Two tools from the contextlib module directly address common exception handling patterns.

contextlib.suppress is the honest version of except: pass. When you genuinely want to ignore a specific exception type — not because you forgot to handle it, but because its occurrence is a defined, expected, no-action case — suppress makes that decision explicit and visible.

Add this import at the top:

from contextlib import contextmanager, suppress
import os

Suppose the pipeline writes an audit file and should silently skip cleanup if the file was never created. Using suppress instead of except/pass:

def cleanup_audit_file(path):
    with suppress(FileNotFoundError):
        os.remove(path)
    logger.info("cleanup complete")

A reader immediately understands: the absence of this file is expected and fine. Contrast this with:

try:
    os.remove(path)
except:
    pass  # Is this intentional? A bug being hidden? Nobody knows.


@contextmanager lets you write a context manager using a generator. The yield is where the with block's body runs. Code before yield runs on entry; code after yield runs on exit; a try/finally around the yield handles the case where the body raises an exception.

Add the following pipeline context manager:

@contextmanager
def managed_pipeline(name):
    logger.info("pipeline starting: %s", name)
    stats = {"processed": 0, "failed": 0}
    try:
        yield stats
    except FileNotFoundError as e:
        logger.error("pipeline aborted — input file missing: %s", e)
    except PipelineError:
        logger.exception("pipeline aborted — unrecoverable error")
        raise
    finally:
        logger.info(
            "pipeline finished: %s — %d processed, %d failed",
            name, stats["processed"], stats["failed"],
        )

Now rewrite run_pipeline to use it:

def run_pipeline(path):
    with managed_pipeline(path) as stats:
        with open(path) as f:
            reader = csv.DictReader(f)
            for row in reader:
                try:
                    result = process_record(row)
                except ValidationError as e:
                    logger.warning("skipped: %s", e)
                    stats["failed"] += 1
                except EnrichmentError as e:
                    logger.error("enrichment failed: %s", e)
                    stats["failed"] += 1
                else:
                    logger.info("processed %s: $%.2f", result["id"], result["usd_amount"])
                    stats["processed"] += 1

run_pipeline("transactions.csv")
run_pipeline("missing.csv")


10:42:17  INFO      pipeline starting: transactions.csv
10:42:17  INFO      processed T001: $250.00
10:42:17  WARNING   skipped: [T002] amount='not_a_number' — must be a number
10:42:17  INFO      processed T003: $127.00
10:42:17  WARNING   skipped: [T004] amount='' — must be a number
10:42:17  ERROR     enrichment failed: [T005] enrichment failed — unrecognised currency 'XYZ'
10:42:17  INFO      processed T006: $216.00
10:42:17  INFO      pipeline finished: transactions.csv — 3 processed, 3 failed
10:42:17  INFO      pipeline starting: missing.csv
10:42:17  ERROR     pipeline aborted — input file missing: [Errno 2] No such file or directory: 'missing.csv'
10:42:17  INFO      pipeline finished: missing.csv — 0 processed, 0 failed


Why this is better

managed_pipeline centralises every cross-cutting concern — startup logging, summary reporting, fatal error handling — in one place. Individual call sites only deal with row-level decisions. When the pipeline grows to read from multiple sources, each call to run_pipeline gets lifecycle logging, error boundaries, and a summary automatically. The finally block in managed_pipeline runs even when FileNotFoundError is handled, so the "pipeline finished" log line always appears — including for aborted runs where you want to know that a pipeline started and produced zero records.


Propagation: Knowing When Not to Catch

Every exception handler is a decision: "I know what this means and I know what to do." The corollary is important — if you do not know what to do, do not catch it.

The three correct responses to an exception are:

Handle it. You can recover, log and continue, return a default, or translate it into a domain error. The exception does not need to travel further up the stack.

try:
    rate = EXCHANGE_RATES[currency]
except KeyError:
    rate = 1.0  # default to USD if currency unknown — acceptable for this report

Translate and re-raise. You can add context that the current layer understands, then let the enriched exception continue upward.

try:
    amount = float(row["amount"])
except ValueError as e:
    raise ValidationError(record_id, "amount", row["amount"], "must be a number") from e

Re-raise unchanged. You need to do something — close a resource, log a warning — but you do not know how to handle the exception itself. Use raise with no argument to re-raise the exact original exception, preserving its type, message, and traceback.

try:
    result = call_external_service(payload)
except ExternalServiceError:
    metrics.increment("external_service.errors")
    raise  # let the caller decide whether this is fatal


The wrong response is catching an exception and doing nothing useful with it — no re-raise, no meaningful log, no recovery. Replace the except clause in managed_pipeline with the following to see what happens when a fatal error is silently swallowed instead of re-raised:

    except PipelineError:
        logger.exception("pipeline aborted — unrecoverable error")
        # raise is missing here — the exception disappears

Then introduce a fatal error by adding a bad row to the sample data and raising a PipelineError directly inside process_record:

    if row.get("id") == "T003":
        raise PipelineError("database connection lost")

Run the pipeline and observe: the log shows the error, but execution continues on T004, T005, T006 — processing records against a database that has declared itself unavailable. The pipeline produces partial output that is filed as complete. Without raise, a fatal signal becomes invisible to every layer above.

Restore process_record to its previous state and restore the raise in managed_pipeline before continuing.

Why this matters

Silent swallowing is the most dangerous form of exception misuse because it produces no visible signal. The program appears to run successfully, and the corruption or data loss it causes is discovered later — sometimes much later — when the symptoms are far removed from the cause. If you do not know what to do with an exception, re-raise it.


Summary

Python's exception handling model is complete enough to handle every failure scenario a real pipeline encounters. In this tutorial we transformed a pipeline that hid every error into one that handles failures precisely and transparently:

  • Bare except catches KeyboardInterrupt, SystemExit, and programming bugs alongside data errors — always name the exception type you expect
  • except FileNotFoundError and except PermissionError as separate clauses produce different responses to different OS failures; except OSError covers both when the response is the same
  • else runs only when no exception occurred — use it for code that should only execute on success, not in the try block itself where a partial execution could reach it before raising
  • finally runs unconditionally — it is the right place for cleanup that must happen regardless of outcome
  • A custom exception hierarchy gives the pipeline its own vocabulary; callers can catch all pipeline errors with the base class or specific subclasses for finer control
  • raise X from Y attaches the original exception as the cause, preserving both the domain context and the root cause in the traceback
  • logger.exception() captures the full traceback automatically — use it for unexpected failures; use logger.warning() and logger.error() for expected ones where the message contains everything needed
  • contextlib.suppress makes intentional exception silencing explicit and honest; bare except: pass is never the right form of the same idea
  • @contextmanager with a try/yield/finally concentrates lifecycle concerns — startup, cleanup, summary — in one reusable place
  • Re-raise with bare raise when you need to perform a side effect but cannot handle the exception; never swallow an exception you do not know how to recover from
