Stress Testing Specification

Pipeline integrity tests at scale (10K–100K counterparties, roughly 35K–350K exposures) across all four framework/permission combinations. These tests verify that the calculator produces correct, complete, and stable output under realistic portfolio sizes.

Test Group: STRESS


Requirements Status

| ID | Requirement | Priority | Status |
| --- | --- | --- | --- |
| NFR-1 | Row count preservation: no silent data loss | P0 | Done |
| NFR-2 | Column completeness: all required output columns present | P0 | Done |
| NFR-3 | Numerical stability: no NaN, Inf, or negative RWA | P0 | Done |
| NFR-4 | Risk weight bounds: SA within [0%, 1250%], IRB non-negative | P0 | Done |
| NFR-5 | Approach routing: correct SA/IRB assignment | P0 | Done |
| NFR-6 | Exposure class coverage: all entity types produce output | P0 | Done |
| NFR-7 | Output floor at scale: PRA transitional floor works at portfolio level | P0 | Done |
| NFR-8 | Error accumulation: bounded error list, no pipeline failure | P1 | Done |
| NFR-9 | Summary consistency: aggregates match detail rows | P1 | Done |
| NFR-10 | EAD consistency: non-negative, non-null, positive sum | P1 | Done |
| NFR-11 | Determinism: identical inputs produce identical outputs | P0 | Done |
| NFR-12 | Framework differentiation: B31 produces different RWA from CRR | P1 | Done |
| NFR-13 | Large-scale (100K): row preservation, stability, memory bound | P2 | Done |
| NFR-14 | Exposure reference uniqueness: no duplicate references in output | P1 | Done |

Test Infrastructure

Synthetic Data Generation

Stress tests use fully synthetic datasets generated by tests/acceptance/stress/conftest.py:

| Dataset | Counterparties | Loans | Contingents | Seed |
| --- | --- | --- | --- | --- |
| stress_dataset_10k | 10,000 | ~30,000 | ~5,000 | 42 |
| stress_dataset_100k | 100,000 | ~300,000 | ~50,000 | 99 |

Entity type distribution: corporate (35%), individual (30%), institution (15%), sovereign (10%), specialised lending (10%).

Reporting date is fixed at 2028-01-01 so the Basel 3.1 output floor is active (PRA transitional starts 2027).
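The seeded generation described above can be illustrated with a minimal sketch. This is not the code in tests/acceptance/stress/conftest.py, which is not reproduced here: the function name `generate_counterparties`, the `counterparty_reference` format, and the `specialised_lending` label are assumptions made for illustration. Only the entity mix and the seed value come from this specification.

```python
import random
from collections import Counter

# Entity mix quoted in this specification; the weights sum to 100.
ENTITY_WEIGHTS = {
    "corporate": 35,
    "individual": 30,
    "institution": 15,
    "sovereign": 10,
    "specialised_lending": 10,
}


def generate_counterparties(n: int, seed: int) -> list[dict]:
    """Return n synthetic counterparties; the same seed yields the same list."""
    rng = random.Random(seed)  # local RNG, so global random state is untouched
    entity_types = rng.choices(
        population=list(ENTITY_WEIGHTS),
        weights=list(ENTITY_WEIGHTS.values()),
        k=n,
    )
    return [
        {"counterparty_reference": f"CP{i:06d}", "entity_type": entity_type}
        for i, entity_type in enumerate(entity_types)
    ]


sample = generate_counterparties(10_000, seed=42)
mix = Counter(row["entity_type"] for row in sample)
```

Using a dedicated `random.Random(seed)` instance rather than the module-level functions keeps the dataset reproducible even when other tests draw random numbers in the same session.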

Framework/Permission Combinations

Unless noted otherwise, tests run against four pre-computed, session-scoped fixtures built from stress_dataset_10k:

| Fixture | Framework | Permission mode |
| --- | --- | --- |
| crr_sa_result_10k | CRR | Standardised |
| crr_irb_result_10k | CRR | IRB |
| b31_sa_result_10k | Basel 3.1 | Standardised |
| b31_irb_result_10k | Basel 3.1 | IRB |

Scenario Groups

STRESS-1: Row Count Preservation (8 tests)

Purpose: Every input exposure must produce exactly one output row. Silent data loss from join failures or filter errors is the most dangerous pipeline bug — it produces systematically understated capital.

| Test | Framework | Description |
| --- | --- | --- |
| STRESS-1.1 | CRR SA | All input loans appear in output |
| STRESS-1.2 | CRR SA | All input contingents appear in output |
| STRESS-1.3 | CRR SA | No unknown exposure types in output |
| STRESS-1.4 | CRR IRB | All input loans appear in output |
| STRESS-1.5 | B31 SA | All input loans appear in output |
| STRESS-1.6 | B31 IRB | All input loans appear in output |
| STRESS-1.7 | B31 IRB | All input contingents appear in output |
| STRESS-1.8 | Any | Committed facilities generate facility_undrawn rows |

Validation: Output row count per exposure type matches input count. Exposure types confined to {loan, contingent, facility_undrawn}.
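A minimal sketch of this validation, assuming the output is available as a list of row dictionaries and that the exposure type lives in a column named `exposure_type` (the column name and the helper `check_row_preservation` are hypothetical):

```python
from collections import Counter

# The only exposure types the specification allows in the output.
ALLOWED_TYPES = {"loan", "contingent", "facility_undrawn"}


def check_row_preservation(
    input_counts: dict[str, int], output_rows: list[dict]
) -> list[str]:
    """Return human-readable problems; an empty list means the check passed."""
    problems = []
    seen = Counter(row["exposure_type"] for row in output_rows)

    unknown = set(seen) - ALLOWED_TYPES
    if unknown:
        problems.append(f"unknown exposure types: {sorted(unknown)}")

    # Compare per type so a loss in one type cannot be masked by a surplus in another.
    for exposure_type, expected in input_counts.items():
        if seen[exposure_type] != expected:
            problems.append(
                f"{exposure_type}: expected {expected} rows, found {seen[exposure_type]}"
            )
    return problems
```

Comparing counts per exposure type, rather than one grand total, is what makes a dropped loan visible even if an unrelated join fans out and duplicates a contingent.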

STRESS-2: Column Completeness (4 tests)

Purpose: All required analytical columns must be present. Missing columns break downstream COREP reporting and Pillar III disclosures.

Required columns: exposure_reference, exposure_class, risk_weight, ead_final, rwa_final, approach_applied.

| Test | Framework | Description |
| --- | --- | --- |
| STRESS-2.1 | CRR SA | All required output columns present |
| STRESS-2.2 | CRR IRB | All required output columns present |
| STRESS-2.3 | B31 SA | All required output columns present |
| STRESS-2.4 | B31 IRB | All required output columns present |
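The completeness check reduces to a set difference against the required column list. A minimal sketch, where `missing_columns` is a hypothetical helper and the column names are taken from this specification:

```python
REQUIRED_COLUMNS = (
    "exposure_reference",
    "exposure_class",
    "risk_weight",
    "ead_final",
    "rwa_final",
    "approach_applied",
)


def missing_columns(columns) -> list[str]:
    """Return required columns absent from `columns`, in specification order."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]
```

Returning the missing names, instead of a bare boolean, gives a failing test an immediately actionable message.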

STRESS-3: Numerical Stability (10 tests)

Purpose: At scale, floating-point accumulation errors, NaN propagation from edge cases, and Inf from division-by-zero are most likely to manifest.

| Test | Framework | Description |
| --- | --- | --- |
| STRESS-3.1 | CRR SA | No NaN in rwa_final |
| STRESS-3.2 | B31 IRB | No NaN in rwa_final |
| STRESS-3.3 | CRR SA | No Inf in rwa_final |
| STRESS-3.4 | CRR SA | No negative RWA |
| STRESS-3.5 | CRR SA | Total RWA sum is finite and positive |
| STRESS-3.6 | B31 IRB | Total RWA sum is finite and positive |
| STRESS-3.7 | CRR SA | No null rwa_final |
| STRESS-3.8 | CRR SA | No null risk_weight |
| STRESS-3.9 | CRR SA | No NaN in ead_final |
| STRESS-3.10 | B31 IRB | No negative RWA |
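These checks can be sketched over a plain list of values, assuming nulls arrive as `None`. The helper names are hypothetical and the real tests may operate on a dataframe column instead:

```python
import math


def rwa_violations(values: list) -> dict[str, int]:
    """Count null, NaN, infinite, and negative entries in a numeric column."""
    counts = {"null": 0, "nan": 0, "inf": 0, "negative": 0}
    for value in values:
        if value is None:
            counts["null"] += 1
        elif math.isnan(value):
            counts["nan"] += 1
        elif math.isinf(value):
            counts["inf"] += 1  # covers both +inf and -inf
        elif value < 0:
            counts["negative"] += 1
    return counts


def total_is_finite_and_positive(values: list) -> bool:
    """True if the sum of the non-null values is a finite number above zero."""
    total = sum(value for value in values if value is not None)
    return math.isfinite(total) and total > 0
```

NaN needs its own branch because `value < 0` is false for NaN, so a plain negativity check would let it through silently.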

STRESS-4: Risk Weight Bounds (4 tests)

Purpose: Risk weights must stay within regulatory bounds. SA: [0%, 1250%] per CRR Art. 114–134 / PRA PS1/26 Art. 112–134. IRB: non-negative per Art. 153.

| Test | Framework | Description | Bounds |
| --- | --- | --- | --- |
| STRESS-4.1 | CRR SA | SA risk weights within regulatory bounds | 0% ≤ RW ≤ 1250% |
| STRESS-4.2 | B31 SA | SA risk weights within regulatory bounds | 0% ≤ RW ≤ 1250% |
| STRESS-4.3 | CRR IRB | IRB risk weights non-negative | RW ≥ 0% |
| STRESS-4.4 | B31 IRB | IRB risk weights non-negative | RW ≥ 0% |
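A minimal sketch of the bounds check follows. It assumes risk weights are stored as decimal fractions (so 1250% is 12.5), which this specification does not state, and it treats every non-IRB approach as subject to the SA bound, which is a simplification. The helper names are hypothetical.

```python
IRB_APPROACHES = {"foundation_irb", "advanced_irb"}
SA_MAX_RISK_WEIGHT = 12.5  # 1250% as a decimal fraction (assumed representation)


def risk_weight_in_bounds(risk_weight: float, approach: str) -> bool:
    """True if the risk weight respects the bound for its approach.

    NaN fails both branches because every comparison with NaN is False.
    """
    if approach in IRB_APPROACHES:
        return risk_weight >= 0.0
    return 0.0 <= risk_weight <= SA_MAX_RISK_WEIGHT


def out_of_bounds(rows: list[dict]) -> list[str]:
    """Return the exposure references whose risk weight violates its bound."""
    return [
        row["exposure_reference"]
        for row in rows
        if not risk_weight_in_bounds(row["risk_weight"], row["approach_applied"])
    ]
```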

STRESS-5: Approach Distribution (5 tests)

Purpose: Verify exposures are routed to the correct calculation approach. Misrouting at scale produces systematically wrong capital numbers.

| Test | Framework | Description |
| --- | --- | --- |
| STRESS-5.1 | CRR SA | All exposures use SA-family approaches (standardised, equity, slotting) |
| STRESS-5.2 | CRR IRB | Some exposures use IRB approaches (foundation_irb, advanced_irb) |
| STRESS-5.3 | CRR IRB | IRB-routed exposures have positive RWA (not zero from miscalculation) |
| STRESS-5.4 | CRR SA | Sum of per-approach counts equals total row count |
| STRESS-5.5 | B31 IRB | B31 IRB mode has mixed approaches (SA + IRB + potentially slotting) |
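The routing checks can be expressed as simple predicates over a per-approach row count. In this sketch the approach labels come from the table above, while the helper names and the list-of-dicts row shape are assumptions:

```python
from collections import Counter

SA_FAMILY = {"standardised", "equity", "slotting"}
IRB_FAMILY = {"foundation_irb", "advanced_irb"}


def approach_counts(rows: list[dict]) -> Counter:
    """Count output rows per value of approach_applied."""
    return Counter(row["approach_applied"] for row in rows)


def is_sa_only(counts: Counter) -> bool:
    """True if every approach present belongs to the SA family."""
    return set(counts) <= SA_FAMILY


def has_irb(counts: Counter) -> bool:
    """True if at least one row was routed to an IRB approach."""
    return any(counts[approach] > 0 for approach in IRB_FAMILY)
```

Because every row contributes to exactly one bucket, `sum(counts.values())` equals the row count whenever `approach_applied` is populated for every row, which is the property STRESS-5.4 asserts.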

STRESS-6: Exposure Class Coverage (4 tests)

Purpose: All expected exposure classes appear in output. The synthetic data covers corporate, retail, institution, sovereign, and specialised lending — all must produce correctly assigned output rows.

| Test | Framework | Description |
| --- | --- | --- |
| STRESS-6.1 | CRR SA | At least 3 distinct exposure classes in output |
| STRESS-6.2 | B31 SA | At least 3 distinct exposure classes in output |
| STRESS-6.3 | CRR SA | Corporate exposure class present (35% of input entities) |
| STRESS-6.4 | CRR SA | Retail exposure class present (30% of input entities) |

STRESS-7: Output Floor at Scale (7 tests)

Purpose: The Basel 3.1 output floor (PRA PS1/26 Art. 92 para 2A–5) operates at portfolio level using U-TREA and S-TREA aggregated across all approaches. This must work correctly at 10K+ exposures.

| Test | Framework | Description |
| --- | --- | --- |
| STRESS-7.1 | B31 IRB | Output floor summary exists |
| STRESS-7.2 | B31 IRB | U-TREA (un-floored) is positive |
| STRESS-7.3 | B31 IRB | S-TREA (standardised) is positive |
| STRESS-7.4 | B31 IRB | Floor percentage within [50%, 72.5%] |
| STRESS-7.5 | B31 IRB | Post-floor RWA ≥ U-TREA (the floor can only increase RWA) |
| STRESS-7.6 | CRR IRB | No output floor summary (CRR has no output floor) |
| STRESS-7.7 | B31 SA | SA-only mode: output floor shortfall is zero (no IRB to floor) |
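The properties tested above follow from the basic floor arithmetic, sketched here in simplified form: post-floor TREA is the larger of U-TREA and the floor percentage times S-TREA. This sketch deliberately ignores any adjustment term the regulation may add to the floored amount, and the function name and result keys are hypothetical rather than the calculator's actual summary schema.

```python
def apply_output_floor(u_trea: float, s_trea: float, floor_pct: float) -> dict:
    """Simplified portfolio-level output floor: max(U-TREA, floor_pct * S-TREA)."""
    floor_amount = floor_pct * s_trea
    # The shortfall is the add-on needed to lift U-TREA up to the floor, never negative.
    shortfall = max(0.0, floor_amount - u_trea)
    return {
        "u_trea": u_trea,
        "s_trea": s_trea,
        "floor_pct": floor_pct,
        "shortfall": shortfall,
        "post_floor_trea": u_trea + shortfall,
    }


# Modelled RWA well below standardised RWA: the floor binds.
binding = apply_output_floor(u_trea=400.0, s_trea=1000.0, floor_pct=0.5)

# SA-only portfolio: U-TREA equals S-TREA, so no floor percentage below 100% can bind.
sa_only = apply_output_floor(u_trea=1000.0, s_trea=1000.0, floor_pct=0.725)
```

Since the shortfall is clamped at zero, `post_floor_trea` can never fall below `u_trea`, which is the invariant STRESS-7.5 checks. The SA-only case shows why STRESS-7.7 expects a zero shortfall.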

STRESS-8: Error Accumulation (4 tests)

Purpose: At scale, data quality issues accumulate. The error list must not grow unboundedly (memory) or cause pipeline failure (correctness). Errors are per-class or per-column, not per-row.

| Test | Framework | Description |
| --- | --- | --- |
| STRESS-8.1 | CRR SA | Errors is a list (not None) |
| STRESS-8.2 | CRR SA | Error count < 1,000 (bounded, not per-row) |
| STRESS-8.3 | CRR SA | Pipeline produces results even with DQ errors |
| STRESS-8.4 | B31 IRB | Error count < 1,000 |
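One way to keep an error list bounded is to key each error by its kind and column and keep a count of affected rows, rather than appending one entry per row. The sketch below illustrates that idea only; `ErrorCollector` and its fields are hypothetical and do not describe the calculator's actual error model.

```python
from collections import Counter


class ErrorCollector:
    """Aggregates data-quality errors per (code, column) instead of per row."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def record(self, code: str, column: str) -> None:
        """Register one affected row for the given error code and column."""
        self._counts[(code, column)] += 1

    def as_list(self) -> list[dict]:
        """Return one entry per distinct (code, column), with its row count."""
        return [
            {"code": code, "column": column, "row_count": count}
            for (code, column), count in sorted(self._counts.items())
        ]


collector = ErrorCollector()
for _ in range(10_000):
    collector.record("MISSING_VALUE", "ead_final")  # same issue on many rows
collector.record("UNKNOWN_CLASS", "exposure_class")
errors = collector.as_list()
```

With this shape the list length depends on the number of distinct problems, not on portfolio size, so a bound such as "fewer than 1,000 errors" remains meaningful at 100K counterparties.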

STRESS-9: Summary Consistency (2 tests)

Purpose: Summary aggregates (total RWA by class, total by approach) must match the detailed per-exposure results. Discrepancies indicate aggregation bugs.

| Test | Framework | Description | Tolerance |
| --- | --- | --- | --- |
| STRESS-9.1 | CRR SA | summary_by_class RWA total ≈ results total | rel=1% |
| STRESS-9.2 | CRR SA | summary_by_approach covers all approaches in results | exact |
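The two consistency checks can be sketched as follows, assuming detail rows are dictionaries and the summary is a mapping from exposure class to total RWA. The helper names and that summary shape are assumptions, not the calculator's API.

```python
import math
from collections import defaultdict


def summarise_by_class(rows: list[dict]) -> dict[str, float]:
    """Aggregate rwa_final per exposure_class (stand-in for the real summary)."""
    totals: dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row["exposure_class"]] += row["rwa_final"]
    return dict(totals)


def totals_match(summary: dict[str, float], rows: list[dict], rel_tol: float = 0.01) -> bool:
    """True if the summary total agrees with the detail total within rel_tol."""
    summary_total = math.fsum(summary.values())
    detail_total = math.fsum(row["rwa_final"] for row in rows)
    return math.isclose(summary_total, detail_total, rel_tol=rel_tol)


def approaches_covered(summary_approaches, rows: list[dict]) -> bool:
    """True if every approach seen in the detail rows appears in the summary."""
    return {row["approach_applied"] for row in rows} <= set(summary_approaches)
```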

STRESS-10: EAD Consistency (4 tests)

Purpose: EAD drives RWA. At scale, CCF miscalculation or join errors can produce zero EAD or wildly inflated EAD for off-balance-sheet items.

| Test | Framework | Description |
| --- | --- | --- |
| STRESS-10.1 | CRR SA | No negative EAD |
| STRESS-10.2 | CRR SA | Total EAD is positive |
| STRESS-10.3 | CRR SA | No null ead_final |
| STRESS-10.4 | B31 IRB | No NaN in ead_final |

STRESS-11: Determinism (1 test)

Purpose: Non-determinism from hash ordering, parallel execution, or floating-point reordering means results cannot be audited. Two runs on identical inputs must produce the same RWA totals, checked to a relative tolerance of 1e-10.

| Test | Framework | Description | Tolerance |
| --- | --- | --- | --- |
| STRESS-11.1 | CRR SA | Two runs produce identical RWA totals | rel=1e-10 |
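The small tolerance exists because floating-point addition is not associative: summing the same values in a different order can change the last digits of the total. A minimal sketch, in which `totals_agree` and the `run_pipeline` callable are hypothetical stand-ins for running the calculator twice:

```python
import math


def totals_agree(run_pipeline, inputs, rel_tol: float = 1e-10) -> bool:
    """Run the pipeline twice on the same inputs and compare the totals."""
    first = run_pipeline(inputs)
    second = run_pipeline(inputs)
    return math.isclose(first, second, rel_tol=rel_tol)


# The same three values summed in two different orders.
forward = sum([0.1, 0.2, 0.3])
backward = sum([0.3, 0.2, 0.1])
order_changes_result = forward != backward
within_tolerance = math.isclose(forward, backward, rel_tol=1e-10)
```

A relative tolerance of 1e-10 absorbs this kind of reordering noise while still failing on any difference large enough to matter for a capital figure.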

STRESS-12: Framework Comparison (1 test)

Purpose: Basel 3.1 introduces higher SA weights (equity 250%, currency mismatch 1.5×) and the output floor. At scale these should manifest as measurable differences from CRR.

| Test | Framework | Description | Tolerance |
| --- | --- | --- | --- |
| STRESS-12.1 | CRR SA vs B31 SA | B31 SA total RWA differs from CRR SA by at least 1% | rel=1% |

STRESS-13: Large Scale 100K (4 tests, @pytest.mark.slow)

Purpose: Some bugs only manifest at true scale — hash collisions in joins, memory pressure causing silent truncation, or O(n²) operations. These are excluded from normal test runs.

| Test | Framework | Description | Bound |
| --- | --- | --- | --- |
| STRESS-13.1 | CRR SA | All input loans preserved at 300K+ exposure scale | count match |
| STRESS-13.2 | CRR SA | No NaN or Inf in RWA at 100K scale | zero violations |
| STRESS-13.3 | CRR SA | Pipeline peak memory stays under 4 GB | tracemalloc < 4,000 MB |
| STRESS-13.4 | B31 IRB | Output floor produces valid summary at 100K scale | summary non-null, values positive |
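A peak-memory bound of the kind used in STRESS-13.3 can be measured with the standard-library `tracemalloc` module. The wrapper below is a sketch: `peak_memory_mb` is a hypothetical helper and the workload is a placeholder for the real pipeline run.

```python
import tracemalloc


def peak_memory_mb(func, *args, **kwargs):
    """Call func and return (result, peak traced memory in MB)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()  # always stop tracing, even if func raises
    return result, peak_bytes / 1_000_000


# Placeholder workload standing in for a pipeline run.
result, peak_mb = peak_memory_mb(lambda: [0] * 1_000_000)
under_limit = peak_mb < 4_000
```

`tracemalloc` only sees allocations made through Python's memory allocators, so memory allocated internally by native libraries may not be counted. The measured peak is best read as a lower bound on process memory.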

STRESS-14: Exposure Reference Uniqueness (2 tests)

Purpose: Duplicate exposure_reference values cause double-counting in COREP aggregations and misleading exposure-level audit trails.

| Test | Framework | Description |
| --- | --- | --- |
| STRESS-14.1 | CRR SA | All exposure_reference values unique in output |
| STRESS-14.2 | B31 IRB | All exposure_reference values unique in output |
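Uniqueness can be checked by counting occurrences of each reference and reporting any that appear more than once. A minimal sketch, assuming rows are dictionaries (`duplicate_references` is a hypothetical helper):

```python
from collections import Counter


def duplicate_references(rows: list[dict]) -> dict[str, int]:
    """Return each exposure_reference that occurs more than once, with its count."""
    counts = Counter(row["exposure_reference"] for row in rows)
    return {reference: n for reference, n in counts.items() if n > 1}
```

Reporting the offending references and their multiplicity, rather than only comparing the distinct count with the row count, points directly at the join that fanned out.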

Acceptance Tests

| Group | Tests | Pass Rate |
| --- | --- | --- |
| STRESS-1: Row Count Preservation | 8 | 100% |
| STRESS-2: Column Completeness | 4 | 100% |
| STRESS-3: Numerical Stability | 10 | 100% |
| STRESS-4: Risk Weight Bounds | 4 | 100% |
| STRESS-5: Approach Distribution | 5 | 100% |
| STRESS-6: Exposure Class Coverage | 4 | 100% |
| STRESS-7: Output Floor at Scale | 7 | 100% |
| STRESS-8: Error Accumulation | 4 | 100% |
| STRESS-9: Summary Consistency | 2 | 100% |
| STRESS-10: EAD Consistency | 4 | 100% |
| STRESS-11: Determinism | 1 | 100% |
| STRESS-12: Framework Comparison | 1 | 100% |
| STRESS-13: Large Scale 100K (slow) | 4 | 100% |
| STRESS-14: Reference Uniqueness | 2 | 100% |
| Total | 60 | 100% |

STRESS-13 Exclusion

The 4 large-scale tests (STRESS-13) are marked @pytest.mark.slow and excluded from the standard test run (--benchmark-skip). Run explicitly with uv run pytest tests/acceptance/stress/ -m slow.