Stress Testing Specification
Pipeline integrity tests at scale (10K–100K exposures) across all four framework/permission combinations. These tests verify that the calculator produces correct, complete, and stable output under realistic portfolio sizes.
Test Group: STRESS
Requirements Status
| ID | Requirement | Priority | Status |
|---|---|---|---|
| NFR-1 | Row count preservation — no silent data loss | P0 | Done |
| NFR-2 | Column completeness — all required output columns present | P0 | Done |
| NFR-3 | Numerical stability — no NaN, Inf, or negative RWA | P0 | Done |
| NFR-4 | Risk weight bounds — SA within [0%, 1250%], IRB non-negative | P0 | Done |
| NFR-5 | Approach routing — correct SA/IRB assignment | P0 | Done |
| NFR-6 | Exposure class coverage — all entity types produce output | P0 | Done |
| NFR-7 | Output floor at scale — PRA transitional floor works at portfolio level | P0 | Done |
| NFR-8 | Error accumulation — bounded error list, no pipeline failure | P1 | Done |
| NFR-9 | Summary consistency — aggregates match detail rows | P1 | Done |
| NFR-10 | EAD consistency — non-negative, non-null, positive sum | P1 | Done |
| NFR-11 | Determinism — identical inputs produce identical outputs | P0 | Done |
| NFR-12 | Framework differentiation — B31 produces different RWA from CRR | P1 | Done |
| NFR-13 | Large-scale (100K) — row preservation, stability, memory bound | P2 | Done |
| NFR-14 | Exposure reference uniqueness — no duplicate references in output | P1 | Done |
Test Infrastructure
Synthetic Data Generation
Stress tests use fully synthetic datasets generated by tests/acceptance/stress/conftest.py:
| Dataset | Counterparties | Loans | Contingents | Seed |
|---|---|---|---|---|
| stress_dataset_10k | 10,000 | ~30,000 | ~5,000 | 42 |
| stress_dataset_100k | 100,000 | ~300,000 | ~50,000 | 99 |
Entity type distribution: corporate (35%), individual (30%), institution (15%), sovereign (10%), specialised lending (10%).
Reporting date is fixed at 2028-01-01 so the Basel 3.1 output floor is active (PRA transitional starts 2027).
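
A seeded generator of roughly this shape would reproduce the entity mix described above. This is a minimal sketch only: `generate_counterparties`, `ENTITY_MIX`, and the field names are hypothetical and are not taken from the actual `conftest.py`.

```python
import random

# Entity mix from the specification above; weights are percentages.
ENTITY_MIX = {
    "corporate": 35,
    "individual": 30,
    "institution": 15,
    "sovereign": 10,
    "specialised_lending": 10,
}


def generate_counterparties(n: int, seed: int) -> list[dict]:
    """Return n synthetic counterparties drawn from ENTITY_MIX (hypothetical helper)."""
    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    types = rng.choices(list(ENTITY_MIX), weights=list(ENTITY_MIX.values()), k=n)
    return [
        {"counterparty_reference": f"CP{i:06d}", "entity_type": entity_type}
        for i, entity_type in enumerate(types)
    ]
```

Using a dedicated `random.Random(seed)` instance rather than the module-level functions keeps the dataset reproducible even when other fixtures also draw random numbers.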
Framework/Permission Combinations
All tests run against four pre-computed session-scoped fixtures:
| Fixture | Framework | Permission Mode |
|---|---|---|
| crr_sa_result_10k | CRR | Standardised |
| crr_irb_result_10k | CRR | IRB |
| b31_sa_result_10k | Basel 3.1 | Standardised |
| b31_irb_result_10k | Basel 3.1 | IRB |
Scenario Groups
STRESS-1: Row Count Preservation (8 tests)
Purpose: Every input exposure must produce exactly one output row. Silent data loss from join failures or filter errors is the most dangerous pipeline bug — it produces systematically understated capital.
| Test | Framework | Description |
|---|---|---|
| STRESS-1.1 | CRR SA | All input loans appear in output |
| STRESS-1.2 | CRR SA | All input contingents appear in output |
| STRESS-1.3 | CRR SA | No unknown exposure types in output |
| STRESS-1.4 | CRR IRB | All input loans appear in output |
| STRESS-1.5 | B31 SA | All input loans appear in output |
| STRESS-1.6 | B31 IRB | All input loans appear in output |
| STRESS-1.7 | B31 IRB | All input contingents appear in output |
| STRESS-1.8 | Any | Committed facilities generate facility_undrawn rows |
Validation: Output row count per exposure type matches input count. Exposure types confined to {loan, contingent, facility_undrawn}.
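
The validation rule can be sketched in plain Python as follows. The `exposure_type` key and the `check_row_counts` helper are assumptions made for illustration; the real tests may query a dataframe instead.

```python
from collections import Counter

# Exposure types permitted in the output, per the validation rule above.
ALLOWED_TYPES = {"loan", "contingent", "facility_undrawn"}


def check_row_counts(input_refs_by_type, output_rows):
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    out_counts = Counter(row["exposure_type"] for row in output_rows)

    unknown = set(out_counts) - ALLOWED_TYPES
    if unknown:
        problems.append(f"unknown exposure types: {sorted(unknown)}")

    # Only types present in the input are compared; facility_undrawn rows
    # are generated by the pipeline and have no input counterpart here.
    for exp_type, refs in input_refs_by_type.items():
        if out_counts[exp_type] != len(refs):
            problems.append(
                f"{exp_type}: expected {len(refs)} rows, got {out_counts[exp_type]}"
            )
    return problems
```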
STRESS-2: Column Completeness (4 tests)
Purpose: All required analytical columns must be present. Missing columns break downstream COREP reporting and Pillar III disclosures.
Required columns: exposure_reference, exposure_class, risk_weight, ead_final, rwa_final, approach_applied.
| Test | Framework | Description |
|---|---|---|
| STRESS-2.1 | CRR SA | All required output columns present |
| STRESS-2.2 | CRR IRB | All required output columns present |
| STRESS-2.3 | B31 SA | All required output columns present |
| STRESS-2.4 | B31 IRB | All required output columns present |
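
The completeness check reduces to a set difference between the required columns listed above and the columns actually present. The helper name `missing_columns` is hypothetical.

```python
# Required analytical columns, as listed in this section.
REQUIRED_COLUMNS = frozenset(
    {
        "exposure_reference",
        "exposure_class",
        "risk_weight",
        "ead_final",
        "rwa_final",
        "approach_applied",
    }
)


def missing_columns(output_columns):
    """Return the required columns absent from output_columns, sorted by name."""
    return sorted(REQUIRED_COLUMNS - set(output_columns))
```

Reporting the missing names, rather than a bare pass/fail, makes a failure at scale quicker to diagnose.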
STRESS-3: Numerical Stability (10 tests)
Purpose: At scale, floating-point accumulation errors, NaN propagation from edge cases, and Inf from division-by-zero are most likely to manifest.
| Test | Framework | Description |
|---|---|---|
| STRESS-3.1 | CRR SA | No NaN in rwa_final |
| STRESS-3.2 | B31 IRB | No NaN in rwa_final |
| STRESS-3.3 | CRR SA | No Inf in rwa_final |
| STRESS-3.4 | CRR SA | No negative RWA |
| STRESS-3.5 | CRR SA | Total RWA sum is finite and positive |
| STRESS-3.6 | B31 IRB | Total RWA sum is finite and positive |
| STRESS-3.7 | CRR SA | No null rwa_final |
| STRESS-3.8 | CRR SA | No null risk_weight |
| STRESS-3.9 | CRR SA | No NaN in ead_final |
| STRESS-3.10 | B31 IRB | No negative RWA |
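
The four conditions tested in this group (null, NaN, Inf, negative) need separate checks, because NaN compares false against everything and so slips past a plain `value < 0` test. A minimal sketch, assuming the column has been extracted as a Python sequence and using a hypothetical helper name:

```python
import math


def stability_violations(values):
    """Count null, NaN, infinite, and negative entries in a numeric column."""
    counts = {"null": 0, "nan": 0, "inf": 0, "negative": 0}
    for value in values:
        if value is None:
            counts["null"] += 1
        elif math.isnan(value):
            counts["nan"] += 1
        elif math.isinf(value):
            # Both +inf and -inf are counted here, not under "negative".
            counts["inf"] += 1
        elif value < 0:
            counts["negative"] += 1
    return counts
```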
STRESS-4: Risk Weight Bounds (4 tests)
Purpose: Risk weights must stay within regulatory bounds. SA: [0%, 1250%] per CRR Art. 114–134 / PRA PS1/26 Art. 112–134. IRB: non-negative per Art. 153.
| Test | Framework | Description | Bounds |
|---|---|---|---|
| STRESS-4.1 | CRR SA | SA risk weights within regulatory bounds | 0% ≤ RW ≤ 1250% |
| STRESS-4.2 | B31 SA | SA risk weights within regulatory bounds | 0% ≤ RW ≤ 1250% |
| STRESS-4.3 | CRR IRB | IRB risk weights non-negative | RW ≥ 0% |
| STRESS-4.4 | B31 IRB | IRB risk weights non-negative | RW ≥ 0% |
STRESS-5: Approach Distribution (5 tests)
Purpose: Verify exposures are routed to the correct calculation approach. Misrouting at scale produces systematically wrong capital numbers.
| Test | Framework | Description |
|---|---|---|
| STRESS-5.1 | CRR SA | All exposures use SA-family approaches (standardised, equity, slotting) |
| STRESS-5.2 | CRR IRB | Some exposures use IRB approaches (foundation_irb, advanced_irb) |
| STRESS-5.3 | CRR IRB | IRB-routed exposures have positive RWA (not zero from miscalculation) |
| STRESS-5.4 | CRR SA | Sum of per-approach counts equals total row count |
| STRESS-5.5 | B31 IRB | B31 IRB mode has mixed approaches (SA + IRB + potentially slotting) |
STRESS-6: Exposure Class Coverage (4 tests)
Purpose: All expected exposure classes appear in output. The synthetic data covers corporate, retail, institution, sovereign, and specialised lending — all must produce correctly assigned output rows.
| Test | Framework | Description |
|---|---|---|
| STRESS-6.1 | CRR SA | At least 3 distinct exposure classes in output |
| STRESS-6.2 | B31 SA | At least 3 distinct exposure classes in output |
| STRESS-6.3 | CRR SA | Corporate exposure class present (35% of input entities) |
| STRESS-6.4 | CRR SA | Retail exposure class present (30% of input entities) |
STRESS-7: Output Floor at Scale (7 tests)
Purpose: The Basel 3.1 output floor (PRA PS1/26 Art. 92 para 2A–5) operates at portfolio level using U-TREA and S-TREA aggregated across all approaches. This must work correctly at 10K+ exposures.
| Test | Framework | Description |
|---|---|---|
| STRESS-7.1 | B31 IRB | Output floor summary exists |
| STRESS-7.2 | B31 IRB | U-TREA (un-floored) is positive |
| STRESS-7.3 | B31 IRB | S-TREA (standardised) is positive |
| STRESS-7.4 | B31 IRB | Floor percentage within [50%, 72.5%] |
| STRESS-7.5 | B31 IRB | Post-floor RWA ≥ U-TREA (the floor can only increase RWA, never reduce it) |
| STRESS-7.6 | CRR IRB | No output floor summary (CRR has no output floor) |
| STRESS-7.7 | B31 SA | SA-only mode: output floor shortfall is zero (no IRB to floor) |
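
The relationships tested in STRESS-7.5 and STRESS-7.7 follow from the floor arithmetic. The sketch below uses a simplified form, max(U-TREA, floor percentage × S-TREA), and deliberately omits any adjustment terms the regulation may add. The function name, the returned keys, and the 65% figure in the usage note are illustrative assumptions, not the calculator's API.

```python
def apply_output_floor(u_trea: float, s_trea: float, floor_pct: float) -> dict:
    """Simplified portfolio-level floor: post-floor RWA = max(U-TREA, floor_pct * S-TREA)."""
    if not 0.0 <= floor_pct <= 1.0:
        raise ValueError("floor_pct must be a fraction between 0 and 1")
    floor_amount = floor_pct * s_trea
    # The shortfall is the add-on needed to lift U-TREA up to the floor.
    shortfall = max(0.0, floor_amount - u_trea)
    return {
        "floor_amount": floor_amount,
        "shortfall": shortfall,
        "post_floor_rwa": u_trea + shortfall,
    }
```

With `u_trea=500`, `s_trea=1000`, and `floor_pct=0.65`, the floor binds and the shortfall is positive. In SA-only mode U-TREA equals S-TREA, so for any floor percentage at or below 100% the shortfall is zero, which is the behaviour STRESS-7.7 asserts.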
STRESS-8: Error Accumulation (4 tests)
Purpose: At scale, data quality issues accumulate. The error list must not grow unboundedly (memory) or cause pipeline failure (correctness). Errors are per-class or per-column, not per-row.
| Test | Framework | Description |
|---|---|---|
| STRESS-8.1 | CRR SA | Errors is a list (not None) |
| STRESS-8.2 | CRR SA | Error count < 1,000 (bounded, not per-row) |
| STRESS-8.3 | CRR SA | Pipeline produces results even with DQ errors |
| STRESS-8.4 | B31 IRB | Error count < 1,000 |
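
One way to keep the error list bounded is to key errors by code and column and count affected rows, instead of appending one entry per row. The sketch below illustrates that idea only; `ErrorCollector` and `AggregatedError` are hypothetical names and do not describe the calculator's actual error model.

```python
from dataclasses import dataclass


@dataclass
class AggregatedError:
    code: str
    column: str
    affected_rows: int = 0


class ErrorCollector:
    """Collects one entry per (code, column) pair, however many rows are affected."""

    def __init__(self):
        self._errors: dict[tuple[str, str], AggregatedError] = {}

    def record(self, code: str, column: str, n_rows: int = 1) -> None:
        entry = self._errors.setdefault((code, column), AggregatedError(code, column))
        entry.affected_rows += n_rows

    @property
    def errors(self) -> list[AggregatedError]:
        # Always a list, never None, even when nothing was recorded.
        return list(self._errors.values())
```

Under this design the list length depends on the number of distinct problem kinds, not on portfolio size, so a bound such as 1,000 holds at both 10K and 100K scale.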
STRESS-9: Summary Consistency (2 tests)
Purpose: Summary aggregates (total RWA by class, total by approach) must match the detailed per-exposure results. Discrepancies indicate aggregation bugs.
| Test | Framework | Description | Tolerance |
|---|---|---|---|
| STRESS-9.1 | CRR SA | summary_by_class RWA total ≈ results total | rel=1% |
| STRESS-9.2 | CRR SA | summary_by_approach covers all approaches in results | exact |
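
The consistency check compares the sum of the per-class summary with the sum of the detail rows under a relative tolerance. A minimal sketch, assuming rows are dictionaries keyed by the output column names; the two helper functions are hypothetical:

```python
import math
from collections import defaultdict


def summarise_by_class(rows):
    """Aggregate rwa_final per exposure_class."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["exposure_class"]] += row["rwa_final"]
    return dict(totals)


def summary_matches_detail(summary, rows, rel_tol=0.01):
    """True when the summary total is within rel_tol of the detail total."""
    detail_total = math.fsum(row["rwa_final"] for row in rows)
    summary_total = math.fsum(summary.values())
    return math.isclose(summary_total, detail_total, rel_tol=rel_tol)
```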
STRESS-10: EAD Consistency (4 tests)
Purpose: EAD drives RWA. At scale, CCF miscalculation or join errors can produce zero EAD or wildly inflated EAD for off-balance-sheet items.
| Test | Framework | Description |
|---|---|---|
| STRESS-10.1 | CRR SA | No negative EAD |
| STRESS-10.2 | CRR SA | Total EAD is positive |
| STRESS-10.3 | CRR SA | No null ead_final |
| STRESS-10.4 | B31 IRB | No NaN in ead_final |
STRESS-11: Determinism (1 test)
Purpose: Non-determinism from hash ordering, parallel execution, or floating-point reordering means results cannot be audited. Two identical runs must produce identical totals.
| Test | Framework | Description | Tolerance |
|---|---|---|---|
| STRESS-11.1 | CRR SA | Two runs produce identical RWA totals | rel=1e-10 |
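
The tolerance of 1e-10 rather than exact equality leaves room for floating-point reordering: summing the same values in a different order can change the last few bits of the total. The sketch below stands in for the pipeline with a seeded stub (`run_pipeline_stub` is hypothetical) to show the comparison.

```python
import math
import random


def run_pipeline_stub(seed: int, n: int = 10_000) -> list[float]:
    """Stand-in for a calculator run: returns n seeded pseudo-RWA values."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, 1e6) for _ in range(n)]


def totals_match(run_a, run_b, rel_tol: float = 1e-10) -> bool:
    """Compare total RWA of two runs under a tight relative tolerance."""
    return math.isclose(sum(run_a), sum(run_b), rel_tol=rel_tol)
```

Two runs with the same seed match, and so does a shuffled copy of one run, because reordering perturbs the total far less than 1e-10 in relative terms. Runs with different inputs do not match.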
STRESS-12: Framework Comparison (1 test)
Purpose: Basel 3.1 introduces higher SA weights (equity 250%, currency mismatch 1.5×) and the output floor. At scale these should manifest as measurable differences from CRR.
| Test | Framework | Description | Tolerance |
|---|---|---|---|
| STRESS-12.1 | CRR SA vs B31 SA | B31 SA total RWA differs from CRR SA by at least 1% | rel=1% |
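
Here the 1% figure is a minimum required difference, not a closeness tolerance. A minimal sketch of that comparison, taking the CRR total as the base; the function names are hypothetical:

```python
def relative_difference(base: float, other: float) -> float:
    """Absolute difference between other and base, as a fraction of base."""
    if base == 0:
        raise ValueError("base total must be non-zero")
    return abs(other - base) / abs(base)


def differs_materially(crr_total: float, b31_total: float, min_rel: float = 0.01) -> bool:
    """True when the B31 total differs from the CRR total by at least min_rel."""
    return relative_difference(crr_total, b31_total) >= min_rel
```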
STRESS-13: Large Scale 100K (4 tests, @pytest.mark.slow)
Purpose: Some bugs only manifest at true scale — hash collisions in joins, memory pressure causing silent truncation, or O(n²) operations. These are excluded from normal test runs.
| Test | Framework | Description | Bound |
|---|---|---|---|
| STRESS-13.1 | CRR SA | All input loans preserved at 300K+ exposure scale | count match |
| STRESS-13.2 | CRR SA | No NaN or Inf in RWA at 100K scale | zero violations |
| STRESS-13.3 | CRR SA | Pipeline peak memory stays under 4 GB | tracemalloc < 4,000 MB |
| STRESS-13.4 | B31 IRB | Output floor produces valid summary at 100K scale | summary non-null, values positive |
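
The memory bound in STRESS-13.3 is expressed in terms of `tracemalloc`. A peak measurement with the standard library can be sketched as below; the wrapper name `peak_memory_mb` is hypothetical, and the real test may be structured differently.

```python
import tracemalloc


def peak_memory_mb(func, *args, **kwargs):
    """Run func and return (result, peak traced memory in MiB)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak_bytes / (1024 * 1024)
```

Note that `tracemalloc` only sees allocations made through Python's memory allocators. Buffers allocated directly by native extensions may not be counted, so the figure is best read as a lower bound on process memory.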
STRESS-14: Exposure Reference Uniqueness (2 tests)
Purpose: Duplicate exposure_reference values cause double-counting in COREP aggregations and misleading exposure-level audit trails.
| Test | Framework | Description |
|---|---|---|
| STRESS-14.1 | CRR SA | All exposure_reference values unique in output |
| STRESS-14.2 | B31 IRB | All exposure_reference values unique in output |
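
A uniqueness check is more useful when it reports which references are duplicated, not only that the counts differ. A minimal sketch with a hypothetical helper:

```python
from collections import Counter


def duplicate_references(references):
    """Return the sorted exposure references that occur more than once."""
    counts = Counter(references)
    return sorted(ref for ref, n in counts.items() if n > 1)
```

An empty return value means every `exposure_reference` is unique.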
Acceptance Tests
| Group | Tests | Pass Rate |
|---|---|---|
| STRESS-1: Row Count Preservation | 8 | 100% |
| STRESS-2: Column Completeness | 4 | 100% |
| STRESS-3: Numerical Stability | 10 | 100% |
| STRESS-4: Risk Weight Bounds | 4 | 100% |
| STRESS-5: Approach Distribution | 5 | 100% |
| STRESS-6: Exposure Class Coverage | 4 | 100% |
| STRESS-7: Output Floor at Scale | 7 | 100% |
| STRESS-8: Error Accumulation | 4 | 100% |
| STRESS-9: Summary Consistency | 2 | 100% |
| STRESS-10: EAD Consistency | 4 | 100% |
| STRESS-11: Determinism | 1 | 100% |
| STRESS-12: Framework Comparison | 1 | 100% |
| STRESS-13: Large Scale 100K (slow) | 4 | 100% |
| STRESS-14: Reference Uniqueness | 2 | 100% |
| Total | 60 | 100% |
STRESS-13 Exclusion
The 4 large-scale tests (STRESS-13) are marked @pytest.mark.slow and excluded from
the standard test run (--benchmark-skip). Run explicitly with uv run pytest tests/acceptance/stress/ -m slow.