CRM, MOFs, and Other Edge-Case Archaeology
Most regulatory bugs do not look like bugs. They look like reasonable code computing the right number on the wrong unit. Four war stories from the changelog.
Published 2026-07-07. Code references are pinned to commit cceaee4.
This is post 6 in the series on building this UK Basel 3.1 RWA calculator. Posts 1–5 covered why the calculator exists, how its pipeline is shaped, what the Standardised Approach actually does, how the codebase gets written, and what the output floor does. This post is a tour through the bits of the implementation that took disproportionate effort: regulatory rules where the engine produced reasonable-looking numbers and was, on careful reading, completely wrong.
The changelog has on the order of sixty closed items at the time of writing. Most are one or two lines: a missing field, a wrong constant, a misclassified entity type. A smaller set — maybe a dozen — took weeks of work each, with cross-cutting changes spanning the hierarchy resolver, classifier, CRM processor, and calculators. The pattern across them is consistent: someone reads the regulatory text, notices that the implementation is operating on the wrong unit (per-counterparty when the rule says group-of-connected-clients, per-row when the rule says pool-aware, on the guarantor's approach when the rule says the protection slice's approach), and then has to push a structural change through every stage that touched the misread quantity. Each of the four stories below is one of those.
This is the post that makes the regulatory case for the architectural choices in post 2. Frozen bundles, error accumulation, and the data/engine split do not exist because they look pretty. They exist because the wrong-unit bugs that follow only become visible — and only become fixable — when the data flow between stages is explicit and traceable.
War story 1: Multiple Option Facilities and Facility Shares
A Multiple Option Facility (MOF) is a single committed credit line under which the borrower can draw funds under any of several sub-facilities — a term loan, a revolving credit, a letter of credit, a swingline. Each sub-facility carries its own off-balance commitment treatment (its own SA Credit Conversion Factor under PS1/26 Art. 111 Table A1, or CRR Art. 166). The bank holds a single legal commitment of, say, £50m, but the borrower's drawdown profile is at the borrower's discretion.
Until version 0.2.3, the calculator's hierarchy resolver was generating a synthetic facility_undrawn exposure row for each parent facility using the parent's own risk_type — which for an MOF is typically the legally-classified parent type (often LR — line of credit revolver — at 0% CCF). The descendants, which were the actual commitment products the borrower could draw against, were ignored when allocating undrawn EAD. A £50m MOF with a 0%-CCF parent and a £30m letter-of-credit sub-facility at, say, 50% CCF would generate £0 of undrawn EAD on the synthetic row — a £15m commitment off the books.
The regulatory point is straightforward once stated: an MOF exposes the bank to the worst-case off-balance commitment among its sub-facilities, because the borrower will draw under whichever sub-facility most suits them. The undrawn EAD on the parent must reflect the highest-CCF descendant, not the legally-classified parent type. A bank that books £50m of LR commitment and ignores its higher-CCF children is understating capital under any reasonable read of CRR Art. 111 / PS1/26 Art. 111 Table A1.
The fix had two parts. First, any facility with at least one child_type='facility' row in facility_mappings is now treated as an MOF: a new private method _derive_mof_risk_type() walks the descendants at any depth, computes the SA CCF for each descendant's risk_type using the same frame-aware expression as the rest of the calculator (CRR or PS1/26 Table A1), and overrides the parent's undrawn risk_type to the descendant whose CCF is highest. Tie-break is alphabetical lowercase risk_type then alphabetical descendant facility_reference for full reproducibility.
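A minimal sketch of that selection rule, assuming the descendants have already been flattened to a list. The `Descendant` type, the function name `derive_mof_risk_type` (without the leading underscore), the `MR` and `FR` codes, and the CCF values are hypothetical stand-ins. The real method works on the hierarchy frames and the framework-aware CCF expression.

```python
from dataclasses import dataclass

# Hypothetical risk_type codes and CCFs, for illustration only. The real
# values depend on the active framework (CRR or PS1/26 Table A1).
ILLUSTRATIVE_SA_CCF = {"lr": 0.0, "mr": 0.5, "fr": 1.0}


@dataclass(frozen=True)
class Descendant:
    facility_reference: str
    risk_type: str


def derive_mof_risk_type(parent_risk_type, descendants, ccf_table=ILLUSTRATIVE_SA_CCF):
    """Return (risk_type, source_facility_reference) for the undrawn row."""
    if not descendants:
        # No child facilities: not an MOF, keep the parent's own risk_type.
        return parent_risk_type, None
    best = min(
        descendants,
        key=lambda d: (
            -ccf_table[d.risk_type.lower()],  # highest CCF first
            d.risk_type.lower(),              # then alphabetical risk_type
            d.facility_reference,             # then alphabetical reference
        ),
    )
    return best.risk_type, best.facility_reference


children = [
    Descendant("SUB_TERM", "LR"),
    Descendant("SUB_RCF", "MR"),
    Descendant("SUB_LC", "MR"),
]
mof_risk_type, mof_source = derive_mof_risk_type("LR", children)
plain_risk_type, plain_source = derive_mof_risk_type("LR", [])
```

Sorting on a composite key keeps the tie-break in one place: two descendants with the same CCF and the same risk_type are separated by facility_reference alone, so repeated runs pick the same row.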
Second — and this was the part that took the rest of the effort — Facility Shares. When the descendants under a facility reference more than one distinct counterparty_reference, the undrawn allocation can no longer go to the facility's own counterparty_reference. A new _derive_facility_share_counterparty() method collects the union of distinct counterparties under the descendant set, looks each one up against the resolved counterparty entity_type and cqs, and computes a preview SA risk weight for each candidate. The candidate with the highest preview RW wins. The chosen counterparty still flows through the full classifier and SA/IRB pipeline downstream, so the preview is non-binding — its only job is to pick the worst-case member at allocation time.
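The worst-case pick can be sketched as follows. The preview table, the `counterparties` mapping of reference to (entity_type, cqs), and the function name are assumptions made for illustration. The alphabetical tie-break is also an assumption, since the text above does not say how equal preview weights are separated.

```python
# Illustrative preview risk weights keyed by (entity_type, cqs).
# None stands for an unrated counterparty. Values are placeholders.
ILLUSTRATIVE_PREVIEW_RW = {
    ("corporate", 1): 0.20,
    ("corporate", None): 1.00,
    ("institution", 2): 0.50,
}


def pick_facility_share_counterparty(own_counterparty, member_counterparties, counterparties):
    """Choose the counterparty that receives the undrawn allocation."""
    members = sorted(set(member_counterparties))
    if len(members) <= 1:
        # One or fewer distinct members: not a share, keep the facility's own.
        return own_counterparty

    def preview_rw(reference):
        entity_type, cqs = counterparties[reference]
        return ILLUSTRATIVE_PREVIEW_RW[(entity_type, cqs)]

    # max() keeps the first maximum it sees, so equal weights fall to the
    # alphabetically first reference (assumed tie-break, not from the post).
    return max(members, key=preview_rw)


counterparties = {
    "CP_A": ("corporate", 1),
    "CP_B": ("corporate", None),
    "CP_C": ("institution", 2),
}
chosen = pick_facility_share_counterparty("CP_A", ["CP_A", "CP_B", "CP_C", "CP_B"], counterparties)
unchanged = pick_facility_share_counterparty("CP_A", ["CP_A", "CP_A"], counterparties)
```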
Both overrides are no-ops when their inputs are trivial. A facility with no child_type='facility' rows is not an MOF; a facility with one or fewer distinct member counterparties is not a share. Two new audit columns, original_counterparty_reference and mof_risk_type_source, flow on the facility_undrawn rows for traceability, so an auditor can ask the calculator "which descendant's risk type drove this synthetic row's CCF?" and get a deterministic answer. Six new unit tests pin the behaviour, including a combined scenario where MOF and Facility Share both apply on the same parent.
What this fix demonstrates: the data model already knew about parent-child facility relationships from the original hierarchy resolver. The regulatory rule was not visible to the implementation because no one had asked, at allocation time, whether the parent was the legal commitment type or the binding commitment type. Both questions have the same name on the schema. They are, regulatorily, different questions.
War story 2: AIRB own-LGD and the anti-double-counting rule
Under the Advanced IRB approach, the bank's own modelled LGD already reflects the credit-risk-mitigating effect of any collateral incorporated into the model. CRR Art. 181 and PS1/26 Art. 169A are explicit on this point: a firm using its own modelled LGD must not additionally allocate the same collateral to non-AIRB exposures of the same counterparty, because the AIRB LGD has already absorbed that collateral's recovery effect. Doing so would count the same protection twice.
The original calculator's CRM allocator did not honour this. Collateral pledged at the counterparty level (for example, a parent guarantee or a master collateral pool) was allocated pro-rata across all of that counterparty's exposures, regardless of approach. A counterparty with both A-IRB exposures (where the modelled LGD already reflected the collateral) and SA or F-IRB exposures (where it did not) would have the same collateral count twice: once inside the AIRB LGD, and once again as a pro-rata allocation reducing EAD on the non-AIRB rows.
The fix in version 0.2.0 was a structural change to the CRM allocator. A new optional Boolean column is_airb_model_collateral was added to COLLATERAL_SCHEMA, defaulting to False. The allocator (_apply_collateral_unified in engine/crm/collateral.py) became pool-aware: every exposure is partitioned at the start of apply_collateral into an AIRB pool (rows where the modelled LGD is preserved by CRM — approach == AIRB AND not falling back to the supervisory formula under Foundation election or Art. 169B insufficient-data) and a non-AIRB pool (FIRB / SA / Slotting and AIRB rows that use the formula). The collateral processor's exposure-aggregate lookups (_ead_facility_airb / _ead_facility_non_airb, _ead_cp_airb / _ead_cp_non_airb) split each EAD denominator into pool-specific variants, and the group-by aggregation splits each metric into _n (unflagged collateral) and _a (flagged) using a filter on is_airb_model_collateral. The pro-rata weights _fw_n / _fw_a / _cw_n / _cw_a bake in a pool-match gate so that flagged collateral routes only to AIRB-pool rows and unflagged collateral routes only to non-AIRB rows.
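The pool-match gate is easier to see on a toy example. The sketch below is plain Python rather than the engine's LazyFrame expressions, covers only counterparty-level collateral, and ignores haircuts and caps at EAD. The dictionary keys `in_airb_pool`, `ead`, and `value` and the function name are assumptions; only is_airb_model_collateral and its False default come from the text.

```python
def allocate_counterparty_collateral(exposures, collateral):
    """Pro-rata allocation where each collateral pot only reaches its own pool.

    exposures:  dicts with "id", "ead", "in_airb_pool"
    collateral: dicts with "value" and optional "is_airb_model_collateral"
    """
    # Pool-specific EAD denominators (True = AIRB pool, False = non-AIRB pool).
    pool_ead = {True: 0.0, False: 0.0}
    for exposure in exposures:
        pool_ead[exposure["in_airb_pool"]] += exposure["ead"]

    # Flagged collateral feeds the AIRB pot, unflagged the non-AIRB pot.
    # A missing flag is treated as False, matching the column default.
    pot = {True: 0.0, False: 0.0}
    for item in collateral:
        pot[bool(item.get("is_airb_model_collateral", False))] += item["value"]

    allocation = {}
    for exposure in exposures:
        pool = exposure["in_airb_pool"]
        weight = exposure["ead"] / pool_ead[pool] if pool_ead[pool] > 0 else 0.0
        allocation[exposure["id"]] = weight * pot[pool]
    return allocation


exposures = [
    {"id": "A1", "ead": 100.0, "in_airb_pool": True},
    {"id": "S1", "ead": 300.0, "in_airb_pool": False},
    {"id": "S2", "ead": 100.0, "in_airb_pool": False},
]
collateral = [
    {"value": 80.0},                                    # unflagged
    {"value": 50.0, "is_airb_model_collateral": True},  # flagged
]
allocation = allocate_counterparty_collateral(exposures, collateral)
```

In this example the unflagged 80 is split 60 / 20 across the two non-AIRB rows by their EAD share of the non-AIRB pool, and the flagged 50 lands entirely on the single AIRB row.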
The behavioural change: flagged collateral that the firm asserts has been used to build its A-IRB LGD model is allocated only to AIRB exposures (where it has no further effect, because the modelled LGD already reflects it). Unflagged collateral — collateral that has not been used in the LGD model — is allocated only to non-AIRB exposures, where it correctly reduces EAD. A new error code CRM006 (ERROR_AIRB_MODEL_COLLATERAL_MISDIRECTED) is emitted as a data-quality warning when direct (exposure-pledged) flagged collateral is misdirected onto a non-AIRB exposure; the misdirected row is given zero allocation rather than silently double-counting.
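A rough sketch of the misdirection check for directly pledged collateral, written as a function that accumulates warnings instead of raising. The `CalcWarning` type, the pledge dictionary keys, and the pass-through treatment of every other pledge are assumptions. Only the CRM006 code and the zero allocation on a misdirected row come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CalcWarning:
    code: str
    message: str


def allocate_direct_collateral(pledges, airb_pool_ids):
    """Return ({collateral_id: allocated_value}, [warnings])."""
    allocations = {}
    warnings = []
    for pledge in pledges:
        flagged = pledge.get("is_airb_model_collateral", False)
        misdirected = flagged and pledge["exposure_id"] not in airb_pool_ids
        if misdirected:
            # Record the problem and allocate nothing, so the collateral
            # cannot reduce EAD on a row outside the AIRB pool.
            warnings.append(
                CalcWarning(
                    code="CRM006",
                    message=(
                        f"{pledge['collateral_id']} is flagged as AIRB model "
                        f"collateral but pledged to {pledge['exposure_id']}"
                    ),
                )
            )
            allocations[pledge["collateral_id"]] = 0.0
        else:
            # Assumed pass-through for all other direct pledges.
            allocations[pledge["collateral_id"]] = pledge["value"]
    return allocations, warnings


pledges = [
    {"collateral_id": "COL_1", "exposure_id": "EXP_SA", "value": 100.0,
     "is_airb_model_collateral": True},
    {"collateral_id": "COL_2", "exposure_id": "EXP_AIRB", "value": 40.0,
     "is_airb_model_collateral": True},
    {"collateral_id": "COL_3", "exposure_id": "EXP_SA", "value": 25.0},
]
direct_allocations, direct_warnings = allocate_direct_collateral(pledges, {"EXP_AIRB"})
```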
What makes this fix interesting is that it is visibly anti-conservative for some firms and visibly pro-conservative for others. A bank running a homogeneous A-IRB book gets exactly the same numbers it had before. A bank running a mixed A-IRB / non-AIRB book under the old allocator was overstating capital on its non-AIRB exposures (counterparty-level collateral was being wastefully allocated to AIRB rows where it did nothing, leaving the non-AIRB rows undercollateralised in the pro-rata calculation). Under the new allocator, those non-AIRB rows pick up a larger share of the unflagged counterparty-level collateral and produce lower RWA. Whether the net effect across the book is pro- or anti-conservative depends entirely on the firm's mix.
The new column's default is False. Fixtures and test schemas that don't carry the column are backfilled to legacy behaviour: AIRB total = 0, non-AIRB total = full total. No acceptance goldens shifted on the merge because no existing scenario combined mixed A-IRB / non-AIRB exposures with counterparty- or facility-level collateral. The contract tests that did shift were the new ones, written before the implementation.
War story 3: The SME supporting factor and the wrong unit of "obligor"
The SME supporting factor under CRR Art. 501 reduces the RWA of qualifying SME exposures by 23.81% (the SME factor of 0.7619) below the EUR 1.5m exposure threshold and by 15% (the factor of 0.85) above it. The threshold — what Art. 501 calls E* — is in principle straightforward: total exposure to the obligor up to GBP 1.5m gets the deeper discount, anything above gets the shallower.
The catch is the definition of "obligor". Art. 4(1)(39) of CRR (and PS1/26's equivalent) defines a "group of connected clients" as the unit on which credit risk concentration is measured, and Art. 501 reads E* against this unit, not against the individual loan. A counterparty that belongs to a corporate group with five other connected obligors must aggregate its exposures with the group before the E* threshold is applied.
Until version 0.2.0, the calculator was evaluating E* per-counterparty, ignoring connected-client aggregation. A counterparty with three GBP 600,000 loans and a parent group whose total connected-client exposure was GBP 8m would still receive the deeper SME discount on each individual loan, because the per-row check passed (each row's exposure was below GBP 1.5m). The fix replaced the per-row test with an aggregation across the full connected-client group before threshold evaluation — which the hierarchy resolver was already producing as lending_group_totals for the retail Art. 123(c) granularity check covered in post 3.
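To show how much the choice of unit matters, here is a small sketch of the tiered factor evaluated on a counterparty's own exposure and then on its connected-client total. The threshold is the figure quoted in this post and is passed as a parameter. The function names, the `group_of` mapping, and the reference data are hypothetical. lending_group_totals appears here as a plain dict rather than the LazyFrame on the hierarchy bundle.

```python
LOWER_FACTOR = 0.7619   # applies to the slice of E* up to the threshold
UPPER_FACTOR = 0.85     # applies to the slice of E* above the threshold
THRESHOLD = 1_500_000   # threshold figure as quoted in this post


def blended_sme_factor(e_star, threshold=THRESHOLD):
    """Exposure-weighted blend of the two tiers for a given E*."""
    if e_star <= 0:
        return LOWER_FACTOR  # nothing sits above the threshold
    lower_slice = min(e_star, threshold)
    upper_slice = max(e_star - threshold, 0)
    return (lower_slice * LOWER_FACTOR + upper_slice * UPPER_FACTOR) / e_star


def e_star_for(counterparty, own_exposure, group_of, lending_group_totals):
    """E* is the connected-client total when the counterparty has a group."""
    group = group_of.get(counterparty)
    if group is None:
        return own_exposure[counterparty]
    return lending_group_totals[group]


own_exposure = {"CP_1": 1_000_000}
group_of = {"CP_1": "GRP_X"}
lending_group_totals = {"GRP_X": 8_000_000}

# Wrong unit: the counterparty's own exposure.
per_counterparty_factor = blended_sme_factor(own_exposure["CP_1"])

# Right unit: the connected-client aggregate.
group_factor = blended_sme_factor(
    e_star_for("CP_1", own_exposure, group_of, lending_group_totals)
)
```

With these inputs the own-exposure view stays entirely in the lower tier (factor 0.7619), while the group view blends to (1.5m × 0.7619 + 6.5m × 0.85) / 8m, roughly 0.8335. Both numbers look plausible in isolation, which is why this kind of bug survives.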
The architectural point is the small one: two different regulatory rules — the SME supporting factor under Art. 501, and the retail granularity threshold under Art. 123(c) / Art. 123A — both depend on the same primitive (group-of-connected-clients aggregate exposure). Once the hierarchy resolver was producing that primitive correctly for retail, applying it to SME was a single expression change. Before the hierarchy resolver was producing it correctly, both rules were broken in subtly different ways.
The regulatory point is the larger one: every Basel rule that contains the words "group of connected clients" is load-bearing on the firm's mapping data. A firm with incomplete or stale connected-client linkages will produce plausible-looking RWA numbers that are quietly wrong on every group-aggregated rule simultaneously: SME supporting factor, retail granularity, large-exposure limits, and the connected-client portion of expected loss. This is the kind of error the architecture cannot detect; only the data quality team can. The calculator surfaces the dependency through the explicit lending_group_totals LazyFrame on the hierarchy bundle. Whether the totals are correctly populated is upstream.
War story 4: Cross-approach CCF substitution
A guarantee changes the obligor against whom the bank's exposure is measured for capital purposes. Under both CRR and PS1/26, a guarantee from a more-creditworthy guarantor lets the bank substitute the guarantor's risk weight (or, under IRB, the guarantor's PD and LGD) on the protected slice of the exposure. This is the well-known "substitution" effect.
The less-well-known effect is on the credit conversion factor for off-balance-sheet portions. Under CRR Art. 161(3) and the PS1/26 equivalent, when a guarantee is provided to an IRB-modelled exposure by a guarantor who would themselves be on the SA, the guaranteed portion of the off-balance commitment must use the SA CCF for the protection slice — not the IRB CCF on the unguaranteed slice. The CCF used to compute EAD therefore changes mid-exposure, depending on which slice you are looking at. The original exposure's CCF survives on the unguaranteed portion; the guaranteed portion is recomputed at the SA CCF.
The implementation is split across several files. The reusable expression sa_ccf_expression() lives in engine/ccf.py and produces the framework-aware SA CCF (CRR Art. 111 / PS1/26 Art. 111 Table A1). The _apply_cross_approach_ccf() method in the CRM processor handles the split EAD recalculation, computing four columns — ccf_original, ccf_guaranteed, ccf_unguaranteed, guarantee_ratio — that the EAD initialiser then reads. The split EAD is ead_unguaranteed + ead_guaranteed, where each leg uses its appropriate CCF.
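As a worked illustration of the two-leg EAD, the sketch below uses the four column names from the text as plain variables. It assumes the guarantee ratio applies uniformly to the drawn and undrawn amounts, which the post does not state, and the CCF values in the example are placeholders rather than framework constants.

```python
def split_ead(drawn, undrawn, guarantee_ratio, ccf_original, sa_ccf):
    """Return (ead_unguaranteed, ead_guaranteed, total_ead)."""
    ccf_unguaranteed = ccf_original  # the original CCF survives on this leg
    ccf_guaranteed = sa_ccf          # the protection slice is recomputed at the SA CCF

    ead_unguaranteed = (1.0 - guarantee_ratio) * (drawn + undrawn * ccf_unguaranteed)
    ead_guaranteed = guarantee_ratio * (drawn + undrawn * ccf_guaranteed)
    return ead_unguaranteed, ead_guaranteed, ead_unguaranteed + ead_guaranteed


# Fully undrawn commitment of 100, half of it guaranteed by an SA guarantor.
ead_unguaranteed, ead_guaranteed, ead_total = split_ead(
    drawn=0.0,
    undrawn=100.0,
    guarantee_ratio=0.5,
    ccf_original=0.75,  # placeholder original CCF
    sa_ccf=0.40,        # placeholder SA CCF
)
```

Here the unguaranteed leg is 0.5 × 100 × 0.75 = 37.5 and the guaranteed leg is 0.5 × 100 × 0.40 = 20, so total EAD is 57.5 rather than the 75 a single-CCF calculation would give.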
Determining the guarantor's approach is the part that turned out to need careful regulatory reasoning. Under version 0.1.67's fix, guarantor_approach is set by three inputs together: the firm's IRB permissions for the guarantor's exposure class, whether the guarantor has an internal PD on file, and which framework is active. IRB applies only if the firm has IRB permission for the guarantor's class and the guarantor's internal_pd is non-null. SA applies otherwise — which includes the case of a guarantor with only an external rating (CQS, no PD), even when the firm has F-IRB permission for that class. The classic example is a sovereign guarantee: a sovereign guarantor GOV_01 with external rating CQS=1 and no internal PD lands on SA even when the firm holds F-IRB permission for sovereigns, because the substitution requires a PD to drive the IRB formula and the sovereign does not have one.
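The decision rule in that paragraph reduces to a two-condition check. This is a simplified sketch: the function signature and the set of permitted classes are assumptions, and the active framework is left out because the rule as stated turns only on permission and PD.

```python
def guarantor_approach(exposure_class, internal_pd, irb_permitted_classes):
    """IRB only when the firm has permission for the class AND a PD exists."""
    if exposure_class in irb_permitted_classes and internal_pd is not None:
        return "IRB"
    return "SA"


permissions = {"sovereign", "corporate"}

# GOV_01: external rating only (CQS 1), no internal PD on file.
gov_01_approach = guarantor_approach("sovereign", None, permissions)

# Same class, but with an internal PD available.
rated_sovereign_approach = guarantor_approach("sovereign", 0.0003, permissions)

# Internal PD exists, but the firm has no IRB permission for the class.
unpermitted_approach = guarantor_approach("institution", 0.002, permissions)
```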
The earlier version of this code routed all guarantor substitution through one path keyed only on the guarantor's properties: if the guarantor had an internal PD it went IRB, if it had only a CQS it went SA. The fix made the routing beneficiary-aware: an SA exposure beneficiary always uses the guarantor's external CQS (because the SA exposure is in CQS-space anyway); an IRB exposure beneficiary uses the guarantor's internal PD when one exists, falling back to CQS otherwise. The F-IRB supervisory LGD used in PD substitution and EL blending now tracks the active framework as well — 0.45 under CRR, 0.40 under Basel 3.1 for senior unsecured corporate — instead of being hard-coded.
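The beneficiary-aware routing can be sketched as a small function plus a framework-keyed LGD lookup. The framework keys, the return shape, and the names are hypothetical. The 0.45 and 0.40 values are the senior unsecured corporate figures quoted above.

```python
# F-IRB supervisory LGD for senior unsecured corporate, keyed by framework.
FIRB_SENIOR_UNSECURED_LGD = {"CRR": 0.45, "BASEL_3_1": 0.40}


def substitution_basis(beneficiary_approach, guarantor_internal_pd, guarantor_cqs):
    """Return ("CQS", value) or ("PD", value) for the protected slice."""
    if beneficiary_approach == "SA":
        # An SA beneficiary always substitutes on the external rating.
        return ("CQS", guarantor_cqs)
    if guarantor_internal_pd is not None:
        # An IRB beneficiary prefers the guarantor's internal PD.
        return ("PD", guarantor_internal_pd)
    # IRB beneficiary, guarantor without a PD: fall back to CQS.
    return ("CQS", guarantor_cqs)


sa_beneficiary = substitution_basis("SA", 0.001, 2)
irb_with_pd = substitution_basis("IRB", 0.001, 2)
irb_without_pd = substitution_basis("IRB", None, 1)
lgd_crr = FIRB_SENIOR_UNSECURED_LGD["CRR"]
lgd_b31 = FIRB_SENIOR_UNSECURED_LGD["BASEL_3_1"]
```

Keying the first branch on the beneficiary rather than the guarantor is the whole change: the same guarantor can contribute a CQS to one exposure and a PD to another.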
What this story demonstrates: cross-approach interactions are where most regulatory implementations have undocumented bugs. The substitution rule is the kind of thing that gets a one-line treatment in a Basel summary deck ("guarantees substitute the guarantor's risk weight") and a six-month implementation path when you actually try to honour it for a portfolio that mixes IRB exposures with SA-rated guarantors. The architecture from post 2 (frozen bundles, an explicit CounterpartyLookup for guarantor lookups, the crm_audit LazyFrame surfacing every substitution decision) is what made this findable as a bug rather than a vague suspicion that the numbers looked off.
What the four stories have in common
Each of these started with someone reading regulatory text carefully and noticing that the implementation, which produced reasonable-looking numbers, was operating on the wrong unit. MOF: the wrong unit was the parent's own risk_type instead of the descendants'. AIRB collateral: the wrong unit was per-row collateral allocation instead of pool-aware allocation. SME supporting factor: the wrong unit was per-counterparty instead of group-of-connected-clients. Cross-approach CCF: the wrong unit was the guarantor's properties instead of the protection slice's framework.
In all four cases, the data the regulation cared about was already present in the bundles. The hierarchy resolver was already producing parent-child facility relationships, lending-group totals, and the counterparty lookup that exposed the guarantor's internal_pd. The classifier was already partitioning exposures by approach. The CRM processor was already computing per-exposure CRM allocations. What was missing was the connection between those data elements and the regulatory rule that demanded the connection be honoured. Each fix is, fundamentally, the addition of one more correctly-typed dependency between two stages that were already producing the right primitives.
The architecture from post 2 makes those dependencies findable. Frozen bundles let you trace exactly which stage produced each column the rule depends on. The data/engine split forces every regulatory scalar (the SME factor, the AIRB pool gate, the framework-specific SA CCF) into named tables that an auditor can read in one place. Error accumulation surfaces violations as data-quality warnings rather than swallowing them in exceptions. The agent workflow from post 4 makes the fixes implementable in days rather than weeks, because the four-stage pipeline forces a regulatory citation and a hand-derived expected output before any engine code is touched.
What stays human, as before, is the part that still doesn't automate: reading the regulation carefully enough to notice the wrong unit in the first place. The fixes above were not generated by an agent that had a sudden insight about Art. 4(1)(39) connected-client aggregation. They were generated by an agent that had been told, by me, that the SME supporting factor was producing the wrong numbers on a particular acceptance fixture, and that the regulatory text said something specific about groups of connected clients, and could it please come up with a hand-derived expected output and a failing test. The agent then did a creditable job. The notice-the-bug step is the bottleneck.
Post 7 turns from regulatory archaeology to testing strategy: 5,300 tests, hand-derived golden files, hash-locked oracles, and the contracts that prevent the data layer from leaking into the engine.
Read next: Testing a Regulatory Engine: 5,300 Tests, Hand-Derived Goldens (in progress).
Further reading:
- Specifications: Credit Risk Mitigation (CRR) — full CRR CRM treatment.
- Specifications: Credit Risk Mitigation (Basel 3.1) — including the Art. 191A(2)(d) anti-double-counting rule.
- Changelog — the full set of closed items, including the four covered here at versions 0.2.3, 0.2.0, and 0.1.67.