Citation Tracking with watchfire
The project uses watchfire to attach machine-readable regulatory citations to engine functions so that "which articles does our code cover?" stops being a grep question and starts being a build artefact.
What @cites does
@cites is a decorator with no call-time cost. It parses a citation string once at import time, attaches the resulting Citation object to the function as __watchfire__, and returns the function unchanged. The actual analysis runs offline via the watchfire CLI, which AST-walks the project, builds a citation -> function matrix, and validates each citation against a bundled rulebook index.
from watchfire import cites
@cites("CRR Art. 153(1)")
def calculate_k(pd: float, lgd: float, correlation: float) -> float:
    ...
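To make the mechanics concrete, here is a minimal stand-in for a decorator that behaves this way. It is not watchfire's implementation: the name cites_sketch, the fields of this Citation class, and the regular expression are assumptions made for illustration, and the parser only understands the CRR form.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """Hypothetical parsed citation; the real watchfire class may differ."""
    instrument: str
    article: str
    sub: str
    raw: str


# Illustrative grammar for "CRR Art. N" plus optional "(p)(point)" groups.
_CRR_FORM = re.compile(r"CRR Art\. (\d+[a-z]?)((?:\([0-9a-z]+\))*)")


def cites_sketch(text: str):
    """Parse once at decoration (import) time, then tag the function."""
    match = _CRR_FORM.fullmatch(text)
    if match is None:
        raise ValueError(f"not a canonical CRR citation: {text!r}")
    citation = Citation("CRR", match.group(1), match.group(2), text)

    def decorate(func):
        func.__watchfire__ = citation  # metadata only
        return func  # same object, no wrapper, so calls cost nothing extra

    return decorate


def calculate_k(pd: float, lgd: float, correlation: float) -> float:
    return pd * lgd  # placeholder body, not the real formula


tagged = cites_sketch("CRR Art. 153(1)")(calculate_k)
print(tagged is calculate_k, tagged.__watchfire__.article)
```

Because the decorator returns the original function object rather than a wrapper, the only work happens when the module is imported.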
When to add @cites
When you implement or modify a function whose responsibility maps cleanly onto a specific regulatory article, add a @cites(...) decorator alongside the existing docstring reference. The docstring stays — it's the prose explanation; the decorator is the index entry.
Dual citations (CRR and PS1/26)
The CRR and Basel 3.1 frameworks are co-located in the same engine modules; regime-divergent behaviour is selected by a cited rulepack Feature (pack.feature(...)) resolved from the run's regime_id, not by a config boolean. When a function implements both a CRR article AND its Basel 3.1 / PS1/26 equivalent, stack the decorators — CRR (primary) outer, PS1/26 (secondary) inner:
@cites("CRR Art. 163")
@cites("PS1/26, paragraph 163")
def apply_pd_floor(self, config: CalculationConfig) -> pl.LazyFrame:
    ...
Today watchfire 0.3.1's __watchfire__ attribute holds only the outer decorator (CRR Art. 163 in the example), so watchfire matrix reports the CRR citation and ignores the inner one. Once upstream watchfire stacks citations into a tuple, both decorators will surface in the matrix with no source-code change here.
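The "outer decorator wins" behaviour follows from Python applying stacked decorators bottom-up, so the outer one writes the attribute last. The sketch below reproduces that with a stand-in decorator and contrasts it with one possible tuple-stacking design. Both cites_overwrite and cites_stacking are hypothetical names; neither is watchfire code, and plain strings stand in for Citation objects.

```python
def cites_overwrite(text: str):
    """Stand-in for the behaviour described above: the last writer wins."""
    def decorate(func):
        func.__watchfire__ = text
        return func
    return decorate


def cites_stacking(text: str):
    """Hypothetical future behaviour: accumulate citations in a tuple."""
    def decorate(func):
        existing = getattr(func, "__watchfire__", ())
        func.__watchfire__ = (text, *existing)  # outer decorator ends up first
        return func
    return decorate


@cites_overwrite("CRR Art. 163")
@cites_overwrite("PS1/26, paragraph 163")
def pd_floor_today():
    ...


@cites_stacking("CRR Art. 163")
@cites_stacking("PS1/26, paragraph 163")
def pd_floor_stacked():
    ...


print(pd_floor_today.__watchfire__)
print(pd_floor_stacked.__watchfire__)
```

With the overwriting variant only the CRR citation remains; with the stacking variant both are kept, primary first, without touching the decorated source.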
Canonical citation forms
| Instrument | Canonical form | Example |
|---|---|---|
| CRR article | CRR Art. N or CRR Art. N(p)(point) | CRR Art. 153(1); CRR Art. 124(2)(a) |
| PRA Policy Statement | PSn/yy or PSn/yy, paragraph N.N | PS1/26; PS1/26, paragraph 4.55 |
| PRA Supervisory Statement | SSn/yy[, paragraph N.N] | SS1/23, paragraph 2.5 |
| PRA Rulebook | PRA Rulebook, <part>[, N.N] | PRA Rulebook, Credit Risk, 3.2 |
| Delegated Regulation | Delegated Regulation <id>, Art. N | Delegated Regulation 2018/171, Art. 1 |
Notes:
- Article numbers may include a lowercase alphabetic suffix (e.g. 501a, 501b). watchfire 0.3.1 parses these and the bundled CRR index covers 501a. Articles introduced by Basel 3.1 with no CRR equivalent (123B, 110A) are still absent from the CRR index — for those, cite the PS1/26, paragraph N form at instrument level and document the sub-article in the docstring/comment.
- PS / SS paragraphs are numeric (e.g. paragraph 4.55); the B/A suffixes used in CRR-style amendment articles are not valid paragraph IDs.
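As a rough illustration of these forms, the patterns below accept the CRR, PS, and SS examples from the table and reject a parenthesised sub-article on a PS paragraph. They are an approximation written for this page, not watchfire's grammar, which remains the authority; instrument_of and CANONICAL_FORMS are hypothetical names.

```python
import re
from typing import Optional

# Illustrative patterns only; watchfire's own parser is authoritative.
CANONICAL_FORMS = {
    "CRR": re.compile(r"CRR Art\. \d+[a-z]?(\([0-9a-z]+\))*"),
    "PS": re.compile(r"PS\d+/\d{2}(, paragraph \d+(\.\d+)?)?"),
    "SS": re.compile(r"SS\d+/\d{2}(, paragraph \d+(\.\d+)?)?"),
}


def instrument_of(text: str) -> Optional[str]:
    """Return the instrument whose canonical form matches, else None."""
    for instrument, pattern in CANONICAL_FORMS.items():
        if pattern.fullmatch(text):
            return instrument
    return None


for candidate in [
    "CRR Art. 124(2)(a)",
    "CRR Art. 501a",
    "PS1/26, paragraph 4.55",
    "PS1/26, paragraph 153(1)",  # CRR-style sub-article on a PS paragraph
]:
    print(candidate, "->", instrument_of(candidate))
```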
Running the tooling
# Validate every @cites against the bundled rulebook index.
uv run watchfire check
# Produce a coverage matrix (citation -> functions).
uv run watchfire matrix --format markdown
uv run watchfire matrix --instrument CRR
uv run watchfire matrix --article 153
watchfire check is invoked automatically by uv run python scripts/arch_check.py as the final step of the architectural gate. Parse failures, unknown instruments, unknown articles, and version mismatches are fatal for every instrument, PS and PRA Rulebook citations included; only AST unresolved findings are reported as soft warnings (see Strict gate).
Configuration
The [tool.watchfire] table in pyproject.toml controls source paths, allowed instruments, and the rulebook version pin:
[tool.watchfire]
rulebook_version = "2026-05-15"
instruments = ["CRR", "PS", "PRA_RULEBOOK", "SS"]
source_paths = ["src/rwa_calc/engine"]
Bump rulebook_version when upstream watchfire ships an updated index. The @cites source surface is intentionally narrow: engine/ (rule application) only — UI, IO, contracts, and tests hold no regulatory rules and would just add noise to the matrix. Regulatory values no longer live in data/tables/ (that package was removed in Phase 5); they live in src/rwa_calc/rulebook/packs/ as Citation data, validated separately and deliberately not added to source_paths (see Pack-data citations below). The source_paths entry "src/rwa_calc/data/tables" still present in pyproject.toml is a stale dead path that should be removed.
Coverage matrix
The full citation -> function matrix is published at citation-matrix.md. It's generated by scripts/generate_citation_matrix.py from the live @cites(...) decorators. Each regulatory article is its own ### heading, with one click-to-expand collapsible per implementing function — the expand reveals the full function body (decorators included) pulled live from the source file via pymdownx.snippets line-range includes, so the matrix never drifts from the code.
CRR Articles 111-152 (the SA + IRB exposure-class chapter) are rendered densely: every article in that range is guaranteed to appear in the matrix either as one or more implementing-function collapsibles or as an italic "Out of scope — reason" block (e.g. UK-omitted articles, supervisory-permission articles, definition-only articles). The list of deliberately-not-cited articles and their reasons lives in CRR_COVERAGE_NOTES at the top of scripts/generate_citation_matrix.py. Articles outside the dense range render sparsely — only those with @cites decorators appear.
Regenerate after annotation changes:
The script invokes uv run watchfire matrix --format json once per indexed instrument (CRR, PS1/26), uses Python's ast module to resolve each citation site's full function range, and writes article-grouped Markdown with ??? quote collapsibles and --8<-- "path:start:end" snippet directives. Wire it into CI on push-to-master if you want the published site to track every commit. If a citation site can't be resolved to a function definition (renamed function, refactored file), the script falls back to a fixed line window and emits a "Generator warnings" section at the bottom of the page so the issue is visible rather than silent.
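The function-range step can be sketched with the standard library alone. The helper below is an illustration, not the code in scripts/generate_citation_matrix.py: snippet_directive and the file path are hypothetical, and no fallback line window is implemented. It relies on the fact that a function node's lineno points at the def line, so the decorators above it have to be included explicitly.

```python
import ast


def snippet_directive(source: str, path: str, func_name: str) -> str:
    """Build a pymdownx.snippets line-range include for one function."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        is_func = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_func and node.name == func_name:
            # Start at the first decorator so "@cites(...)" is shown too.
            start = min([node.lineno, *(d.lineno for d in node.decorator_list)])
            return f'--8<-- "{path}:{start}:{node.end_lineno}"'
    raise LookupError(f"no function named {func_name!r} in {path}")


# ast.parse never imports the module, so watchfire need not be installed.
SOURCE = '''from watchfire import cites

@cites("CRR Art. 153(1)")
def calculate_k(pd, lgd, correlation):
    return 0.0
'''

directive = snippet_directive(SOURCE, "src/rwa_calc/engine/example.py", "calculate_k")
print(directive)
```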
Regression test
tests/contracts/test_watchfire_coverage.py pins the inventory: each whitelisted function is asserted to carry the expected canonical citation tuple. Adding a new @cites(...) decorator means adding a row to that test's WHITELIST. Removing a decorator means removing the matching row. The test guards against accidental decorator deletion during refactors — the failure surfaces as a named parameterised assertion rather than a silent matrix shrink.
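The shape of such an inventory check is roughly as follows. This is a plain-Python sketch rather than the actual test: the tag decorator, the engine namespace, and the WHITELIST rows are invented for illustration, and citations are stored as tuples of strings, which may not match what watchfire attaches.

```python
from types import SimpleNamespace


def tag(*citations):
    """Stand-in decorator that stores a tuple of citation strings."""
    def decorate(func):
        func.__watchfire__ = citations
        return func
    return decorate


@tag("CRR Art. 153(1)")
def calculate_k():
    ...


def apply_pd_floor():  # decorator lost during a refactor
    ...


engine = SimpleNamespace(calculate_k=calculate_k, apply_pd_floor=apply_pd_floor)

# One row per decorated function: (function name, expected citations).
WHITELIST = [
    ("calculate_k", ("CRR Art. 153(1)",)),
    ("apply_pd_floor", ("CRR Art. 163",)),
]


def whitelist_failures(module, whitelist):
    """Return one named message per function whose citations drifted."""
    failures = []
    for name, expected in whitelist:
        actual = getattr(getattr(module, name, None), "__watchfire__", None)
        if actual != expected:
            failures.append(f"{name}: expected {expected}, found {actual}")
    return failures


failures = whitelist_failures(engine, WHITELIST)
print(failures)
```

In a pytest suite each WHITELIST row would typically become its own parameterised case, so a missing decorator fails under the function's name.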
Pack-data citations
@cites covers regulatory citations on engine functions. The Phase 5 rulepack adds a second citation surface: every entry in src/rwa_calc/rulebook/packs/{common,crr,b31}.py (ScalarParam, LookupTable, DecisionTable, Feature, …) carries a Citation as data, not a decorator. After the Phase 5 table-move the pack is the regulatory value-home, so these citations must be as well-formed and index-covered as the engine's decorators.
watchfire cannot ingest external (non-decorator) citations, so the project bridges to it:
- rwa_calc.rulebook.audit.pack_citation_index(reporting_date) resolves both regimes and maps each distinct str(citation) to the entry names citing it; it is the pack-data analogue of watchfire's article → function matrix.
- scripts/arch_check.py::check_pack_citations validates every pack citation with watchfire's own grammar/index (parse_citation + index.covers), using the same fatal/soft policy as the @cites check (check_watchfire_citations): parse failures, unknown instruments, and uncovered articles are fatal. watchfire is the oracle; the project owns the harness.
- tests/contracts/test_pack_citation_coverage.py runs the same validation in the pytest suite and pins a density ratchet (the distinct-citation count may grow but never shrink).
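The index and the ratchet are simple in outline. The sketch below is not the project's pack_citation_index: the entry names, the PINNED_DISTINCT value, and both helper functions are assumptions, and plain strings stand in for Citation objects.

```python
from collections import defaultdict


def citation_index(entries):
    """Map each distinct citation string to the sorted entry names citing it."""
    index = defaultdict(list)
    for name, citation in entries:
        index[str(citation)].append(name)
    return {key: sorted(names) for key, names in sorted(index.items())}


def ratchet_ok(index, pinned_distinct):
    """Density ratchet: the distinct-citation count may grow, never shrink."""
    return len(index) >= pinned_distinct


# Hypothetical pack entries as (entry name, citation) pairs.
ENTRIES = [
    ("pd_floor_corporate", "CRR Art. 163"),
    ("pd_floor_retail", "CRR Art. 163"),
    ("b31_pd_floor", "PS1/26, paragraph 163"),
]
PINNED_DISTINCT = 2  # hypothetical value pinned by the test

index = citation_index(ENTRIES)
print(index)
print(ratchet_ok(index, PINNED_DISTINCT))
```

Raising the pinned count whenever new citations land is what turns the check into a ratchet: dropping a pack citation later would then fail the test.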
Two deliberate non-changes: rulebook/packs/ is not added to [tool.watchfire].source_paths (the packs hold no @cites decorators — only Citation data), and there is no new [tool.watchfire] key (watchfire's config rejects unknown keys).
Pack citations must therefore follow the canonical forms above. In particular, PS1/26 entries cite at instrument level (PS1/26, paragraph 153) with any sub-article detail in the Citation.note — the parenthesised CRR-style sub-article form (paragraph 153(1)) is not a valid watchfire paragraph id and will fail the gate.
A small set of articles is legitimately outside watchfire's bundled credit-risk index and is recorded as a documented soft-warn in PACK_CITATION_SOFT_ALLOWLIST (e.g. CRR Art. 128, omitted from UK CRR by SI 2021/1078; CRR Art. 274, SA-CCR). New soft-allowlist entries require explicit regulatory justification.
Strict gate
scripts/arch_check.py invokes watchfire.checks.run_check as the final gate step. Parse failures, unknown instruments, unknown articles (any instrument), and version mismatches are fatal; only AST unresolved findings degrade to soft warnings. There is no soft-warning escape hatch for PS / PRA Rulebook citations — the 0.3.0 index covers PS1/26 with 4,498 rows, so any unresolved PS citation indicates a real typo rather than upstream sparsity.
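The fatal/soft split described above amounts to a small classification step. The sketch below does not call watchfire.checks.run_check, whose return shape is not shown here: the finding dictionaries and the kind labels are hypothetical, as is the choice to fail closed on unrecognised kinds, and the sample citation with paragraph 9999 is made up.

```python
# Hypothetical label for the one category that degrades to a warning.
SOFT_KINDS = frozenset({"ast_unresolved"})


def split_findings(findings):
    """Return (fatal, soft); anything not explicitly soft is treated as fatal."""
    fatal = [f for f in findings if f["kind"] not in SOFT_KINDS]
    soft = [f for f in findings if f["kind"] in SOFT_KINDS]
    return fatal, soft


def exit_code(findings):
    """Non-zero when at least one fatal finding is present."""
    fatal, _ = split_findings(findings)
    return 1 if fatal else 0


FINDINGS = [
    {"kind": "unknown_article", "citation": "PS1/26, paragraph 9999"},
    {"kind": "ast_unresolved", "citation": "CRR Art. 153(1)"},
]

fatal, soft = split_findings(FINDINGS)
print(len(fatal), len(soft), exit_code(FINDINGS))
```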