Metrics & Governance: KRIs, KPIs & Board Oversight

This is Part 3 of the Frontier AI Threat Defense series. Part 1 set the strategy and Part 2 set the tactics; this brief is the measurement and assurance layer — the indicators, the board view, the examiner map, and the source discipline that lets every claim in the series be checked. It is written for risk functions, internal audit, and the board.

Key Risk Indicators

Thresholded, owned, and board-visible

Key Performance Indicators

Each with a three-year target ramp

Examining authorities mapped

OCC, FRB, NYDFS, SEC/FINRA, OSFI, AMF

A·B·C

Claim-class discipline

Every load-bearing claim is source-classed

What a measurable, examinable AI security programme tracks.

3.1 Key risk indicators

KRI	Definition	Threshold (illustrative)	Owner	Anchor
KRI-1	% of AI use cases not inventoried	Amber >5%, Red >10%	CDO/CISO	NYDFS 500.13; OSFI E-23
KRI-2	# of high-severity vulnerabilities open >30 days in apps exposed to AI-class adversary	Amber >50, Red >150	CISO	OCC; FFIEC AIO
KRI-3	# of shadow-AI incidents per month	Amber >25, Red >75	CISO	NYDFS 500.14
KRI-4	% of privileged users on non-phishing-resistant MFA	Amber >2%, Red >5%	CISO/IAM	NYDFS 500.12 Nov 2025
KRI-5	# of production agents without a documented kill-switch test in last 90 days	Amber ≥1, Red ≥3	CISO/Eng	SR 11-7; OSFI E-23
KRI-6	# of detected prompt-injection attempts on production agents	Trend	CISO	NIST AI 600-1
KRI-7	Concentration index (HHI) of foundation-model spend	Amber HHI >4000, Red >6000	CRO/CIO	Treasury Dec 2024; FSOC 2024
KRI-8	Deepfake-related fraud loss ($ and #) per quarter	Trend	CRO/Fraud	Treasury Mar 2024; FinCEN FIN-2024-ALERT004
KRI-9	# of unsanctioned model artefacts loaded in last 30 days	Red ≥1	CISO	OSFI E-23; SR 11-7
KRI-10	Mean time to detect an AI-enabled incident	Amber >24h, Red >72h	SOC	NYDFS 500.17
KRI-11	Cross-border inference events for Canadian-resident customer data	Red ≥1	DPO/CDO	PIPEDA; Quebec Law 25; OSFI B-10
KRI-12	Agents in control functions (AML/KYC/fraud/surveillance) without an externally-governed escalation channel	Red ≥1	CRO/CISO	OSFI E-23 §monitoring; Gomez 2025

3.2 Key performance indicators

Year 1 target Year 3 target

AI use cases inventoried & risk-rated

80%

99%

Internal LLM traffic via AI Security Gateway

60%

100%

Agent actions classified into the taxonomy

70%

100%

ATLAS techniques covered by red-team emulation

40%

90%

Production models with SR 11-7 / E-23 validation

50%

100%

Selected KPIs — the three-year maturity ramp. The gap between the bars is the build programme.

KPI	Definition	Year 1 / 2 / 3
KPI-1	% of AI use cases inventoried and risk-rated	80 / 95 / 99
KPI-2	% of internal LLM traffic routed through the AI Security Gateway	60 / 95 / 100
KPI-3	% of agent actions classified into the action-class taxonomy	70 / 95 / 100
KPI-4	% of MITRE ATLAS techniques covered by red-team emulation	40 / 70 / 90
KPI-5	% of production models with completed SR 11-7 / E-23 validation	50 / 90 / 100
KPI-6	% of privileged users on FIDO2 / passkeys	80 / 100 / 100
KPI-7	Mean time to revoke a compromised agent (kill-switch RTO)	≤5 min / ≤2 min / ≤60 sec
KPI-8	% of model artefacts signed and registered	70 / 95 / 100
KPI-9	% of staff trained on AI-specific phishing/deepfake	90 / 98 / 99
KPI-10	% of TPSP contracts updated with AI clauses (NYDFS 500.11 / OSFI B-10)	60 / 100 / 100
KPI-11	% of high-severity discoveries from internal AI scanning closed within 30 days	60 / 80 / 90

3.3 Board dashboard composition (quarterly, one page)

Tile	Content
Threat heatmap	Threat classes (offensive AI against firm / attacks against firm AI / systemic) × risk appetite (Green/Amber/Red)
Top-5 KRIs	KRI-1, KRI-4, KRI-7, KRI-10, KRI-12 — current value, trend, threshold
Incidents	AI-enabled incidents this quarter, MTTD, MTTR, regulatory notifications filed (NYDFS, OSFI, SEC, FinCEN, FINTRAC)
Capability maturity	RAG status on KPI-1 through KPI-7
Vendor concentration	% spend by foundation-model vendor, HHI, second-source readiness
Regulatory horizon	Countdown to OSFI E-23 (1 May 2027); NYDFS Part 500 next annual certification (15 Apr); SEC/FINRA exam priorities; OCC heightened-standards refreshes

3.4 Audit and examiner focus areas

Internal Audit annual plan coverage: AI use-case inventory completeness against OSFI E-23 and NYDFS Part 500.13; AI Security Gateway coverage and effectiveness; agent kill-switch testing evidence; externally-governed escalation channels on control-function agents; MRM 2.0 independence and validation depth; TPSP AI clauses and AI-BOM ingestion under OCC 2013-29 / SR 23-4 / OSFI B-10; shadow-AI controls; deepfake-resistant identity rollout; peer-breach contagion playbook.

External examination focus by authority:

Authority	Focus
OCC	Heightened-standards three-lines and risk-appetite evidence; AI extensions
Federal Reserve	SR 11-7 inventory, validation, performance monitoring for GenAI / agentic models
NYDFS	500.4 CISO report incl. AI; 500.6 audit trail; 500.7 access incl. agent identity; 500.11 TPSP non-delegation; 500.13 asset inventory; 500.14 training; 500.17 notification (24h extortion payment, 72h incident); Oct 2024 AI cyber letter; Oct 2025 TPSP letter
SEC / FINRA	Reg S-P 2024 (30-day customer notification); Reg SCI; FINRA Notice 24-09 and successors; Rule 4511 / SEC 17a-4 books-and-records applied to agent action logs
OSFI	B-13 self-assessment; B-10 third-party-risk for AI/cloud vendors; E-23 readiness against 1 May 2027; cyber-incident reporting advisory (24h); CFRG coordination
AMF	Model Risk Management Guideline parallel-track compliance (in effect since June 2025)

3.5 Source-integrity register

Every load-bearing claim across the three parts of this series is classed here — (A) primary source, (B) reputable third-party reporting, (C) analyst inference.

Claim	Class	Source	Notes
Glasswing partner list, pricing, donations	A	anthropic.com/glasswing, fetched 16 May 2026	Primary source
Mythos Preview benchmarks (SWE-bench, CyberGym, Cybench, etc.)	A	Anthropic Project Glasswing system card; VentureBeat 7 Apr 2026; llm-stats.com analysis	All scores self-reported by Anthropic; SWE-bench Multimodal uses an internal implementation
Firefox 147 exploit benchmark (181 working exploits)	A	Anthropic system card per Kingy AI analysis	Internal Anthropic benchmark
OpenAI Daybreak general framing	A	OpenAI announcement 12 May 2026	—
Daybreak model identifier	(inconsistent)	Third-party coverage; primary source not explicit	Use generic “Daybreak frontier model + Codex Security harness” in board materials
GTG-2002 and GTG-1002 disclosures	A	Anthropic threat reports Aug 2025, 14 Nov 2025	”AI executed 80–90% of tradecraft” is Anthropic’s assessment of an operation Anthropic disrupted; cite as such
AISLE Jagged Frontier reproduction	A	aisle.com/blog, 8 Apr 2026	Primary research
Calif.io CVE-2026-4747 RCE	A	blog.calif.io, 29 Mar 2026	Primary research
Anthropic Frontier Red Team 500+ vulnerabilities	A	Carlini et al., 5 Feb 2026	Pre-Mythos, used Opus 4.6
ATLAS technique IDs	A	MITRE ATLAS canonical pages; v5.1.0 release notes Nov 2025	Verified
Lynch et al. 2025 agentic misalignment	A	arXiv 2510.05179	Iterative red-teaming acknowledged in the paper
Gomez 2025 escalation channel mitigation	A	arXiv 2510.05192	Wiser Human
Macklem / Champagne / Bessent / Powell statements	B	Globe and Mail 17 Apr 2026; CBC News 21 Apr 2026; Bloomberg via SC Media 11 Apr 2026	—
Bailey BBC statement	B	FinTech Magazine via BBC, ~Apr 2026	—
Budget allocation (Part 1, §1.9)	C	Analyst inference	Indicative working assumption for a Tier-1 NA bank; calibrate against firm baseline
”Vendor signatures lag deepfake fidelity by 6–9 months”	C	Analyst inference from incident reporting cadence	—
“Fewer than 1% of Mythos vulnerabilities patched at launch”	B	Picus Security interpretation, Apr 2026	Not a direct Anthropic statement

3.6 Caveats

Vendor obsolescence. The AI security tooling space is consolidating. Vendor names in Part 1 (§1.8) should be treated as illustrative of category, not endorsement; the build-vs-buy reasoning holds regardless of which specific vendor occupies the category 18 months hence.

Regulator personnel changes. Specific regulator officials named in earlier versions of this brief should be verified before quoting in any external-facing document; the regulatory framework itself is stable while official leadership rotates.

Vendor self-reporting. All Anthropic and OpenAI threat-intelligence reporting describes operations the labs themselves disrupted. The operational tradecraft templates (TTPs, tool use, kill-chain stages) are usable for red-team scenario design regardless of attribution debates. Numeric claims about AI share of tradecraft are vendor self-assessments and warrant qualified citation.

Steady-state assumption. The series assumes the 18-month build window addresses the acute capability arbitrage of 2026–2027 and that AI risk becomes routine alongside operational, cyber, and model risk thereafter. If frontier capability advances faster than assumed — a Mythos-successor model in 2026–2027 with a similar offence–defence delta — the steady-state plan compresses and the budget allocation in Part 1 (§1.9) shifts further toward kill-chain defence.

3.1 Key risk indicators

3.2 Key performance indicators

3.3 Board dashboard composition (quarterly, one page)

3.4 Audit and examiner focus areas

3.5 Source-integrity register

3.6 Caveats

Bring this rigor to your own AI controls.