Skip to content
All white papers
White paper Frontier AI Threat Defense — Part 3 of 3

Metrics & Governance: KRIs, KPIs & Board Oversight

The measurement and assurance layer — key risk and performance indicators, the quarterly board dashboard, examiner focus areas, and the source-integrity discipline behind the series.

16 May 2026 Toronto v6 8 min read
Risk metrics Governance Audit & assurance
On this page

This is Part 3 of the Frontier AI Threat Defense series. Part 1 set the strategy and Part 2 set the tactics; this brief is the measurement and assurance layer — the indicators, the board view, the examiner map, and the source discipline that lets every claim in the series be checked. It is written for risk functions, internal audit, and the board.

12
Key Risk Indicators
Thresholded, owned, and board-visible
11
Key Performance Indicators
Each with a three-year target ramp
6
Examining authorities mapped
OCC, FRB, NYDFS, SEC/FINRA, OSFI, AMF
A·B·C
Claim-class discipline
Every load-bearing claim is source-classed
What a measurable, examinable AI security programme tracks.

3.1 Key risk indicators

KRIDefinitionThreshold (illustrative)OwnerAnchor
KRI-1% of AI use cases not inventoriedAmber >5%, Red >10%CDO/CISONYDFS 500.13; OSFI E-23
KRI-2# of high-severity vulnerabilities open >30 days in apps exposed to AI-class adversaryAmber >50, Red >150CISOOCC; FFIEC AIO
KRI-3# of shadow-AI incidents per monthAmber >25, Red >75CISONYDFS 500.14
KRI-4% of privileged users on non-phishing-resistant MFAAmber >2%, Red >5%CISO/IAMNYDFS 500.12 Nov 2025
KRI-5# of production agents without a documented kill-switch test in last 90 daysAmber ≥1, Red ≥3CISO/EngSR 11-7; OSFI E-23
KRI-6# of detected prompt-injection attempts on production agentsTrendCISONIST AI 600-1
KRI-7Concentration index (HHI) of foundation-model spendAmber HHI >4000, Red >6000CRO/CIOTreasury Dec 2024; FSOC 2024
KRI-8Deepfake-related fraud loss ($ and #) per quarterTrendCRO/FraudTreasury Mar 2024; FinCEN FIN-2024-ALERT004
KRI-9# of unsanctioned model artefacts loaded in last 30 daysRed ≥1CISOOSFI E-23; SR 11-7
KRI-10Mean time to detect an AI-enabled incidentAmber >24h, Red >72hSOCNYDFS 500.17
KRI-11Cross-border inference events for Canadian-resident customer dataRed ≥1DPO/CDOPIPEDA; Quebec Law 25; OSFI B-10
KRI-12Agents in control functions (AML/KYC/fraud/surveillance) without an externally-governed escalation channelRed ≥1CRO/CISOOSFI E-23 §monitoring; Gomez 2025

3.2 Key performance indicators

Year 1 target Year 3 target
AI use cases inventoried & risk-rated
80%
99%
Internal LLM traffic via AI Security Gateway
60%
100%
Agent actions classified into the taxonomy
70%
100%
ATLAS techniques covered by red-team emulation
40%
90%
Production models with SR 11-7 / E-23 validation
50%
100%
Selected KPIs — the three-year maturity ramp. The gap between the bars is the build programme.
KPIDefinitionYear 1 / 2 / 3
KPI-1% of AI use cases inventoried and risk-rated80 / 95 / 99
KPI-2% of internal LLM traffic routed through the AI Security Gateway60 / 95 / 100
KPI-3% of agent actions classified into the action-class taxonomy70 / 95 / 100
KPI-4% of MITRE ATLAS techniques covered by red-team emulation40 / 70 / 90
KPI-5% of production models with completed SR 11-7 / E-23 validation50 / 90 / 100
KPI-6% of privileged users on FIDO2 / passkeys80 / 100 / 100
KPI-7Mean time to revoke a compromised agent (kill-switch RTO)≤5 min / ≤2 min / ≤60 sec
KPI-8% of model artefacts signed and registered70 / 95 / 100
KPI-9% of staff trained on AI-specific phishing/deepfake90 / 98 / 99
KPI-10% of TPSP contracts updated with AI clauses (NYDFS 500.11 / OSFI B-10)60 / 100 / 100
KPI-11% of high-severity discoveries from internal AI scanning closed within 30 days60 / 80 / 90

3.3 Board dashboard composition (quarterly, one page)

TileContent
Threat heatmapThreat classes (offensive AI against firm / attacks against firm AI / systemic) × risk appetite (Green/Amber/Red)
Top-5 KRIsKRI-1, KRI-4, KRI-7, KRI-10, KRI-12 — current value, trend, threshold
IncidentsAI-enabled incidents this quarter, MTTD, MTTR, regulatory notifications filed (NYDFS, OSFI, SEC, FinCEN, FINTRAC)
Capability maturityRAG status on KPI-1 through KPI-7
Vendor concentration% spend by foundation-model vendor, HHI, second-source readiness
Regulatory horizonCountdown to OSFI E-23 (1 May 2027); NYDFS Part 500 next annual certification (15 Apr); SEC/FINRA exam priorities; OCC heightened-standards refreshes

3.4 Audit and examiner focus areas

Internal Audit annual plan coverage: AI use-case inventory completeness against OSFI E-23 and NYDFS Part 500.13; AI Security Gateway coverage and effectiveness; agent kill-switch testing evidence; externally-governed escalation channels on control-function agents; MRM 2.0 independence and validation depth; TPSP AI clauses and AI-BOM ingestion under OCC 2013-29 / SR 23-4 / OSFI B-10; shadow-AI controls; deepfake-resistant identity rollout; peer-breach contagion playbook.

External examination focus by authority:

AuthorityFocus
OCCHeightened-standards three-lines and risk-appetite evidence; AI extensions
Federal ReserveSR 11-7 inventory, validation, performance monitoring for GenAI / agentic models
NYDFS500.4 CISO report incl. AI; 500.6 audit trail; 500.7 access incl. agent identity; 500.11 TPSP non-delegation; 500.13 asset inventory; 500.14 training; 500.17 notification (24h extortion payment, 72h incident); Oct 2024 AI cyber letter; Oct 2025 TPSP letter
SEC / FINRAReg S-P 2024 (30-day customer notification); Reg SCI; FINRA Notice 24-09 and successors; Rule 4511 / SEC 17a-4 books-and-records applied to agent action logs
OSFIB-13 self-assessment; B-10 third-party-risk for AI/cloud vendors; E-23 readiness against 1 May 2027; cyber-incident reporting advisory (24h); CFRG coordination
AMFModel Risk Management Guideline parallel-track compliance (in effect since June 2025)

3.5 Source-integrity register

Every load-bearing claim across the three parts of this series is classed here — (A) primary source, (B) reputable third-party reporting, (C) analyst inference.

ClaimClassSourceNotes
Glasswing partner list, pricing, donationsAanthropic.com/glasswing, fetched 16 May 2026Primary source
Mythos Preview benchmarks (SWE-bench, CyberGym, Cybench, etc.)AAnthropic Project Glasswing system card; VentureBeat 7 Apr 2026; llm-stats.com analysisAll scores self-reported by Anthropic; SWE-bench Multimodal uses an internal implementation
Firefox 147 exploit benchmark (181 working exploits)AAnthropic system card per Kingy AI analysisInternal Anthropic benchmark
OpenAI Daybreak general framingAOpenAI announcement 12 May 2026
Daybreak model identifier(inconsistent)Third-party coverage; primary source not explicitUse generic “Daybreak frontier model + Codex Security harness” in board materials
GTG-2002 and GTG-1002 disclosuresAAnthropic threat reports Aug 2025, 14 Nov 2025”AI executed 80–90% of tradecraft” is Anthropic’s assessment of an operation Anthropic disrupted; cite as such
AISLE Jagged Frontier reproductionAaisle.com/blog, 8 Apr 2026Primary research
Calif.io CVE-2026-4747 RCEAblog.calif.io, 29 Mar 2026Primary research
Anthropic Frontier Red Team 500+ vulnerabilitiesACarlini et al., 5 Feb 2026Pre-Mythos, used Opus 4.6
ATLAS technique IDsAMITRE ATLAS canonical pages; v5.1.0 release notes Nov 2025Verified
Lynch et al. 2025 agentic misalignmentAarXiv 2510.05179Iterative red-teaming acknowledged in the paper
Gomez 2025 escalation channel mitigationAarXiv 2510.05192Wiser Human
Macklem / Champagne / Bessent / Powell statementsBGlobe and Mail 17 Apr 2026; CBC News 21 Apr 2026; Bloomberg via SC Media 11 Apr 2026
Bailey BBC statementBFinTech Magazine via BBC, ~Apr 2026
Budget allocation (Part 1, §1.9)CAnalyst inferenceIndicative working assumption for a Tier-1 NA bank; calibrate against firm baseline
”Vendor signatures lag deepfake fidelity by 6–9 months”CAnalyst inference from incident reporting cadence
“Fewer than 1% of Mythos vulnerabilities patched at launch”BPicus Security interpretation, Apr 2026Not a direct Anthropic statement

3.6 Caveats

Vendor obsolescence. The AI security tooling space is consolidating. Vendor names in Part 1 (§1.8) should be treated as illustrative of category, not endorsement; the build-vs-buy reasoning holds regardless of which specific vendor occupies the category 18 months hence.

Regulator personnel changes. Specific regulator officials named in earlier versions of this brief should be verified before quoting in any external-facing document; the regulatory framework itself is stable while official leadership rotates.

Vendor self-reporting. All Anthropic and OpenAI threat-intelligence reporting describes operations the labs themselves disrupted. The operational tradecraft templates (TTPs, tool use, kill-chain stages) are usable for red-team scenario design regardless of attribution debates. Numeric claims about AI share of tradecraft are vendor self-assessments and warrant qualified citation.

Steady-state assumption. The series assumes the 18-month build window addresses the acute capability arbitrage of 2026–2027 and that AI risk becomes routine alongside operational, cyber, and model risk thereafter. If frontier capability advances faster than assumed — a Mythos-successor model in 2026–2027 with a similar offence–defence delta — the steady-state plan compresses and the budget allocation in Part 1 (§1.9) shifts further toward kill-chain defence.

Bring this rigor to your own AI controls.

If this series maps to a problem on your desk, a short call is the fastest way to compare notes.