Metrics & Governance: KRIs, KPIs & Board Oversight
The measurement and assurance layer — key risk and performance indicators, the quarterly board dashboard, examiner focus areas, and the source-integrity discipline behind the series.
This is Part 3 of the Frontier AI Threat Defense series. Part 1 set the strategy and Part 2 set the tactics; this brief is the measurement and assurance layer — the indicators, the board view, the examiner map, and the source discipline that lets every claim in the series be checked. It is written for risk functions, internal audit, and the board.
3.1 Key risk indicators
| KRI | Definition | Threshold (illustrative) | Owner | Anchor |
|---|---|---|---|---|
| KRI-1 | % of AI use cases not inventoried | Amber >5%, Red >10% | CDO/CISO | NYDFS 500.13; OSFI E-23 |
| KRI-2 | # of high-severity vulnerabilities open >30 days in apps exposed to AI-class adversary | Amber >50, Red >150 | CISO | OCC; FFIEC AIO |
| KRI-3 | # of shadow-AI incidents per month | Amber >25, Red >75 | CISO | NYDFS 500.14 |
| KRI-4 | % of privileged users on non-phishing-resistant MFA | Amber >2%, Red >5% | CISO/IAM | NYDFS 500.12 Nov 2025 |
| KRI-5 | # of production agents without a documented kill-switch test in last 90 days | Amber ≥1, Red ≥3 | CISO/Eng | SR 11-7; OSFI E-23 |
| KRI-6 | # of detected prompt-injection attempts on production agents | Trend | CISO | NIST AI 600-1 |
| KRI-7 | Concentration index (HHI) of foundation-model spend | Amber HHI >4000, Red >6000 | CRO/CIO | Treasury Dec 2024; FSOC 2024 |
| KRI-8 | Deepfake-related fraud loss ($ and #) per quarter | Trend | CRO/Fraud | Treasury Mar 2024; FinCEN FIN-2024-ALERT004 |
| KRI-9 | # of unsanctioned model artefacts loaded in last 30 days | Red ≥1 | CISO | OSFI E-23; SR 11-7 |
| KRI-10 | Mean time to detect an AI-enabled incident | Amber >24h, Red >72h | SOC | NYDFS 500.17 |
| KRI-11 | Cross-border inference events for Canadian-resident customer data | Red ≥1 | DPO/CDO | PIPEDA; Quebec Law 25; OSFI B-10 |
| KRI-12 | Agents in control functions (AML/KYC/fraud/surveillance) without an externally-governed escalation channel | Red ≥1 | CRO/CISO | OSFI E-23 §monitoring; Gomez 2025 |
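An added example (not part of the original brief) showing how KRI-5 and KRI-7 from the table above could be computed from raw inventory data and rated against the illustrative thresholds, including the difference between the table's ">" and "≥" rows. The agent names, vendor names, spend figures and dates are constructed, and the 0–10,000 HHI scale (squared percentage shares) is an assumption inferred from the 4000/6000 thresholds.

```python
from datetime import date

# Thresholds copied from the KRI table: (amber, red, inclusive).
# inclusive=True encodes the table's "≥" rows, False its ">" rows.
THRESHOLDS = {
    "KRI-5": (1, 3, True),         # agents with no kill-switch test in 90 days
    "KRI-7": (4000, 6000, False),  # HHI of foundation-model spend
}

def rag(kri, value):
    """Rate a KRI value Green/Amber/Red, honouring '>' versus '≥'."""
    amber, red, inclusive = THRESHOLDS[kri]
    def breached(limit):
        return value >= limit if inclusive else value > limit
    if breached(red):
        return "Red"
    return "Amber" if breached(amber) else "Green"

def hhi(spend_by_vendor):
    """Herfindahl-Hirschman index: sum of squared percentage shares (assumed 0-10,000 scale)."""
    total = sum(spend_by_vendor.values())
    if total <= 0:
        raise ValueError("no spend recorded; HHI is undefined")
    return sum((100 * s / total) ** 2 for s in spend_by_vendor.values())

def stale_killswitch_agents(last_test, today, window_days=90):
    """Agents whose last documented test is older than the window, or missing."""
    return sorted(
        agent for agent, tested in last_test.items()
        if tested is None or (today - tested).days > window_days
    )

# Constructed sample data (not from the brief).
SPEND = {"vendor-a": 600_000, "vendor-b": 300_000, "vendor-c": 100_000}
LAST_TEST = {
    "aml-triage": date(2026, 4, 1),
    "kyc-refresh": date(2026, 1, 10),
    "fraud-scoring": None,  # never tested: counts against KRI-5
}

def report(today=date(2026, 5, 16)):
    stale = stale_killswitch_agents(LAST_TEST, today)
    concentration = round(hhi(SPEND))
    return {
        "KRI-5": (len(stale), rag("KRI-5", len(stale)), stale),
        "KRI-7": (concentration, rag("KRI-7", concentration)),
    }

if __name__ == "__main__":
    for kri, result in report().items():
        print(kri, *result)
```

A value sitting exactly on a ">" threshold (HHI of 4000) stays Green, while a single untested agent already turns KRI-5 Amber because that row is inclusive.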
3.2 Key performance indicators
| KPI | Definition | Year 1 / 2 / 3 |
|---|---|---|
| KPI-1 | % of AI use cases inventoried and risk-rated | 80 / 95 / 99 |
| KPI-2 | % of internal LLM traffic routed through the AI Security Gateway | 60 / 95 / 100 |
| KPI-3 | % of agent actions classified into the action-class taxonomy | 70 / 95 / 100 |
| KPI-4 | % of MITRE ATLAS techniques covered by red-team emulation | 40 / 70 / 90 |
| KPI-5 | % of production models with completed SR 11-7 / E-23 validation | 50 / 90 / 100 |
| KPI-6 | % of privileged users on FIDO2 / passkeys | 80 / 100 / 100 |
| KPI-7 | Mean time to revoke a compromised agent (kill-switch RTO) | ≤5 min / ≤2 min / ≤60 sec |
| KPI-8 | % of model artefacts signed and registered | 70 / 95 / 100 |
| KPI-9 | % of staff trained on AI-specific phishing/deepfake | 90 / 98 / 99 |
| KPI-10 | % of TPSP contracts updated with AI clauses (NYDFS 500.11 / OSFI B-10) | 60 / 100 / 100 |
| KPI-11 | % of high-severity discoveries from internal AI scanning closed within 30 days | 60 / 80 / 90 |
3.3 Board dashboard composition (quarterly, one page)
| Tile | Content |
|---|---|
| Threat heatmap | Threat classes (offensive AI against firm / attacks against firm AI / systemic) × risk appetite (Green/Amber/Red) |
| Top-5 KRIs | KRI-1, KRI-4, KRI-7, KRI-10, KRI-12 — current value, trend, threshold |
| Incidents | AI-enabled incidents this quarter, MTTD, MTTR, regulatory notifications filed (NYDFS, OSFI, SEC, FinCEN, FINTRAC) |
| Capability maturity | RAG status on KPI-1 through KPI-7 |
| Vendor concentration | % spend by foundation-model vendor, HHI, second-source readiness |
| Regulatory horizon | Countdown to OSFI E-23 (1 May 2027); NYDFS Part 500 next annual certification (15 Apr); SEC/FINRA exam priorities; OCC heightened-standards refreshes |
3.4 Audit and examiner focus areas
Internal Audit annual plan coverage: AI use-case inventory completeness against OSFI E-23 and NYDFS Part 500.13; AI Security Gateway coverage and effectiveness; agent kill-switch testing evidence; externally-governed escalation channels on control-function agents; MRM 2.0 independence and validation depth; TPSP AI clauses and AI-BOM ingestion under OCC 2013-29 / SR 23-4 / OSFI B-10; shadow-AI controls; deepfake-resistant identity rollout; peer-breach contagion playbook.
External examination focus by authority:
| Authority | Focus |
|---|---|
| OCC | Heightened-standards three-lines and risk-appetite evidence; AI extensions |
| Federal Reserve | SR 11-7 inventory, validation, performance monitoring for GenAI / agentic models |
| NYDFS | 500.4 CISO report incl. AI; 500.6 audit trail; 500.7 access incl. agent identity; 500.11 TPSP non-delegation; 500.13 asset inventory; 500.14 training; 500.17 notification (24h extortion payment, 72h incident); Oct 2024 AI cyber letter; Oct 2025 TPSP letter |
| SEC / FINRA | Reg S-P 2024 (30-day customer notification); Reg SCI; FINRA Notice 24-09 and successors; Rule 4511 / SEC 17a-4 books-and-records applied to agent action logs |
| OSFI | B-13 self-assessment; B-10 third-party-risk for AI/cloud vendors; E-23 readiness against 1 May 2027; cyber-incident reporting advisory (24h); CFRG coordination |
| AMF | Model Risk Management Guideline parallel-track compliance (in effect since June 2025) |
3.5 Source-integrity register
Every load-bearing claim across the three parts of this series is classed here — (A) primary source, (B) reputable third-party reporting, (C) analyst inference.
| Claim | Class | Source | Notes |
|---|---|---|---|
| Glasswing partner list, pricing, donations | A | anthropic.com/glasswing, fetched 16 May 2026 | Primary source |
| Mythos Preview benchmarks (SWE-bench, CyberGym, Cybench, etc.) | A | Anthropic Project Glasswing system card; VentureBeat 7 Apr 2026; llm-stats.com analysis | All scores self-reported by Anthropic; SWE-bench Multimodal uses an internal implementation |
| Firefox 147 exploit benchmark (181 working exploits) | A | Anthropic system card per Kingy AI analysis | Internal Anthropic benchmark |
| OpenAI Daybreak general framing | A | OpenAI announcement 12 May 2026 | — |
| Daybreak model identifier | (inconsistent) | Third-party coverage; primary source not explicit | Use generic “Daybreak frontier model + Codex Security harness” in board materials |
| GTG-2002 and GTG-1002 disclosures | A | Anthropic threat reports Aug 2025, 14 Nov 2025 | “AI executed 80–90% of tradecraft” is Anthropic’s assessment of an operation Anthropic disrupted; cite as such |
| AISLE Jagged Frontier reproduction | A | aisle.com/blog, 8 Apr 2026 | Primary research |
| Calif.io CVE-2026-4747 RCE | A | blog.calif.io, 29 Mar 2026 | Primary research |
| Anthropic Frontier Red Team 500+ vulnerabilities | A | Carlini et al., 5 Feb 2026 | Pre-Mythos, used Opus 4.6 |
| ATLAS technique IDs | A | MITRE ATLAS canonical pages; v5.1.0 release notes Nov 2025 | Verified |
| Lynch et al. 2025 agentic misalignment | A | arXiv 2510.05179 | Iterative red-teaming acknowledged in the paper |
| Gomez 2025 escalation channel mitigation | A | arXiv 2510.05192 | Wiser Human |
| Macklem / Champagne / Bessent / Powell statements | B | Globe and Mail 17 Apr 2026; CBC News 21 Apr 2026; Bloomberg via SC Media 11 Apr 2026 | — |
| Bailey BBC statement | B | FinTech Magazine via BBC, ~Apr 2026 | — |
| Budget allocation (Part 1, §1.9) | C | Analyst inference | Indicative working assumption for a Tier-1 NA bank; calibrate against firm baseline |
| “Vendor signatures lag deepfake fidelity by 6–9 months” | C | Analyst inference from incident reporting cadence | — |
| “Fewer than 1% of Mythos vulnerabilities patched at launch” | B | Picus Security interpretation, Apr 2026 | Not a direct Anthropic statement |
3.6 Caveats
Vendor obsolescence. The AI security tooling space is consolidating. Vendor names in Part 1 (§1.8) should be treated as illustrative of category, not endorsement; the build-vs-buy reasoning holds regardless of which specific vendor occupies the category 18 months hence.
Regulator personnel changes. Specific regulator officials named in earlier versions of this brief should be verified before quoting in any external-facing document; the regulatory framework itself is stable while official leadership rotates.
Vendor self-reporting. All Anthropic and OpenAI threat-intelligence reporting describes operations the labs themselves disrupted. The operational tradecraft templates (TTPs, tool use, kill-chain stages) are usable for red-team scenario design regardless of attribution debates. Numeric claims about AI share of tradecraft are vendor self-assessments and warrant qualified citation.
Steady-state assumption. The series assumes the 18-month build window addresses the acute capability arbitrage of 2026–2027 and that AI risk becomes routine alongside operational, cyber, and model risk thereafter. If frontier capability advances faster than assumed — a Mythos-successor model in 2026–2027 with a similar offence–defence delta — the steady-state plan compresses and the budget allocation in Part 1 (§1.9) shifts further toward kill-chain defence.