White paper Frontier AI Threat Defense — Part 1 of 3

Strategic Context: The Frontier AI Offence–Defence Shift

Why vulnerability discovery has been commoditised, why autonomous offence still favours frontier capability, and what the regulatory clock now demands.

16 May 2026 Toronto v6 14 min read
Cyber risk Banking & securities Regulatory compliance

Scope: OCC / FRB / SEC / FINRA-regulated firms with OSFI / AMF cross-border operations. This is Part 1 of a three-part series; it stands on its own as the strategic brief and sets the framing the tactical and governance parts build on.

Anchored on the two named frontier-lab cyber initiatives: Anthropic’s Project Glasswing (announced 7 April 2026, gated access to Claude Mythos Preview) and OpenAI’s Daybreak (announced 12 May 2026, Codex Security agentic harness + frontier model with a Trusted Access tier for authorised cyber testing).

  • ~90× Exploit-generation delta: Mythos Preview vs Opus 4.6 on the Firefox 147 benchmark
  • $0.11 Per-M-token discovery: open-weight model reproducing the flagship vulnerability finding
  • 80–90% Autonomous tradecraft: AI-executed share of the GTG-1002 operation, per Anthropic
  • 1 May 2027 OSFI E-23 deadline: hard compliance date for AI/ML and third-party model risk
Four numbers that frame the series — capability, economics, threat, and the regulatory clock.

How to read this brief

Three claim classes are used throughout the series. (A) Primary-source statements from Anthropic, OpenAI, regulators, or named regulator officials on record. (B) Reputable third-party reporting. (C) Analyst inference from (A) and (B), explicitly flagged with “inference:”. The source-integrity register in Part 3 catalogues every load-bearing claim by class.

Two thesis statements organise everything below.

1.1 The two frontier-lab initiatives — side by side

| Dimension | Anthropic Project Glasswing | OpenAI Daybreak |
|---|---|---|
| Announced | 7 April 2026 | 12 May 2026 |
| Underlying model | Claude Mythos Preview (unreleased frontier; benchmark deltas vs Opus 4.6 in §1.2) | Frontier model + Codex Security agentic harness (model identifier inconsistent across third-party coverage; OpenAI’s own naming not yet stabilised) |
| Access model | Gated to 12 launch partners + ~40 critical-infrastructure organisations; not publicly released | Tiered: general availability for defensive use; Trusted Access tier reserved for authorised pen-testing and red-team partners |
| Bank in launch cohort | JPMorganChase (only named bank in launch cohort) | Not disclosed |
| Pricing | $25 / $125 per M input/output tokens (Claude API, Amazon Bedrock, Google Vertex AI, Microsoft Foundry); 5× Opus 4.6 | Not disclosed at announcement |
| Financial commitment | $100M usage credits; $2.5M to Alpha-Omega/OpenSSF via Linux Foundation; $1.5M to Apache Software Foundation | Not disclosed |
| Showcase findings | 27-yr OpenBSD remote-crash; 16-yr FFmpeg; Linux kernel privilege-escalation chain; CVE-2026-4747 FreeBSD RPCSEC_GSS RCE | Stated focus on secure code review, threat modelling, patch validation, dependency risk, detection, remediation guidance |
| Public reporting commitment | 90-day report (expected ~6 Jul 2026) | Not specified |
| Strategic positioning | Defensive uplift; restrict frontier offensive capability proliferation | “Tilt balance toward defenders”; competitive response to Glasswing |

1.2 Mythos Preview benchmark deltas (Anthropic self-reported)

Figure: selected paired benchmarks, Mythos Preview vs Opus 4.6 (SWE-bench Verified, SWE-bench Pro, Terminal-Bench 2.0, CyberGym vulnerability reproduction). Full data, including non-paired results, is in the table below.
| Benchmark | Mythos Preview | Opus 4.6 | Delta |
|---|---|---|---|
| SWE-bench Verified | 93.9% | 80.8% | +13.1 pp |
| SWE-bench Pro | 77.8% | 53.4% | +24.4 pp |
| Terminal-Bench 2.0 | 82.0% | 65.4% | +16.6 pp |
| CyberGym (vulnerability reproduction) | 83.1% | 66.6% | +16.5 pp |
| Cybench (CTF challenges) | 100% (saturated) | | Mythos is the first model to saturate Cybench |
| GPQA Diamond | 94.6% | | |
| USAMO 2026 (mathematical reasoning) | 97.6% | | |
| Firefox 147 exploitation benchmark (Anthropic-internal) | 181 working exploits + register control on 29 more, in same several-hundred-attempt budget | 2 working exploits in several hundred attempts | ~90× delta on this specific exploit-generation task |
| OSS-Fuzz ~7,000 entry points | Full control-flow hijack (Tier 5) on 10 separate fully-patched targets | One Tier-3 crash each (Sonnet 4.6 and Opus 4.6) | Step change in exploit severity |

All benchmark scores in this table are self-reported by Anthropic per the Project Glasswing system card; SWE-bench Multimodal uses an internal implementation not directly comparable to public leaderboards. The Firefox 147 and OSS-Fuzz deltas are the more strategically significant — they measure exploit construction rather than vulnerability recognition.
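The Delta column can be checked by hand. A minimal Python sketch follows, using only the figures reported in the table above; the function and variable names are illustrative and are not part of any Anthropic tooling.

```python
# Paired scores in percent, copied from the table: (Mythos Preview, Opus 4.6).
PAIRED_SCORES = {
    "SWE-bench Verified": (93.9, 80.8),
    "SWE-bench Pro": (77.8, 53.4),
    "Terminal-Bench 2.0": (82.0, 65.4),
    "CyberGym": (83.1, 66.6),
}


def delta_pp(mythos: float, opus: float) -> float:
    """Absolute gap in percentage points, rounded to one decimal place."""
    return round(mythos - opus, 1)


def exploit_ratio(mythos_exploits: int, opus_exploits: int) -> float:
    """Multiplicative gap in working exploits for the same attempt budget."""
    return mythos_exploits / opus_exploits


deltas = {name: delta_pp(m, o) for name, (m, o) in PAIRED_SCORES.items()}

# Firefox 147: 181 working exploits vs 2, which is where "~90x" comes from.
firefox_ratio = exploit_ratio(181, 2)
```

The two measures are different in kind: the paired benchmarks differ by tens of percentage points, while the Firefox 147 result is a ratio of counts, which is why it is reported as a multiple rather than in pp.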

1.3 Capability timeline — Aug 2025 to May 2026

  1. Aug 2025
    Threat report: GTG-2002 — "vibe hacking"

    A single criminal actor uses an agentic coding tool for recon, credential harvesting, exfiltration and extortion across 17 organisations in one month.

  2. Sep–Nov 2025
    Threat report: GTG-1002

    Chinese state-linked operation; an agentic harness runs against ~30 targets including financial institutions. Anthropic assesses AI executed 80–90% of hands-on tradecraft.

  3. 11 Sep 2025
    OSFI Guideline E-23 (2027) final published

    Applies to all FRFIs including foreign bank branches and insurers; AI/ML and third-party models in scope.

  4. 1 Nov 2025
    NYDFS Part 500 Second Amendment — final phase

    Universal phishing-resistant MFA; documented asset-inventory program including AI/ML.

  5. 5 Feb 2026
    Frontier Red Team paper (Carlini et al.)

    500+ validated high-severity vulnerabilities found with Claude Opus 4.6 — before Mythos.

  6. 24 Feb 2026
    Anthropic loosens RSP commitments

    The 2023 commitment to guarantee safety adequacy in advance is replaced with a non-binding framework.

  7. 26 Mar 2026
    FreeBSD patches CVE-2026-4747

    A 17-year-old RPCSEC_GSS stack overflow; remote unauthenticated RCE.

  8. 29 Mar 2026
    Calif.io publishes a working 15-round RCE for CVE-2026-4747

    End-to-end autonomous exploit construction by a non-frontier model (Opus 4.6).

  9. 7 Apr 2026
    Project Glasswing announced

    Defensive uplift; gated frontier access. JPMorganChase is the only named bank in the launch cohort.

  10. 8 Apr 2026
    AISLE "Jagged Frontier"

    8 of 8 tested models detect CVE-2026-4747, including open-weight GPT-OSS-20B (3.6B active params) at $0.11/M tokens.

  11. Apr 2026
    Treasury and the Fed convene top US bank CEOs on Mythos

    Within the same fortnight the Bank of Canada, Finance Canada and the CFRG engage in parallel.

  12. 12 May 2026
    OpenAI Daybreak announced

    "Tilt the balance toward defenders" — a Codex Security agentic harness plus a Trusted Access tier.

  13. ~6 Jul 2026
    First Glasswing 90-day public report expected

    The first primary-source read on real-world defensive impact.

Highlighted entries mark the load-bearing events: the two frontier-lab initiatives, the GTG-1002 disclosure, and the first 90-day report.

1.4 Capability state — by task class

| Task class | Capability state | Empirical floor |
|---|---|---|
| Detection of known vulnerability classes in scoped code | Commoditised | Open-weight 3.6B-active-parameter model at $0.11/M tokens (AISLE, 8 April 2026) |
| End-to-end exploit construction on a chosen target | Approaches commodity | Opus 4.6 + scaffolding (Calif.io CVE-2026-4747, 29 Mar 2026) |
| Exploit-generation throughput on a hardened target (Firefox 147) | Frontier-capability advantaged | Mythos produces ~90× more working exploits than Opus 4.6 in same attempt budget (Anthropic Firefox 147 benchmark) |
| Multi-stage autonomous kill-chain across many targets | Frontier-capability advantaged | Claude Code in GTG-1002 per Anthropic’s assessment |

The bottom two rows are the ones that matter. Discovery is commodity; exploit-generation throughput on hardened targets and multi-stage orchestration across many targets are not. The Mythos Firefox 147 delta is the cleanest publicly documented data point on this distinction.

1.5 Defender’s irreducible advantage

This is the load-bearing thesis of the series. Every recommendation across all three parts ties back to it.

| Attacker has | Attacker lacks | Defender has |
|---|---|---|
| Frontier model access (Mythos via partner; Daybreak Trusted Access) | Data classification map | Internal data classification |
| Codebase access (post-compromise) | Reachability graph | Internet-reachability map |
| Compute | Regulatory-scope tagging | Regulatory-scope tagging |
| Adversarial scaffolding | MNPI partitioning | MNPI watchlist and Chinese-wall map |
| Open-weight tooling | Engineering ownership routing | Repository → team mapping |
| Public threat-intel feeds | Compensating-controls inventory | WAF, network segmentation, monitoring coverage |
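To make the thesis concrete, the sketch below shows one way the defender-only inputs could feed triage. It is illustrative only: the weights, field names, and repository names are assumptions, not values from this series or from any vendor product. Two findings with identical scanner severity rank very differently once firm context is applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    repo: str
    severity: float  # scanner-reported, 0-10; any commodity model can produce this


@dataclass(frozen=True)
class FirmContext:
    internet_reachable: bool    # from the internet-reachability map
    data_class: str             # internal data classification
    regulatory_scope: tuple     # regulatory-scope tags
    owner_team: str             # repository -> team mapping
    compensating_controls: int  # count of WAF / segmentation / monitoring controls


# Hypothetical weights; a real firm would calibrate these.
DATA_WEIGHT = {"public": 0.5, "internal": 1.0, "pii": 1.5, "mnpi": 2.0}


def priority(finding: Finding, ctx: FirmContext) -> float:
    """Scale raw severity by context that only the defender holds."""
    score = finding.severity * DATA_WEIGHT[ctx.data_class]
    if ctx.internet_reachable:
        score *= 2.0
    score *= 1.0 + 0.25 * len(ctx.regulatory_scope)
    score /= 1.0 + ctx.compensating_controls
    return round(score, 2)


def triage(findings, contexts):
    """Return (priority, repo, owner_team) tuples, highest priority first."""
    ranked = [
        (priority(f, contexts[f.repo]), f.repo, contexts[f.repo].owner_team)
        for f in findings
    ]
    return sorted(ranked, reverse=True)


# Hypothetical repositories with the same raw severity.
contexts = {
    "wire-gateway": FirmContext(True, "mnpi", ("NYDFS 500", "OSFI B-13"), "payments-eng", 1),
    "intranet-wiki": FirmContext(False, "internal", (), "workplace-eng", 0),
}
findings = [Finding("intranet-wiki", 8.0), Finding("wire-gateway", 8.0)]
ranked = triage(findings, contexts)
```

The scoring formula itself is arbitrary. The point is that every input to `priority` other than `severity` is something an external attacker or vendor does not have.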

1.6 Sector-specific exposures

| Exposure | Mechanism | Anchor |
|---|---|---|
| Patching pipeline as binding constraint | AI-augmented discovery compresses disclosure-to-exploit from months to hours | FFIEC AIO booklet; OSFI B-13 D4; NYDFS 500.5 |
| MNPI / information barriers cannot be outsourced | Vendor lacks firm’s deal codenames, restricted-name lists, Chinese-wall map | SEC Rule 10b5-1; FINRA Rule 5280 |
| Agentic banking in production | Finance close, GL reconciliation, market research, fraud / AML triage operational at multiple Tier-1 banks as of May 2026 | SR 11-7; OSFI E-23 (1 May 2027); AMF MRM (Jun 2025) |
| Foundation-model vendor concentration | Named in regulation as systemic-risk vector | Treasury Dec 2024 §AI in Financial Services; FSOC 2024 Annual Report |
| Cross-border inference for Canadian customer data | US-region inference of Canadian PII = reportable outsourcing/privacy event | PIPEDA; Quebec Law 25; OSFI B-10 |
| Peer-breach contagion | Public disclosure of an AI-enabled breach at a peer institution triggers customer trust shock, examiner attention, and tabletop-revealed control gaps becoming public-facing issues | Operational risk; reputation risk; OSFI B-13 §business continuity |

The last row is new in v6: peer-breach contagion is a real planning surface, not a residual.

1.7 Regulatory framework — US and Canadian convergence and divergence

| Domain | US instrument | Canadian instrument | Key divergence |
|---|---|---|---|
| Phishing-resistant MFA; voice/video caution | NYDFS 23 NYCRR 500.12 (eff. 1 Nov 2025); Industry Letter 16 Oct 2024 | OSFI B-13 D2 (Identity & Access Management) | NYDFS more prescriptive on factor types; OSFI more risk-based |
| AI/ML asset inventory | NYDFS 500.13 (eff. 1 Nov 2025) | OSFI E-23 §model inventory (eff. 1 May 2027) | E-23 broader: covers all AI/ML and third-party models; SR 11-7 inventory is narrower in practice |
| Cyber incident notification | NYDFS 500.17: 24h extortion payment, 72h incident | OSFI B-13: 24h technology incident | OSFI clock starts at assessment, not at determination of materiality |
| Customer notification | SEC Reg S-P: 30 days (large firms Dec 2025; small 3 Jun 2026) | PIPEDA: “as soon as feasible”; Quebec Law 25: timeline by regulation | US clock is fixed; Canadian clock is qualitative, which affects communications planning |
| Model risk management | FRB SR 11-7 (2011, extended by analogy to GenAI/agents) | OSFI E-23 (eff. 1 May 2027); AMF MRM (eff. Jun 2025) | E-23 explicitly scopes AI/ML and third-party models; SR 11-7 extends by analogy |
| Third-party risk for AI vendors | OCC Bulletin 2013-29; FRB SR 23-4; Interagency TPRM 2023 | OSFI Guideline B-10 (eff. 1 May 2024) | Both treat AI/cloud as in-scope; B-10 has more granular sub-outsourcing obligations |
| Cyber program governance | OCC Heightened Standards (large banks); NYDFS 500.4 | OSFI B-13 §governance | OCC heightened-standards three-lines model is more prescriptive on independence |
| Generative AI risk profile | NIST AI 600-1 (Jul 2024) | OSFI E-23 references international frameworks | Neither is mandatory; both are reference |
| Deepfake fraud | FinCEN FIN-2024-ALERT004 (Nov 2024) | FINTRAC Operational Alert (parallel) | US guidance more specific on schemes |
| Cyber-risk in financial services | Treasury Mar 2024; Treasury Dec 2024 RFI follow-up | Bank of Canada FSR; OSFI annual reports | Treasury Dec 2024 explicitly named foundation-model concentration |

The E-23 / SR 11-7 divergence matters for cross-border firms. SR 11-7 dates to 2011 and is being stretched to cover GenAI by analogy and supervisory practice; E-23 was rewritten to bring AI/ML and third-party models explicitly into scope. A Canadian-domiciled institution faces a more explicit framework with a hard 1 May 2027 deadline. A US-domiciled institution with Canadian operations faces both, and the Canadian framework is the binding constraint on a unified group-wide AI MRM build.
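The fixed notification clocks in the table can be tracked with a small scheduler. The sketch below is a simplification under stated assumptions: window lengths come from the table, each clock takes its own start time because the triggers differ (OSFI at assessment, NYDFS at determination), and the qualitative Canadian customer-notification standards are deliberately not modelled. The timestamps and function name are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Window lengths as listed in the table. PIPEDA and Quebec Law 25 are
# qualitative ("as soon as feasible") and therefore have no entry here.
CLOCKS = {
    "NYDFS 500.17 extortion payment": timedelta(hours=24),
    "OSFI B-13 technology incident": timedelta(hours=24),
    "NYDFS 500.17 cybersecurity incident": timedelta(hours=72),
    "SEC Reg S-P customer notification": timedelta(days=30),
}


def deadlines(triggers: dict[str, datetime]) -> list[tuple[datetime, str]]:
    """Given each running clock's own start time, return due times, earliest first."""
    due = [(start + CLOCKS[name], name) for name, start in triggers.items()]
    return sorted(due)


# Hypothetical incident: assessed at 09:00 UTC, determined reportable six hours later.
assessed = datetime(2026, 6, 1, 9, 0, tzinfo=timezone.utc)
determined = assessed + timedelta(hours=6)

schedule = deadlines({
    "OSFI B-13 technology incident": assessed,
    "NYDFS 500.17 cybersecurity incident": determined,
    "SEC Reg S-P customer notification": determined,
})
```

In this example the OSFI filing falls due first even though its window is not the only 24-hour one, because its clock started earlier. That is the practical effect of the assessment-versus-determination divergence noted in the table.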

1.8 Capability source — commodity vs firm-built

| Capability | Buy | Build | Rationale (tied to §1.5 thesis) |
|---|---|---|---|
| Frontier-class autonomous discovery (raw) | Commodity model mix (open-weight + frontier API); Glasswing or Daybreak partner access as supplement | | Capability is commodity; the system around it is the moat |
| Discovery-to-remediation orchestration over firm codebase | Components (commercial SAST/DAST, AI-BOM tooling, dependency-risk platforms) | Firm-specific orchestration layer | Business-context graph, regulator-scope tags, MNPI partitioning, engineering-ownership routing are firm-only |
| AI Security Gateway / LLM firewall | Vendor baseline (multiple commercial options) | Firm policy engine on top | Vendor handles prompt-injection signatures; firm owns MNPI watchlist, Chinese-wall map, restricted-name policy |
| Identity + wire-room hardening against deepfake | FIDO2/passkeys; hardware tokens; liveness | Orchestration and callback workflow | NYDFS Oct 2024 explicitly cautions against voice/video as MFA factors |
| Continuous adversarial emulation | Commercial BAS; red-team-as-a-service; Daybreak Trusted Access tier | ATLAS-mapped emulation against firm-specific agent topology | No commercial product knows firm’s agents, RAG sources, or entitlement graph |
| MRM for GenAI / agents | Commercial MRM tooling | Validation methodology, challenger framework, inventory | SR 11-7 and OSFI E-23 require firm-specific independent validation |
| Agentic guardrails | Cloud-platform frameworks (AWS Bedrock Agents guardrails; Microsoft 365 Copilot Studio governance) | Decision-rights and approval workflow; externally-governed escalation channel | Autonomous-vs-supervised action classification is firm decision-rights work |
| Threat intelligence for AI-specific IoCs/TTPs | FS-ISAC; Anthropic, OpenAI, Microsoft, Google feeds; MITRE ATLAS | Correlation with internal telemetry | TI value lies in fusion with internal signal |
| Customer-facing fraud controls against AI scams | Core fraud platforms | Deepfake-aware workflow and customer education layer | Vendor signatures lag deepfake fidelity by 6–9 months |
| AI-BOM and model supply-chain assurance | SBOM/AI-BOM tooling | Attestation, signed model registry, RAG provenance | SR 11-7, OSFI E-23, B-10, OCC 2013-29 push provenance to firm |
| Shadow-AI insider-threat program | DLP/UEBA/CASB | Prompt/response telemetry and investigative workflow | Existing UEBA cannot reason about prompt semantics |

The AI security tooling space is consolidating fast — vendor names should be treated as illustrative of category, not endorsement (see the caveats in Part 3).

1.9 Indicative AI security budget allocation — Tier-1 NA bank, FY2026–2027

  • 40% Discovery-side orchestration
  • 35% Kill-chain defence
  • 15% MRM, governance & audit infrastructure
  • 10% Threat intel, red-team automation & sectoral cooperation
Indicative Year-1 allocation for a Tier-1 North American bank. Analyst inference (Class C) — calibrate against firm baseline.
| Workstream | Year 1 % | Year 2–3 % | What the spend buys |
|---|---|---|---|
| Discovery-side orchestration | ~40% | ~30% | Codebase ingestion at scale; business-context graph; deduplication and triage; reachability analysis; routing; patch validation; regulatory-evidence packaging. The model is commodity; this is the firm-built system around it. |
| Kill-chain defence | ~35% | ~40% | AI Security Gateway; agentic guardrails; deepfake-resistant identity; detection engineering; SOC AI-telemetry onboarding |
| MRM, governance, audit infrastructure | ~15% | ~15% | SR 11-7 expansion; OSFI E-23 readiness; independent validation function; internal audit cycle |
| Threat intelligence, red-team automation, sectoral cooperation | ~10% | ~15% | Glasswing 90-day reports; OpenAI Daybreak Trusted Access; FS-ISAC AI working group; sectoral cooperative once stood up |

The seeming paradox — capability is commodity, so why 40% of spend? — resolves cleanly: the 40% is not buying capability. It is buying the orchestration layer that converts commodity capability into a bank-specific defensive advantage. The model is a line item, not a budget category.
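As a worked example of the percentages above, the sketch below applies both allocation columns to a hypothetical total. The 50M figure is a placeholder chosen for round numbers, not a benchmark for any institution, and the function name is illustrative.

```python
# (Year 1 %, Year 2-3 %) per workstream, taken from the allocation table.
ALLOCATION = {
    "Discovery-side orchestration": (40, 30),
    "Kill-chain defence": (35, 40),
    "MRM, governance, audit infrastructure": (15, 15),
    "Threat intelligence, red-team automation, sectoral cooperation": (10, 15),
}


def split(total_budget: float, column: int) -> dict[str, float]:
    """Apply one allocation column (0 = Year 1, 1 = Year 2-3) to a total."""
    shares = {name: pct[column] for name, pct in ALLOCATION.items()}
    if sum(shares.values()) != 100:
        raise ValueError("allocation column must sum to 100%")
    return {name: total_budget * share / 100 for name, share in shares.items()}


TOTAL = 50_000_000  # hypothetical annual AI security budget

year_1 = split(TOTAL, 0)
year_2_3 = split(TOTAL, 1)

# Change per workstream between the two columns at a constant total.
shift = {name: year_2_3[name] - year_1[name] for name in ALLOCATION}
```

At a constant total, the move from Year 1 to Year 2–3 takes ten points out of discovery-side orchestration and splits them evenly between kill-chain defence and threat intelligence.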

1.10 Vendor failure modes at a NA bank — five recurring

  1. Evidentiary granularity for regulators. NYDFS 72-hour, OSFI 24-hour, Reg S-P 30-day windows do not accommodate vendor support latency.
  2. MNPI / information-barrier enforcement at prompt level. Vendor lacks firm’s deal-codename watchlist, restricted-name map, Chinese-wall partition.
  3. Cross-border inference routing. Vendor SaaS rarely exposes per-call routing controls required by PIPEDA, Quebec Law 25, OSFI B-10.
  4. Audit-trail granularity to SR 11-7 / OSFI E-23 standards. Vendor logs rarely reconstruct the full decision path of a multi-tool agent.
  5. RTOs assume vendor speed. A frontier-class adversary compresses detect-to-exploit; FFIEC AIO / OSFI B-13 D4 RTOs require AI-aware tightening a vendor cannot impose externally.
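Failure mode 2 is the easiest to illustrate. Below is a minimal sketch of prompt-level screening against a firm-held watchlist, assuming simple case-insensitive whole-phrase matching. The watchlist entries and function names are invented for illustration; a production gateway would need far more (aliases, tickers, fuzzy matching, response-side checks).

```python
import re


def build_matcher(watchlist):
    """Compile a case-insensitive whole-phrase matcher over watchlist terms."""
    if not watchlist:
        raise ValueError("watchlist must not be empty")
    # Longest terms first so a longer codename wins over a shorter prefix.
    terms = sorted(watchlist, key=len, reverse=True)
    pattern = "|".join(re.escape(term) for term in terms)
    return re.compile(rf"(?<!\w)(?:{pattern})(?!\w)", re.IGNORECASE)


def screen_prompt(prompt, matcher):
    """Return an allow/deny decision plus the watchlist terms that were hit."""
    hits = sorted({m.group(0).lower() for m in matcher.finditer(prompt)})
    return {"allow": not hits, "hits": hits}


# Hypothetical deal codename and restricted name; only the firm holds this list.
WATCHLIST = {"Project Falcon", "Acme Holdings"}
matcher = build_matcher(WATCHLIST)

blocked = screen_prompt("Summarise the project falcon term sheet", matcher)
allowed = screen_prompt("Summarise falconry market trends", matcher)
```

The matching logic is trivial. What a vendor cannot supply is the contents of `WATCHLIST`, which is why this control sits on the firm side of the buy/build line.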

1.11 Phased capability roadmap

Months 0–6 (May–Nov 2026): close exploitable gaps.
  • Enforced prohibition on MNPI/PII/trading-book data in non-sanctioned public LLMs, with telemetric verification (NYDFS 500.7/.10; OSFI E-23 §inventory).
  • AI inventory satisfying NYDFS 500.13 and laying the foundation for OSFI E-23 §model inventory.
  • FIDO2/passkey completion for wire-system, trading-system, MNPI, and admin-console users.
  • Out-of-band callback verification for wires above firm-set thresholds initiated by phone/video/email.
  • Anthropic and OpenAI threat-intel reporting cadence onboarded to the SOC.
  • Mythos-class tabletop at CISO + COO + Legal + Comms + Head of Trading.
  • Initial SOC AI-telemetry onboarding to enable basic Year-1 detection use cases.

Months 6–18 (Nov 2026–Nov 2027): build the system.
  • AI Security Gateway as a mandatory egress chokepoint for every internal LLM call; this is when the LLM-gateway detection use cases in Part 2 become operational.
  • Agentic guardrail framework with an action-class taxonomy, per-class approval thresholds, human-in-the-loop for write-external/move-money/trade/grant-entitlement, kill-switch with RTO ≤ 60 seconds tested quarterly, and an externally-governed escalation channel.
  • MRM policy update bringing all GenAI/agent models under SR 11-7 + OSFI E-23.
  • Discovery-to-remediation orchestration operational over the firm codebase.
  • ATLAS-mapped continuous adversarial emulation against deployed agents.
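One way to picture the action-class taxonomy and kill-switch described above is the sketch below. The four human-in-the-loop classes come from the roadmap text; the two internal classes, the class and method names, and the three decision strings are assumptions added for illustration. It does not model per-class approval thresholds, the 60-second RTO, or the escalation channel.

```python
from enum import Enum
from typing import Optional


class ActionClass(Enum):
    READ_INTERNAL = "read-internal"      # assumed class, not named in the roadmap
    WRITE_INTERNAL = "write-internal"    # assumed class, not named in the roadmap
    WRITE_EXTERNAL = "write-external"
    MOVE_MONEY = "move-money"
    TRADE = "trade"
    GRANT_ENTITLEMENT = "grant-entitlement"


# The classes the roadmap names as requiring a human in the loop.
HUMAN_IN_THE_LOOP = frozenset({
    ActionClass.WRITE_EXTERNAL,
    ActionClass.MOVE_MONEY,
    ActionClass.TRADE,
    ActionClass.GRANT_ENTITLEMENT,
})


class Guardrail:
    """Decides whether an agent action may proceed."""

    def __init__(self) -> None:
        self.killed = False

    def kill(self) -> None:
        """Kill-switch: once set, every subsequent action is blocked."""
        self.killed = True

    def decide(self, action: ActionClass, approved_by: Optional[str] = None) -> str:
        if self.killed:
            return "blocked"
        if action in HUMAN_IN_THE_LOOP and approved_by is None:
            return "pending-approval"
        return "allowed"


guardrail = Guardrail()
read_decision = guardrail.decide(ActionClass.READ_INTERNAL)
wire_unapproved = guardrail.decide(ActionClass.MOVE_MONEY)
wire_approved = guardrail.decide(ActionClass.MOVE_MONEY, approved_by="ops-lead")
guardrail.kill()
after_kill = guardrail.decide(ActionClass.READ_INTERNAL)
```

Checking the kill-switch before anything else is a deliberate ordering choice in this sketch: an approval granted earlier should not let an action through after the switch is thrown.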

Months 18–36 (Nov 2027–May 2029): institutionalise.
  • AI risk metrics integrated into board reporting at credit/market/operational-risk cadence.
  • Multi-vendor AI strategy reducing concentration.
  • Exit, portability, and capability-restriction triggers in FDE contracts.
  • Firm-controlled red-team-as-a-service using the Daybreak Trusted Access tier or equivalent.
  • Canadian sectoral AI threat-sharing cooperative aligned with OSFI incident reporting and the CFRG.

Steady state (May 2029 onward). AI risk indistinguishable from operational, cyber, and model risk in board-level reporting. AI Security Gateway, agentic guardrails, MRM 2.0, and discovery-to-remediation orchestration operate as routine controls subject to annual internal audit and supervisory examination. The acute capability arbitrage of 2026–2027 is closed; the firm’s competitive position is determined by ongoing remediation throughput and the maturity of its decision-rights framework for agentic actions.

1.12 Governance structure

Extend the existing risk-committee fabric rather than building a parallel AI committee. The parallel approach produces fragmentation and ambiguous decision rights.

| Body | Mandate | Reporting | Anchor |
|---|---|---|---|
| Board Risk Committee | AI risk appetite statement; quarterly review of KRIs, top scenarios, control maturity | Annual review of appetite | NYDFS 500.4(d); OSFI Corporate Governance Guideline |
| Executive AI Risk Committee | Cross-LOB; CRO + CISO co-chair; CIO, CDO, GC, Compliance, Op Risk, Model Risk, Privacy, Internal Audit | Monthly to BRC chair; quarterly to BRC | NYDFS 500.4(b); SR 11-7; OSFI E-23 §1 |
| Model Risk function (2LoD) | Independent validation of all GenAI/agent models | Quarterly | SR 11-7; OSFI E-23 |
| CISO + AI Security Engineering | AI Security Gateway, agent guardrail framework, red-team automation | Annual + on material events | NYDFS 500.4; OSFI B-13 |
| Internal Audit (3LoD) | Annual audit of AI governance, model risk, AI security | Annual to Audit Committee | OCC heightened standards; OSFI Three Lines |

Part 2 — the Red-Team Playbook — turns this strategic picture into operational testing: a catalogue of attack vectors, banking-specific tabletops, and the detection engineering to catch them.
