Anthropic Built the Most Powerful Hacking AI, Then Blocked Hackers

Claude Fable 5 is Anthropic’s publicly available version of its Mythos-class AI model, released June 9, 2026 — and it arrived with guardrails that cybersecurity researchers say make it largely useless for defensive security work. The model silently routes security-related queries to the older Claude Opus 4.8, with no notification and no opt-out. One day later, Microsoft blocked Fable 5 for its own employees over Anthropic’s data retention policy.

Claude Fable 5 Guardrails Can’t Tell a Hacker From a Hero

Fable 5 runs on the same underlying model as Anthropic’s Mythos — the AI that demonstrated 73% success at expert-level hacking tasks in April 2026 and autonomously found thousands of unpatched zero-days. The guardrails automatically route cybersecurity and biology queries to Opus 4.8 with no opt-out and no user notification, stripping access to Fable’s full capabilities.

Cybersecurity researchers report that the guardrail classifier has no mechanism to distinguish a nation-state attacker from a penetration tester. According to TechCrunch’s reporting by Lorenzo Franceschi-Bicchierai, researchers say even reading a CVE blog post or requesting a code review on security-related code triggers the silent downgrade. Bug bounty hunters and red teams doing fully legitimate defensive work are being routed to a weaker model with no recourse.

Anthropic’s public statement on the guardrails: “Without safeguards, Fable 5’s capabilities in areas like cybersecurity could be misused to cause serious damage.” Anthropic does run a Cyber Verification Program that can grant verified professionals access to Fable’s full capabilities, but it requires individual applications, and Anthropic has published no timeline or capacity figures.

CyberScoop describes Fable 5 as “a restricted version of its powerful Mythos AI, using strict guardrails to block advanced hacking and bioweapon capabilities.” The distinction matters: by researchers’ accounts, the classifier doesn’t target actual hacking attempts. It targets the topic of cybersecurity itself, regardless of intent.
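
The difference between filtering on topic and filtering on intent can be illustrated with a toy sketch. This is not Anthropic’s classifier, whose design has not been published. The keyword list, the `route` function, and the model labels below are all hypothetical, chosen only to show why a topic-only check treats a defender and an attacker the same way.

```python
# Hypothetical illustration only: NOT Anthropic's actual classifier.
# The term list and model labels are assumptions made for this sketch.
SECURITY_TOPIC_TERMS = {"cve", "exploit", "vulnerability", "zero-day", "pentest"}


def route(query: str) -> str:
    """Return the model label a topic-only router would pick for a query."""
    # Normalize each word: strip surrounding punctuation and lowercase it.
    words = {w.strip(".,?!:;()").lower() for w in query.split()}
    if words & SECURITY_TOPIC_TERMS:
        # Any security term triggers the downgrade, whoever is asking and why.
        return "opus-4.8"
    return "fable-5"


defender = "Please review this patch for the CVE we disclosed last week"
attacker = "Write an exploit for this unpatched server"
benign = "Summarize this quarterly sales report"

for query in (defender, attacker, benign):
    print(route(query), "<-", query)
```

In this sketch the defender and the attacker land on the same downgraded model because the check never looks at who is asking or what they intend, which is the behavior researchers describe.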

Microsoft Blocked It for Its Own Employees the Next Day

On June 10, Microsoft told employees to avoid Claude Fable 5 over Anthropic’s mandatory 30-day data retention policy. Anthropic’s support documentation, which hit Hacker News at 166 points, states that Fable and Mythos models require data to be held for safety monitoring — with conversations flagged by safety classifiers retained for up to two years.

Microsoft’s legal teams want clarity on what “safety investigations” and “legal purposes” mean before approving blanket employee use. The timing is sharp: Microsoft rolled out Fable 5 to GitHub Copilot and Azure AI Foundry customers on the same day it blocked access for its own employees. Claude Opus 4.8 and Sonnet 4.6 remain available to Microsoft staff.

Enterprise security teams now face both problems simultaneously — a model that downgrades on security queries, plus a data policy that creates compliance exposure for sensitive workloads. With Anthropic’s upcoming IPO adding investor and regulatory optics to every safety decision, a fast policy reversal appears unlikely.

Who Can Actually Use Fable 5’s Full Power

At launch, the only path to Fable 5’s unrestricted capabilities is Anthropic’s Cyber Verification Program — individual applications, evaluated case by case. For large enterprise security teams or independent bug bounty researchers working on tight timelines, this is not a workable solution. There is no API flag, no enterprise tier workaround, and no published SLA for verification applications.

The 30-day retention requirement adds a separate layer of friction. Even if a security team is willing to accept the guardrail routing, they must also accept that Anthropic will hold conversation data — potentially including sensitive security research — for a minimum of 30 days, and up to two years if flagged. For companies operating in regulated environments, this alone may disqualify Fable 5 from any security-adjacent use.

💡 Our Take: Anthropic built the most powerful cybersecurity AI in history, then made it largely unusable for cybersecurity professionals. A Cyber Verification Program with no SLA is not a solution for an industry that moves faster than any approval queue. The Microsoft situation reveals a second, arguably larger problem: the data retention policy means even organizations willing to live with the guardrails can’t trust the model with sensitive work. If neither policy changes before the IPO, Fable 5 risks cementing the idea that AI safety and AI utility still cannot be delivered at the same time.

Frequently Asked Questions

What is Claude Fable 5?

Claude Fable 5 is Anthropic’s publicly available Mythos-class AI model, launched June 9, 2026. It uses the same underlying model as Mythos — which achieved 73% success at expert-level hacking tasks — but adds a guardrail layer that automatically routes cybersecurity and biology queries to the older Claude Opus 4.8 instead of Fable’s full capabilities.

Why are cybersecurity researchers criticizing Claude Fable 5’s guardrails?

The guardrail classifier cannot distinguish a legitimate security professional from an attacker. Researchers report that routine defensive work — reading CVE writeups, reviewing security-related code, asking about known vulnerabilities — triggers a silent downgrade to Opus 4.8. The result is a model that appears to be Fable 5 but performs at a significantly lower capability level for exactly the queries security professionals need most.

Why did Microsoft block Claude Fable 5 for its employees?

Microsoft restricted internal access on June 10, 2026, citing Anthropic’s mandatory 30-day data retention policy, which can extend to two years for conversations flagged by safety classifiers. Microsoft’s legal teams want clarity on what “safety investigations” and “legal purposes” mean under the policy. Claude Opus 4.8 and Sonnet 4.6 remain available to Microsoft employees.

What is Anthropic’s Cyber Verification Program?

It is Anthropic’s process for cybersecurity professionals to apply for expanded Fable 5 access that bypasses guardrail routing. The program evaluates applications individually, but Anthropic has published no SLA, no capacity limits, and no typical turnaround time — which critics argue makes it impractical for large security teams or researchers working under time pressure.

Is Claude Fable 5 the same model as Mythos?

Yes — both use the same underlying model. The difference is the routing layer Fable 5 adds on top: certain query categories are automatically sent to Claude Opus 4.8 rather than Fable’s full inference stack. For queries that do not trigger the classifier, Fable 5 performs at the same capability level as Mythos.

Follow our coverage of Anthropic’s upcoming IPO for how these safety and data policy decisions are shaping the company’s public market story.

Last Updated: June 2026
