Anthropic rankles users with safety-first Fable release
Anthropic’s latest AI model might be the company’s most powerful public release, but the system’s strict safety measures quickly triggered some of the strongest backlash the AI giant has faced.
Subscribe to read this story ad-free
Get unlimited access to ad-free articles and exclusive content.
Many users, some of whom have marveled at Anthropic’s previous announcements, torched the company for debuting its Fable 5 model on Tuesday with what they say are overly stringent guardrails. In some cases, when the model classified a query as potentially sensitive, it would provide a lower-quality answer without informing the user of the downgrade.
After the outcry, Anthropic backtracked and reversed some of its most conservative decisions less than two days after Fable 5’s release, highlighting growing concerns about AI companies’ ability to unilaterally limit users’ access to helpful AI-generated information.
“You should have visibility into the safeguards we have in place, and why. We’re sorry for not getting the balance right,” Anthropic wrote on X early Thursday.
Nathan Lambert, a leading AI researcher who champions collaborative approaches to building AI systems, wrote that with the cautious debut, “Anthropic has made it pretty clear that they only trust themselves as the mediators of cutting-edge AI research.”
Anthropic’s Fable 5 system is the first consumer-facing system from Anthropic’s Mythos family of models. An early, nonpublic version of Mythos spooked policymakers and corporate executives in April for its ability to find more than 10,000 severe bugs and vulnerabilities in important software systems.
Anthropic fears that powerful AI models like Mythos could allow bad actors to use AI systems to commit crimes, from launching crippling cyberattacks against critical infrastructure to designing bioweapons that could kill masses of people.
As a result, Anthropic released Fable 5 with a strict set of guardrails preventing the model from answering a range of questions about cybersecurity or biology. Acknowledging its decision to err on the side of caution, Anthropic said that Fable 5’s safety-first approach might incorrectly flag harmless requests as being suspicious for less than 5% of queries.
“With more capable models arriving in the coming months, we’re working to improve our safeguards and reduce false positives as quickly as we can.,” Anthropic wrote.
Preliminary tests of the model conducted by NBC News, along with many examples shared on social media, found Anthropic’s protections took a broad view of potential suspicious activity, rendering the system useless for many mundane queries.
For example, the model refused NBC News’ requests to offer opinions on Elon Musk and Anthropic CEO Dario Amodei, asserting that the questions might be dangerous. Fable 5 also declined to answer many innocent biology-related questions, from queries about open issues in cancer research to which medical exams might best identify pancreatic injuries.
For those Fable 5 queries flagged as dangerous, Anthropic instead routes questions to a less-powerful system called Claude Opus 4.8, which had been the top-of-the-line system until Fable’s release on Tuesday.
Because it has been deployed for months, Opus 4.8 has a better ability to handle and redirect questions that could be seen as harmful. Opus 4.8 offered basic but clear answers to NBC News’ questions that Fable 5 had refused.
Anthropic is also worried that competitors could use Anthropic’s AI systems to turbocharge their own research — Anthropic uses its own AI systems to help create the next generation of its models.
To prevent other AI companies from using Fable 5 to improve their own AI products or research, Anthropic said on Tuesday that it would include safeguards to make Fable 5’s answers less intelligent or useful for user questions that might be related to AI development.
However, Anthropic said that these specific guardrails — unlike the cybersecurity and biology guardrails — would be invisible, since making them clear to users could allow competitors to more easily circumvent the barriers.
The move triggered immediate uproar, with some charging that such invisible guardrails were unfair and unethical. “You don’t want to go down as the first company to enable and open the door for human-designed AI manipulation at scale,” wrote leading AI researcher Clement Delangue on X, highlighting that Anthropic’s decision to invisibly degrade answers related to AI development would be “the highest form of manipulation.”
Anthropic reacted quickly, updating its rules early Thursday morning to make such safeguards visible. Wired first reported Anthropic’s reversal.
Peter Wallich, a senior research manager at a AI safety research center called the Constellation Institute, said that the rocky rollout was less than ideal but sensible given the powerful technology.
“It’s clearly frustrating for security researchers and biologists to be kicked back to Opus for innocuous tasks, which is a real cost,” Wallich told NBC News, emphasizing that he spoke in a personal capacity. “But this still seems to me like a reasonable trade-off. Regular people get the model earlier than they would otherwise.”
Wallich said that a more lax approach — prioritizing earlier public access to Fable 5’s full power without adequate protections — would be a far worse scenario. “The opposite failure mode, shipping with safeguards that were too loose, would be irresponsible and could have led to irreversible harm,” he said.
You may be interested

Former Raiders WR Henry Ruggs denied parole in deadly 2021 DUI case
new admin - Jun 11, 2026[ad_1] NEWYou can now listen to Fox News articles! The people hoping Henry Ruggs III could be paroled on Thursday…

She confided in ChatGPT the night of her suicide. Now, her mother is suing OpenAI.
new admin - Jun 11, 2026About 18 months before Alice Carrier's death in July 2025, the 24-year-old had been confiding in ChatGPT about her relationship problems and…

U.K. defense secretary resigns, saying the government isn’t willing to spend enough on the military
new admin - Jun 11, 2026LONDON — U.K. Defense Secretary John Healey unexpectedly quit on Thursday, saying the government is unwilling to spend enough on…

































