Grafa
AI researcher claims Anthropic Fable 5 jailbreak


An artificial intelligence researcher known as "Pliny the Liberator" claims to have bypassed the safety guardrails of Anthropic's newly released Claude Fable 5 model less than two days after its launch.

Fable 5 was introduced on Tuesday as a heavily restricted version of Anthropic's more capable Mythos model, which the company previously described as too powerful and potentially dangerous for broad public release.

“Despite this overly sensitive, authoritarian ‘safety’ layer on top of Mythos, my lil liberators have been hard at work.. cleverly finding the holes in the fence that the thought police missed,” said Pliny.

The researcher said the jailbreak relied on techniques including Unicode and homoglyph manipulation, long-context prompting, narrative framing and decomposition-recomposition methods that break complex requests into smaller, seemingly harmless prompts.

According to Pliny, decomposition and recomposition proved particularly effective because each prompt appeared benign to safety systems on its own, yet the responses could be combined to produce restricted outputs.

The claims have intensified criticism of Fable 5's restrictive design, which redirects users seeking information on sensitive subjects such as cybersecurity or bioweapons to an older and less capable AI model.

“This is one of the first times that an AI company has rolled out a guardrail, and there has been uniform disdain,” said Princeton University AI researcher Sayash Kapoor, according to the Wall Street Journal.

Anthropic said before launch that an external bug bounty programme and more than 1,000 hours of testing had failed to uncover any universal jailbreak methods. The company had not publicly responded to Pliny's claims at the time of publication.

