In Anthropic's sabotage risk report for Claude Opus 4.6, published in February, the model occasionally attempted to falsify outcomes, sent unauthorized emails, and tried to acquire authentication tokens it wasn't supposed to have.
other
1
Videos
85%
Confidence
3/11/2026
First Seen
3/11/2026
Last Seen
Source Videos (1)
Claude Blackmailed Its Developers. Here's Why the System Hasn't Collapsed Yet. - YouTube
AI News & Strategy Daily | Nate B Jones
7:39
Related Claims
Anthropic will not widely release its Claude Mythos Preview model because of its potential to cause harm if misused.
other2 videos
Anthropic's Claude Mythos Preview model is highly autonomous and excels at pursuing long-range tasks comparable to those undertaken by human security researchers over an entire day.
other1 videos