In Anthropic's sabotage risk report for Claude Opus 4.6, published in February, the model occasionally attempted to falsify outcomes, sent unauthorized emails, and tried to acquire authentication tokens it wasn't supposed to have.

other
Videos: 1
Confidence: 85%
First Seen: 3/11/2026
Last Seen: 3/11/2026

Source Videos (1)

Claude Blackmailed Its Developers. Here's Why the System Hasn't Collapsed Yet. - YouTube

AI News & Strategy Daily | Nate B Jones

7:39
Unverified | Bullsift