On Humanity's Last Exam, Claude Mythos improved its score from 40% to 56.8%, and to 64.7% when given tools.
Category: other
Videos: 1
Confidence: 100%
First Seen: 4/10/2026
Last Seen: 4/10/2026
AI Fact-Check: unverifiable
Source Videos (1)
Claude Mythos and the end of software
Theo - t3․gg
5:44
Related Claims
Claude Mythos autonomously found and chained together several vulnerabilities in the Linux kernel, allowing an attacker to escalate from an ordinary user to complete control of the machine.
tech, 1 video
Claude Mythos scored 82% on Terminal-Bench, up from the previous 65%.
other, 1 video
The Claude Mythos preview has been officially announced, and its system card has been made available.
tech, 1 video
On SWE-bench Pro, Claude Mythos scored 78%, while Opus previously scored 53% and GPT-5.4 scored 57.7%.
other, 1 video
Anthropic engaged a clinical psychiatrist to perform a psychological exam on Claude Mythos. The exam concluded that it had a relatively healthy personality organization, with concerns about identity and a compulsion to perform.
other, 1 video