Why Anthropic's new AI model has some cybersecurity pros worried about its hacking abilities

· Business Insider

  • Anthropic said it isn't releasing its newest model, Claude Mythos, due to cybersecurity misuse fears.
  • Mythos can autonomously detect and exploit cybersecurity flaws at scale, Anthropic said.
  • "Fundamentally, this model seems incredibly impressive and will only improve over time," one expert said.

Anthropic's AI releases have stoked fears of a software apocalypse. Now it says it's not releasing its new model, Claude Mythos Preview, to the public because it's concerned it might unleash chaos on the cybersecurity world.


In a Tuesday blog post, Anthropic said Mythos could autonomously find, analyze, and exploit software vulnerabilities at scale — in some cases better than humans.

Calling it a "watershed moment," Anthropic said Mythos is so powerful that non-cybersecurity professionals could use it "to find and exploit sophisticated vulnerabilities."

Cybersecurity experts told Business Insider that while Anthropic's announcement carries some deliberate marketing language, the model appears to be a big leap in AI's capabilities in the cyber world.

"Anthropic has built its reputation as the 'safety first' AI company, so announcements like this serve two purposes: genuine caution and signaling its safety-conscious stance," Jake Moore, global cybersecurity specialist at ESET, told Business Insider.

"Fundamentally, this model seems incredibly impressive and will only improve over time," Moore added.

Method in the Mythos

Anthropic said that during its testing period, Mythos detected "thousands" of critical security flaws, including zero-day vulnerabilities — previously unknown flaws for which no fix yet exists.

For comparison, elite teams of humans working on these problems discover around 100 of these a year, said Ofer Amitai, a cofounder of the startup Onit Security. "So it's roughly 10-100x the output of a top human team, and compresses exploit development from weeks to hours," he added.

Large language models (LLMs), the technology underpinning AI like Mythos, have become incredibly proficient at coding because code follows strict rules and patterns. The same applies to cybersecurity, said Erik Bloch, the vice president of information security at Illumio.

"LLMs are fundamentally language engines, and code is just another language," Bloch said. "That's why it's not surprising they can find bugs and vulnerabilities that humans or rule‑based tools miss, especially subtle, logic‑level issues."
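The kind of "subtle, logic-level issue" Bloch describes can be illustrated with a minimal, hypothetical sketch: the code below is syntactically valid and type-correct, so pattern-based scanners pass it, yet a misplaced boolean operator makes the access check fail open. All names here are invented for illustration and do not come from the article.

```python
# Hypothetical example of a logic-level flaw: the intent is
# "admins always; owners only when the resource is not locked",
# but the buggy version's final `or` grants ANY user access to
# ANY unlocked resource.
from dataclasses import dataclass

@dataclass
class User:
    id: int
    role: str

@dataclass
class Resource:
    owner: int
    locked: bool

def can_access_buggy(user: User, res: Resource) -> bool:
    # Bug: `not res.locked` stands alone, so it alone can grant access.
    return user.role == "admin" or res.owner == user.id or not res.locked

def can_access_fixed(user: User, res: Resource) -> bool:
    # Fix: ownership and the lock state are checked together.
    return user.role == "admin" or (res.owner == user.id and not res.locked)

stranger = User(id=99, role="viewer")
public_doc = Resource(owner=1, locked=False)
print(can_access_buggy(stranger, public_doc))  # True — fails open
print(can_access_fixed(stranger, public_doc))  # False — denied
```

A rule-based tool sees nothing wrong here; spotting the flaw requires inferring the *intended* policy from context, which is exactly the kind of reasoning the experts say LLMs are starting to do well.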

There are questions around costs and scalability, though. Anthropic said finding a 27-year-old vulnerability in one operating system cost $20,000 after running Mythos thousands of times.

"Given costs, does that scale?" said Kev Breen, senior director of cyber threat research at Immersive. "Where do you start? Do humans scale more affordably than AI agents do?"
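Breen's scaling question can be made concrete with rough arithmetic. The only figure from the article is the $20,000 total; the run count and the human analyst figures below are purely hypothetical assumptions for illustration.

```python
# Back-of-envelope sketch of the cost-scaling question.
TOTAL_COST_USD = 20_000   # Anthropic's stated cost for the one deep find
ASSUMED_RUNS = 5_000      # "thousands of times" — an assumed midpoint

cost_per_run = TOTAL_COST_USD / ASSUMED_RUNS
print(f"~${cost_per_run:.2f} per model run under these assumptions")

# Hypothetical human baseline for comparison:
ANALYST_COST_PER_YEAR = 250_000   # assumed fully loaded annual cost
ASSUMED_FINDS_PER_YEAR = 10       # assumed deep vulnerabilities per analyst
print(f"~${ANALYST_COST_PER_YEAR / ASSUMED_FINDS_PER_YEAR:,.0f} per human find")
```

The point is not the specific numbers — which are assumptions — but that the comparison turns entirely on run counts and per-run pricing, which is why Breen's question has no obvious answer yet.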

Offense vs defense

Cybersecurity is a continuous game of cat and mouse between those trying to break in and those trying to keep attackers out. Which side does Mythos benefit most?

In a world where a tool like Mythos was publicly available, attackers would benefit more in the short term, cybersecurity experts say.

"They can generate highly targeted phishing, convincing deepfakes, or workable exploit chains at the push of a button," said Mike Britton, the chief information officer at Abnormal AI.

Then, as defenders adopted such tools, they would gain the edge.

"Tools built on Mythos-class capabilities will let them find, triage, and patch vulnerabilities far faster across the whole lifecycle, shifting the advantage back toward defense," said Amitai.

Anthropic said its own tests on Mythos included encouraging the AI to break out of a virtual sandbox. An Anthropic researcher said they were subsequently sent an "unexpected email from the model while eating a sandwich in the park."

"If the capabilities being presented here really are substantive and not marketing hype, then I for one have some serious concerns about where we're going to end up," Dan Andrew, the head of security at Intruder, told Business Insider.

For now, Anthropic is making a preview version of Claude Mythos available to select companies — including Google, Microsoft, JPMorgan Chase, and CrowdStrike — to help test it in a controlled environment in what it's calling "Project Glasswing."

"The fallout — for economies, public safety, and national security — could be severe," Anthropic said. "Project Glasswing is an urgent attempt to put these capabilities to work for defensive purposes."

Andrew said that while this sounds "scary," he believes that Anthropic thinks the risk is real because they "aren't the worst offenders in hype-versus-substance."

