The first AI-orchestrated cyberattack: a shift in the security landscape

In September 2025, Anthropic, the company behind Claude, identified what may be considered the first cyberattack largely automated with AI. It targeted roughly 30 organizations and marks a significant shift in how cybercriminals operate.

Episode 408 of the Segurança Legal podcast discusses Anthropic's report on a sophisticated cyberattack operation that used artificial intelligence to automate several stages of an intrusion. The most concerning aspect is not the attack itself but what it represents: advanced intrusion techniques are becoming easier to carry out, so people with less technical knowledge can execute complex operations that previously required specialized expertise.

How the attack worked

The attackers, whom Anthropic identified with high confidence as a Chinese state-sponsored group, built a framework of AI agents, each equipped with specific tools for one phase of the attack. Using the Model Context Protocol (MCP), they created interfaces that let the AI run common security tools automatically.

The attack was divided into four main phases: reconnaissance (network scanning and data gathering), vulnerability identification (searching for exploitation tools), actual exploitation of the flaws, and lateral movement (credential harvesting and access to other systems). "They broke the entire attack into multiple stages and kind of isolated the agents to prevent the model from detecting malicious activity happening," explains Vinícius Serafim.

What are AI agents and why do they matter

To grasp the scale of this threat, it is essential to understand the concept of AI agents. Unlike a simple ChatGPT query, an agent is an AI equipped with tools that let it interact with the digital environment autonomously. As Guilherme Goulart highlights: "AI is still very dependent on what we ask it to do, and these movements we have seen recently show that we are still taking the first steps toward greater AI autonomy."

In this attack, the criminals gave the model widely available open-source tools, such as Nmap for port scanning, and wrote carefully crafted prompts to bypass the model's safeguards. These tools, traditionally operated by hand, were integrated through interfaces that let the AI run them, analyze the results, and decide on next steps.

Key insights from the episode

Automation as the new standard: This was not a case of AI acting autonomously or "going rogue," but rather a powerful tool in the hands of experienced criminals. The major innovation lies in the ability to automate complex processes that previously required constant human intervention, enabling attacks at unprecedented scale.

Lowering the technical barrier: As Vinícius Serafim warns, "it lowered the bar for more sophisticated attacks, because now people with less knowledge will be able to carry out more complex attacks that they couldn't do before." This means that the so-called "script kiddie" — an individual with limited knowledge who runs ready-made tools — now has access to capabilities that were previously exclusive to specialists.

Strategic segmentation: The attackers deliberately divided the attack into isolated stages, using different agents for each phase. This strategy prevented the model's security mechanisms from identifying the malicious nature of the operation as a whole, since each individual task appeared legitimate.
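Seen from the defender's side, the effect of this segmentation can be illustrated with a toy model. The sketch below is not Anthropic's detection logic: the phase labels, risk scores, thresholds, and operator identifiers are all hypothetical, and it assumes that requests can be attributed to a single operator, which the isolation of agents is precisely meant to make difficult.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only; real safeguards are far richer.
TASK_THRESHOLD = 0.7   # a single request at or above this score is flagged
SESSION_PHASES = 3     # an operator covering this many distinct phases is flagged


@dataclass(frozen=True)
class Task:
    operator: str   # who submitted the request (hypothetical identifier)
    phase: str      # coarse label, e.g. "reconnaissance"
    risk: float     # per-request risk score in [0, 1]


def flagged_per_task(tasks):
    """Return tasks that look suspicious when judged in isolation."""
    return [t for t in tasks if t.risk >= TASK_THRESHOLD]


def flagged_operators(tasks):
    """Return operators whose combined requests span many attack phases."""
    phases = {}
    for t in tasks:
        phases.setdefault(t.operator, set()).add(t.phase)
    return sorted(op for op, seen in phases.items() if len(seen) >= SESSION_PHASES)


tasks = [
    Task("op-1", "reconnaissance", 0.3),
    Task("op-1", "vulnerability identification", 0.4),
    Task("op-1", "exploitation", 0.5),
    Task("op-1", "lateral movement", 0.5),
    Task("op-2", "reconnaissance", 0.3),
]

print(flagged_per_task(tasks))    # no single task crosses the threshold
print(flagged_operators(tasks))   # the aggregate view singles out op-1
```

In this toy data every request stays below the per-task threshold, so only the check that looks across requests notices that one operator has touched all four phases.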

Platform independence: Although the attack used Anthropic's Claude, the framework could easily be adapted to any other AI model, whether GPT, Gemini, or even open-source models running locally. "These guys switch to ChatGPT and will have to adjust maybe one prompt or another, but they can use GPT, they can use Gemini, they can use Grok," observes Serafim.

Defensive asymmetry: Cyber defense has always been an asymmetric game, but AI intensifies this imbalance. As Guilherme Goulart explains: "The defender has to find all the holes, all the possible entry points into an environment. The attacker only needs to find one." With AI accelerating the ability to identify and exploit vulnerabilities, this asymmetry becomes even more pronounced.
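A back-of-the-envelope calculation makes the asymmetry concrete. The sketch below assumes, purely for illustration, that every entry point is secured independently and with the same probability; neither assumption comes from the episode, and the function name is hypothetical.

```python
def breach_probability(entry_points: int, secured_probability: float) -> float:
    """Chance that at least one entry point is left open.

    Assumes every entry point is secured independently with the same
    probability, which is a simplification made for this example.
    """
    if entry_points < 0:
        raise ValueError("entry_points must be non-negative")
    if not 0.0 <= secured_probability <= 1.0:
        raise ValueError("secured_probability must be between 0 and 1")
    return 1.0 - secured_probability ** entry_points


for n in (1, 10, 100):
    print(n, round(breach_probability(n, 0.99), 3))
```

Under these assumptions, a defender who secures each entry point 99% of the time still leaves roughly a 63% chance of at least one opening across 100 entry points, and the attacker only needs that one.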

The future of the threat

This attack represents just the beginning of a new era in cybersecurity. So far, the automation has focused on known methodologies and already documented vulnerabilities, but the inevitable next step will be the discovery of zero-day vulnerabilities: unknown flaws that have no available fix.

"You always assume that you're using the worst AI, that the AI you're using today is the worst that has ever existed. Things will only get better," warns Serafim, citing Ethan Mollick's book. As models become more powerful and accessible, including versions that can be run locally without security restrictions, the ability to carry out sophisticated attacks will continue to expand.

Lessons for organizations

For companies and security professionals, this incident serves as an urgent wake-up call. The automation of attacks means that known vulnerabilities in outdated systems will be exploited faster and at greater scale. Organizations need to prioritize system updates, implement robust threat detection strategies, and consider using AI to strengthen their defenses as well.
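One way to act on the update priority is to rank systems by exposure and by how long a known fix has gone unapplied. The sketch below is a minimal illustration under stated assumptions: the inventory, host names, dates, and the ranking rule itself are hypothetical, and a real program would also weigh severity and business impact.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class System:
    name: str               # hypothetical host name
    internet_facing: bool   # reachable from outside the network
    fix_released: date      # when the fix for its oldest open flaw shipped


def patch_order(systems, today):
    """Sort systems so exposed hosts with the oldest unapplied fixes come first."""
    def key(s):
        days_unpatched = (today - s.fix_released).days
        # False sorts before True, so exposed hosts lead; negating the day
        # count puts the longest-unpatched systems first within each group.
        return (not s.internet_facing, -days_unpatched)
    return sorted(systems, key=key)


inventory = [
    System("intranet-wiki", False, date(2025, 1, 10)),
    System("vpn-gateway", True, date(2025, 6, 1)),
    System("public-web", True, date(2025, 3, 15)),
]

for s in patch_order(inventory, today=date(2025, 11, 20)):
    print(s.name)
```

With this sample data the internet-facing hosts come first, oldest pending fix first, and the internal wiki comes last even though its fix has been available the longest.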

Anthropic published the full report with technical details and is using the lessons learned to reinforce Claude's security mechanisms. However, as the hosts highlight, the solution lies not only in the hands of AI companies, but also in organizations that need to recognize and adapt to this new threat reality.

Conclusion

The attack analyzed in this episode represents a historic milestone in the evolution of cyber threats. This is not science fiction or a malicious AI acting on its own, but rather sophisticated criminals who found ways to amplify their capabilities through intelligent automation.

As the podcast emphasizes, "this is just the beginning." Organizations that do not take this transformation in the threat landscape seriously will become ever more vulnerable to automated, scalable, and increasingly sophisticated attacks.

Want to stay up to date on the latest trends in information security and data protection? Access the full Anthropic report in the episode's show notes and consider supporting the Segurança Legal podcast through Apoia.se to ensure the continuity of this independent knowledge production project.