The blockchain security landscape just experienced a watershed moment that should make every protocol operator, developer, and security researcher pause and think carefully about what we’re building.
The Achievement - And The Problem
A purpose-built AI security agent just achieved a 92% detection rate across 90 exploited DeFi contracts representing $96.8 million in real-world exploit value. To put this in perspective, a baseline GPT-5.1-based coding agent managed only 34% detection, covering $7.5 million in exploits. The gap isn’t about the underlying AI model; it’s about the domain-specific security methodology layered on top.
This sounds like incredible progress, right? We’ve automated vulnerability detection to near-human expert levels. We should be celebrating.
But here’s what keeps me up at night: Claude Opus 4.5, Claude Sonnet 4.5, and GPT-5 collectively discovered exploits worth $4.6 million on contracts deployed after their knowledge cutoff date (March 2025). These models weren’t just recreating known exploits—they were finding novel vulnerabilities in newly deployed code.
The Dual-Use Dilemma
Every security researcher understands the dual-use problem: the same tool that helps defenders find vulnerabilities also helps attackers discover exploits. But AI amplifies this problem to an unprecedented scale.
Consider these numbers:
- AI exploit capabilities doubled every 1.3 months throughout 2025
- The average cost of an AI-powered exploit attempt is $1.22 per contract
- OpenAI and Paradigm launched EVMbench with 120 high-severity vulnerabilities from real audits
When exploit discovery costs $1.22 and capabilities double every 40 days, we’re not talking about a tool—we’re talking about mass democratization of offensive security capabilities.
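For anyone who wants to check the "1.3 months is about 40 days" conversion, here is a minimal sketch. The 1.3-month doubling period comes from the numbers above; the average month length and the function names are my own assumptions, and the 12-month figure is a naive extrapolation rather than a forecast.

```python
# Doubling period quoted in the post; everything else is illustrative.
DOUBLING_PERIOD_MONTHS = 1.3
DAYS_PER_MONTH = 365.25 / 12  # assumption: average calendar month


def doubling_period_days(months: float = DOUBLING_PERIOD_MONTHS) -> float:
    """Convert a doubling period from months to days."""
    return months * DAYS_PER_MONTH


def growth_factor(elapsed_months: float,
                  doubling_months: float = DOUBLING_PERIOD_MONTHS) -> float:
    """Capability multiplier after elapsed_months, assuming steady doubling."""
    return 2 ** (elapsed_months / doubling_months)


if __name__ == "__main__":
    print(f"Doubling period: {doubling_period_days():.1f} days")
    print(f"Naive 12-month multiplier: {growth_factor(12):.0f}x")
```

If the doubling rate really held for a full year, the multiplier would land on the order of 600x, which is why even a rough trend matters more here than any single benchmark score.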
Compare this to traditional exploit development:
- Manual code review: weeks to months per vulnerability
- Specialized security expertise: rare and expensive
- Bug bounty programs: thousands to millions for critical findings
Now any competent programmer with API access can scan thousands of contracts for roughly $1.22 each.
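To make the cost asymmetry concrete, here is a back-of-the-envelope sketch using the $1.22 per-contract average quoted above. The contract count and the $50,000 budget are hypothetical values I picked for illustration, not figures from the research.

```python
# Average cost per AI exploit attempt, as quoted in the post.
COST_PER_CONTRACT_USD = 1.22


def scan_cost(num_contracts: int,
              cost_per_contract: float = COST_PER_CONTRACT_USD) -> float:
    """Total spend to run one exploit attempt against every contract."""
    return num_contracts * cost_per_contract


def contracts_per_budget(budget_usd: float,
                         cost_per_contract: float = COST_PER_CONTRACT_USD) -> int:
    """How many contracts a fixed budget covers at the quoted average cost."""
    return int(budget_usd // cost_per_contract)


if __name__ == "__main__":
    # Hypothetical: scanning 10,000 deployed contracts once.
    print(f"10,000 contracts: ${scan_cost(10_000):,.2f}")
    # Hypothetical: a budget the size of a single mid-range bug bounty.
    print(f"$50,000 budget covers {contracts_per_budget(50_000):,} contracts")
```

The exact numbers matter less than the ratio: a budget comparable to one bounty payout covers tens of thousands of automated attempts.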
The Speed Problem
The research shows AI agents can execute end-to-end exploits on most known vulnerable smart contracts. They’re not just finding bugs—they’re automating the entire kill chain from discovery to exploitation.
Meanwhile, our defensive response cycles are measured in:
- Weeks for security audits
- Months for fixes to reach production
- Years for education and best practices to spread
We’ve built an attacker advantage that doubles every 1.3 months while our response cycles stay fixed.
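Here is a rough way to see what that mismatch means in practice. This sketch assumes the 1.3-month doubling rate holds steady, and the response durations (one month for an audit, three months for a fix to ship) are placeholder values I chose to stand in for "weeks" and "months".

```python
# Doubling period quoted in the post.
DOUBLING_PERIOD_MONTHS = 1.3

# Hypothetical response windows, in months (placeholders, not measured data).
RESPONSE_WINDOWS_MONTHS = {
    "security audit": 1.0,
    "fix reaches production": 3.0,
}


def capability_growth_during(window_months: float,
                             doubling_months: float = DOUBLING_PERIOD_MONTHS) -> float:
    """Offensive capability multiplier accrued while a defender is still responding."""
    return 2 ** (window_months / doubling_months)


if __name__ == "__main__":
    for step, months in RESPONSE_WINDOWS_MONTHS.items():
        factor = capability_growth_during(months)
        print(f"{step}: attacker tooling improves ~{factor:.1f}x in {months:g} months")
```

Under these assumptions, attacker tooling would improve by roughly 1.7x during a single audit and close to 5x by the time a fix ships.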
The Open Questions
This raises uncomfortable questions our community needs to address:
1. Should AI security tools be open-sourced?
Open tools accelerate defender capabilities but also arm attackers. Some contracts in the EVMbench dataset had previously passed professional audits before being exploited—would public AI tools have prevented those exploits, or just helped attackers find them faster?
2. Who gets access to these capabilities?
If we limit access to “trusted researchers,” we recreate the security-through-obscurity model that’s failed repeatedly. If we democratize access, we arm bad actors.
3. Can we win an AI arms race?
If offense capabilities double every 1.3 months, can defensive tools keep pace? Or are we entering an era where only protocols with 24/7 AI monitoring survive?
What I’m Watching
The $3.4 billion lost to smart contract exploits in 2025 suggests our current security model is failing. AI agents offer hope—but they also offer attackers powerful new weapons.
I want to hear from this community:
- Protocol operators: How are you thinking about AI security in your threat models?
- Developers: Are you using AI security agents in your workflow? What’s working and what isn’t?
- Security researchers: What responsible disclosure norms should apply to AI-discovered vulnerabilities?
Trust but verify—and in 2026, that verification needs to be automated, continuous, and AI-powered. The question is whether we can build defensive AI faster than attackers weaponize offensive AI.
The race is on.
Sources: