4 posts tagged with "Smart Contracts"

Smart contract development and security

BTCFi Awakening: The Race to Bring DeFi to Bitcoin

· 10 min read
Dora Noda
Software Engineer

Bitcoin has sat on the sidelines of the DeFi revolution for years. While Ethereum and its Layer 2 ecosystem accumulated over $100 billion in total value locked, Bitcoin—the original cryptocurrency with a $1.7 trillion market cap—remained largely idle. Less than 1% of all BTC is currently utilized in DeFi applications.

That's changing fast. The BTCFi (Bitcoin DeFi) sector has exploded 22x from $300 million in early 2024 to over $7 billion by mid-2025. More than 75 Bitcoin Layer 2 projects are now competing to transform BTC from "digital gold" into a programmable financial layer. The question isn't whether Bitcoin will have DeFi—it's which approach will win.

The Problem BTCFi Solves

To understand why dozens of teams are racing to build Bitcoin Layer 2s, you need to understand Bitcoin's fundamental limitation: it wasn't designed for smart contracts.

Bitcoin's scripting language is intentionally simple. Satoshi Nakamoto prioritized security and decentralization over programmability. This made Bitcoin incredibly robust—no major protocol hack in 15 years—but it also meant that anyone wanting to use BTC in DeFi had to wrap it first.

Wrapped Bitcoin (WBTC) became the de facto standard for bringing Bitcoin to Ethereum. At its peak, over $14 billion worth of WBTC circulated through DeFi protocols. But wrapping introduced serious risks:

  • Custodian risk: BitGo and other custodians hold the actual Bitcoin, creating single points of failure
  • Smart contract risk: The March 2023 Euler Finance hack resulted in $197 million in losses, including significant WBTC
  • Bridging risk: Cross-chain bridges have been responsible for some of the largest DeFi exploits in history
  • Centralization: The 2024 WBTC custody controversy, involving Justin Sun and multi-jurisdictional restructuring, shook user confidence

BTCFi promises to let Bitcoin holders earn yield, lend, borrow, and trade without surrendering custody of their BTC to centralized parties.

The Major Contenders

Babylon: The Staking Giant

Babylon has emerged as the dominant force in BTCFi, with $4.79 billion in TVL as of mid-2025. Founded by Stanford professor David Tse, Babylon introduced a novel concept: using Bitcoin to secure Proof-of-Stake networks without wrapping or bridging.

Here's how it works: Bitcoin holders stake their BTC using "Extractable One-Time Signatures" (EOTS). If a validator behaves honestly, the stake remains untouched. If they act maliciously, the EOTS mechanism enables slashing—automatically burning a portion of the staked Bitcoin as punishment.

The genius is that users never give up custody. Their Bitcoin stays on the Bitcoin blockchain, timestamped and locked, while providing economic security to other networks. Kraken now offers Babylon staking with up to 1% APR—modest by DeFi standards, but significant for a trustless Bitcoin yield product.

In April 2025, Babylon launched its own Layer 1 chain and airdropped 600 million BABY tokens to early stakers. More importantly, a partnership with Aave will enable native Bitcoin collateral on Aave V4 by April 2026—potentially the most significant bridge between Bitcoin and DeFi yet.

Lightning Network: The Payment Veteran

The oldest Bitcoin Layer 2 is experiencing a renaissance. Lightning Network capacity hit an all-time high of 5,637 BTC (roughly $490 million) in late 2025, reversing a year-long decline.

Lightning excels at what it was designed for: fast, cheap payments. Transaction success rates exceed 99.7% in controlled deployments, with settlement times under 0.5 seconds. The 266% year-over-year increase in transaction volume reflects growing merchant adoption.

But Lightning's growth is increasingly institutional. Large exchanges like Binance and OKX have deposited significant BTC into Lightning channels, while the number of individual nodes has actually declined from 20,700 in 2022 to around 14,940 today.

Lightning Labs' Taproot Assets upgrade opens new possibilities, allowing stablecoins and other assets to be issued on Bitcoin and transferred via Lightning. Tether's $8 million investment in Lightning startup Speed signals institutional interest in stablecoin payments over the network. Some analysts project Lightning could handle 30% of all BTC transfers for payments and remittances by the end of 2026.

Stacks: The Smart Contract Pioneer

Stacks has been building Bitcoin smart contract infrastructure since 2017, making it the most mature programmable Bitcoin layer. Its Clarity programming language was specifically designed for Bitcoin, enabling developers to build DeFi protocols that inherit Bitcoin's security.

TVL on Stacks exceeded $600 million by late 2025, driven primarily by sBTC—a decentralized Bitcoin peg—and the ALEX decentralized exchange. Stacks anchors its state to Bitcoin through a process called "stacking," where STX token holders earn BTC rewards for participating in consensus.

The trade-off is speed. Stacks block times follow Bitcoin's 10-minute rhythm, making it less suitable for high-frequency trading applications. But for lending, borrowing, and other DeFi primitives that don't require split-second execution, Stacks offers battle-tested infrastructure.

BOB: The Hybrid Approach

BOB (Build on Bitcoin) takes a different approach: it's simultaneously an Ethereum rollup (using the OP Stack) and a Bitcoin-secured network (via Babylon integration).

This hybrid architecture gives developers the best of both worlds. They can build using familiar Ethereum tools while settling to both Bitcoin and Ethereum for enhanced security. BOB's upcoming BitVM bridge promises trust-minimized BTC transfers without relying on custodians.

The project has attracted significant developer interest, though TVL remains smaller than the leaders. BOB represents a bet that the future of BTCFi will be multi-chain rather than Bitcoin-native.

Mezo: The HODL Economy

Mezo, backed by Pantera Capital and Multicoin, introduced an innovative "Proof of HODL" consensus mechanism. Instead of rewarding validators or stakers, Mezo rewards users for locking BTC to secure the network.

The HODL Score system quantifies user commitment based on deposit size and duration—locking for 9 months yields 16x rewards compared to shorter periods. This creates natural alignment between network security and user behavior.

Mezo's TVL surged to $230 million in early 2025, driven by its EVM compatibility, which allows Ethereum developers to build BTCFi applications with minimal friction. Partnerships with Swell and Solv Protocol have expanded its ecosystem.

The Numbers: BTCFi by the Data

The BTCFi landscape can be confusing. Here's a clear snapshot:

Total BTCFi TVL: $7-8.6 billion (depending on measurement methodology)

Top Projects by TVL:

  • Babylon Protocol: ~$4.79 billion
  • Merlin Chain: ~$1.7 billion
  • Hemi: ~$1.2 billion
  • Lombard: ~$1 billion
  • Stacks: ~$600 million
  • Core: ~$400 million
  • Mezo: ~$230 million

Growth Rate: 2,700% increase from $307 million in early 2024 to $8.6 billion by Q2 2025

Bitcoin in BTCFi: 91,332 BTC (approximately 0.46% of all Bitcoin in circulation)

Funding Landscape: 14 public Bitcoin L2 financings totaling over $71.1 million, with Mezo's $21 million Series A being the largest

The TVL Controversy

Not all TVL claims are created equal. In January 2025, leading Bitcoin ecosystem projects including Nubit, Nebra, and Bitcoin Layers published a "Proof of TVL" report exposing widespread problems:

  • Double counting: The same Bitcoin counted across multiple protocols
  • Fake locking: TVL claims without actual on-chain verification
  • Opaque methodology: Inconsistent measurement standards across projects

This matters because inflated TVL numbers attract investors, users, and developers based on false premises. The report called for standardized asset transparency verification—essentially, proof of reserves for BTCFi.

For users, the implication is clear: dig deeper than headline TVL numbers when evaluating Bitcoin L2 projects.

What's Missing: The Catalyst Problem

Despite impressive growth, BTCFi faces a fundamental challenge: it hasn't found its killer application yet.

The Block's 2026 Layer 2 Outlook noted that "launching the same existing primitives seen on EVM-based L2s on a BTC chain is not enough to attract liquidity or developers." Bitcoin L2 TVL actually declined 74% from its 2024 peak, even as headline BTCFi numbers grew (largely due to Babylon's staking product).

The Ordinals narrative that sparked the 2023-2024 Bitcoin L2 boom has faded. BRC-20 tokens and Bitcoin NFTs generated excitement but not sustainable economic activity. BTCFi needs something new.

Several potential catalysts are emerging:

Native Bitcoin Lending: Babylon's BTCVaults initiative and the Aave V4 integration could enable Bitcoin-collateralized borrowing without wrapping—a massive market if it works trustlessly.

Trustless Bridges: BitVM-based bridges like BOB's could finally solve the wrapped Bitcoin problem, though the technology remains unproven at scale.

Stablecoin Payments: Lightning Network's Taproot Assets could enable cheap, instant stablecoin transfers with Bitcoin's security, potentially capturing remittance and payments markets.

Institutional Custody: Coinbase's cbBTC and other regulated alternatives to WBTC could bring institutional capital that has avoided BTCFi due to custody concerns.

The Elephant in the Room: Security

Bitcoin L2s face a fundamental tension. Bitcoin's security comes from its simplicity—any added complexity introduces potential vulnerabilities.

Different L2s handle this differently:

  • Babylon keeps Bitcoin on the main chain, using cryptographic proofs rather than bridges
  • Lightning uses payment channels that can always be settled back to Layer 1
  • Stacks anchors state to Bitcoin but has its own consensus mechanism
  • BOB and others rely on various bridge designs with different trust assumptions

None of these approaches are perfect. The only way to use Bitcoin with zero additional risk is to hold it in self-custody on Layer 1. Every BTCFi application introduces some trade-off.

For users, this means understanding exactly what risks each protocol introduces. Is the yield worth the smart contract risk? Is the convenience worth the bridging risk? These are individual decisions that require informed evaluation.

The Road Ahead

The BTCFi race is far from decided. Several scenarios could play out:

Scenario 1: Babylon Dominance. If Babylon's staking model continues to grow and its lending products succeed, it could become the de facto BTCFi infrastructure layer—the Lido of Bitcoin.

Scenario 2: Lightning Evolution. Lightning Network could evolve beyond payments into a full financial layer, especially if Taproot Assets gains traction for stablecoins and tokenized assets.

Scenario 3: Ethereum Integration. Hybrid approaches like BOB or native Bitcoin collateral on Aave V4 could mean BTCFi happens primarily through Ethereum infrastructure, with Bitcoin serving as collateral rather than execution layer.

Scenario 4: Fragmentation. The most likely near-term outcome is continued fragmentation, with different L2s serving different use cases: Lightning for payments, Babylon for staking, Stacks for DeFi, and so on.

What This Means for Bitcoin Holders

For the average Bitcoin holder, BTCFi presents both opportunity and complexity.

The opportunity: Earn yield on idle Bitcoin without selling it. Access DeFi functionality—lending, borrowing, trading—while maintaining BTC exposure.

The complexity: Navigating 75+ projects with varying risk profiles, understanding which TVL claims are legitimate, and evaluating trade-offs between yield and security.

The safest approach is patience. BTCFi infrastructure is still maturing. The projects that survive the next bear market will have proven their security and utility. Early adopters will earn higher yields but face higher risks.

For those who want to participate now, start with the most battle-tested options:

  • Lightning for payments (minimal additional risk)
  • Babylon staking through regulated custodians like Kraken (institutional custody, lower yield)
  • Stacks for those comfortable with smart contract risk on a mature platform

Avoid projects with inflated TVL claims, opaque security models, or excessive token incentives that mask underlying economics.

Conclusion

Bitcoin's DeFi awakening is real, but it's still early. The 22x growth in BTCFi TVL reflects genuine demand from Bitcoin holders who want to put their assets to work. But the infrastructure isn't mature, the killer application hasn't emerged, and many projects are still proving their security models.

The winners of the Bitcoin L2 race will be determined by which projects can attract sustainable liquidity—not through airdrops and incentive programs, but through genuine utility that Bitcoin holders actually want.

We're watching the foundation being laid for a potentially massive market. With less than 1% of Bitcoin currently in DeFi, the room for growth is enormous. But growth requires trust, and trust requires time.

The race is on. The finish line is still years away.


This article is for educational purposes only and should not be considered financial advice. Always conduct your own research before interacting with any DeFi protocol.

Zama Protocol: The FHE Unicorn Building Blockchain's Confidentiality Layer

· 11 min read
Dora Noda
Software Engineer

Zama has established itself as the definitive leader in Fully Homomorphic Encryption (FHE) for blockchain, becoming the world's first FHE unicorn in June 2025 with a $1 billion valuation after raising over $150 million. The Paris-based company doesn't compete with blockchains—it provides the cryptographic infrastructure enabling any EVM chain to process encrypted smart contracts without ever decrypting the underlying data. With its mainnet launched on Ethereum in late December 2025 and the $ZAMA token auction beginning January 12, 2026, Zama sits at a critical inflection point where theoretical cryptographic breakthroughs meet production-ready deployment.

The strategic significance cannot be overstated: while Zero-Knowledge proofs prove computation correctness and Trusted Execution Environments rely on hardware security, FHE uniquely enables computation on encrypted data from multiple parties—solving the fundamental blockchain trilemma between transparency, privacy, and compliance. Institutions like JP Morgan have already validated this approach through Project EPIC, demonstrating confidential tokenized asset trading with full regulatory compliance. Zama's positioning as infrastructure rather than a competing chain means it captures value regardless of which L1 or L2 ultimately dominates.


Technical architecture enables encrypted computation without trust assumptions

Fully Homomorphic Encryption represents a breakthrough in cryptography that has existed in theory since 2009 but only recently became practical. The term "homomorphic" refers to the mathematical property where operations performed on encrypted data, when decrypted, yield identical results to operations on the original plaintext. Zama's implementation uses TFHE (Torus Fully Homomorphic Encryption), a scheme distinguished by fast bootstrapping—the fundamental operation that resets accumulated noise in ciphertexts and enables unlimited computation depth.
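
Stated symbolically, the homomorphic property is (an informal summary; ⊕ and ⊗ denote the ciphertext-space counterparts of addition and multiplication):

```latex
\mathrm{Dec}\bigl(\mathrm{Enc}(a) \oplus \mathrm{Enc}(b)\bigr) = a + b,
\qquad
\mathrm{Dec}\bigl(\mathrm{Enc}(a) \otimes \mathrm{Enc}(b)\bigr) = a \times b
```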

The fhEVM architecture introduces a symbolic execution model that elegantly solves blockchain's performance constraints. Rather than processing actual encrypted data on-chain, smart contracts execute using lightweight handles (pointers) while actual FHE computations are offloaded asynchronously to specialized coprocessors. This design means host chains like Ethereum require no modifications, non-FHE transactions experience no slowdown, and FHE operations can execute in parallel rather than sequentially. The architecture comprises five integrated components: the fhEVM library for Solidity developers, coprocessor nodes performing FHE computation, a Key Management Service using 13 MPC nodes with threshold decryption, an Access Control List contract for programmable privacy, and a Gateway orchestrating cross-chain operations.

Performance benchmarks demonstrate rapid improvement. Bootstrapping latency—the critical metric for FHE—dropped from 53 milliseconds initially to under 1 millisecond on NVIDIA H100 GPUs, with throughput reaching 189,000 bootstraps per second across eight H100s. Current protocol throughput stands at 20+ TPS on CPU, sufficient for all encrypted Ethereum transactions today. The roadmap projects 500-1,000 TPS by end of 2026 with GPU migration, scaling to 100,000+ TPS with dedicated ASICs in 2027-2028. Unlike TEE solutions vulnerable to hardware side-channel attacks, FHE's security rests on lattice-based cryptographic hardness assumptions that provide post-quantum resistance.


Developer tooling has matured from research to production

Zama's open-source ecosystem comprises four interconnected products that have attracted over 5,000 developers, representing approximately 70% market share in blockchain FHE. The TFHE-rs library provides a pure Rust implementation with GPU acceleration via CUDA, FPGA support through AMD Alveo hardware, and multi-level APIs ranging from high-level operations to core cryptographic primitives. The library supports encrypted integers up to 256 bits with operations including arithmetic, comparisons, and conditional branching.

Concrete is a TFHE compiler built on LLVM/MLIR infrastructure that transforms standard Python programs into FHE-equivalent circuits. Developers require no cryptography expertise—they write normal Python code and Concrete handles the complexity of circuit optimization, key generation, and ciphertext management. For machine learning applications, Concrete ML provides drop-in replacements for scikit-learn models that automatically compile to FHE circuits, supporting linear models, tree-based ensembles, and even encrypted LLM fine-tuning. Version 1.8 demonstrated fine-tuning a LLaMA 8B model on 100,000 encrypted tokens in approximately 70 hours.

The fhEVM Solidity library enables developers to write confidential smart contracts using familiar syntax with encrypted types (euint8 through euint256, ebool, eaddress). An encrypted ERC-20 transfer, for example, uses TFHE.le() to compare encrypted balances and TFHE.select() for conditional logic—all without revealing values. The September 2025 partnership with OpenZeppelin established standardized confidential token implementations, sealed-bid auction primitives, and governance frameworks that accelerate enterprise adoption.
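
To make this concrete, here is a minimal sketch of the encrypted transfer pattern just described. TFHE.le(), TFHE.select(), and the encrypted types come from the text; the import path and the TFHE.asEuint64/TFHE.sub/TFHE.add helpers are assumptions that vary across fhEVM releases, so treat this as illustrative rather than canonical:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

// Import path is illustrative; it differs across fhEVM releases.
import "fhevm/lib/TFHE.sol";

contract ConfidentialToken {
    // Balances are stored as encrypted 64-bit integers.
    mapping(address => euint64) internal balances;

    function transfer(address to, euint64 amount) public {
        // Encrypted comparison: does the sender hold at least `amount`?
        ebool canTransfer = TFHE.le(amount, balances[msg.sender]);

        // Branchless conditional: move `amount` if the check passed,
        // otherwise move an encrypted zero. No plaintext value is ever
        // revealed on-chain at any point.
        euint64 moved = TFHE.select(canTransfer, amount, TFHE.asEuint64(0));

        balances[msg.sender] = TFHE.sub(balances[msg.sender], moved);
        balances[to] = TFHE.add(balances[to], moved);
    }
}
```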


Business model captures value as infrastructure provider

Zama's funding trajectory reflects accelerating institutional confidence: a $73 million Series A in March 2024 led by Multicoin Capital and Protocol Labs, followed by a $57 million Series B in June 2025 led by Pantera Capital that achieved unicorn status. The investor roster reads as blockchain royalty—Juan Benet (Filecoin founder and board member), Gavin Wood (Ethereum and Polkadot co-founder), Anatoly Yakovenko (Solana co-founder), and Tarun Chitra (Gauntlet founder) all participated.

The revenue model employs dual licensing under the BSD 3-Clause Clear license: technologies remain free for non-commercial research and prototyping, while production deployment requires purchasing patent usage rights. By March 2024, Zama had signed over $50 million in contract value within six months of commercialization, with hundreds of additional customers in the pipeline. Transaction-based pricing applies for private blockchain deployments, while crypto projects often pay in tokens. The upcoming Zama Protocol introduces on-chain economics: operators stake $ZAMA to qualify for encryption and decryption work, with fees ranging from $0.005 to $0.50 per ZKPoK verification and $0.001 to $0.10 per decryption operation.

The team represents the largest dedicated FHE research organization globally: 96+ employees across 26 nationalities, with 37 holding PhDs (~40% of staff). Co-founder and CTO Pascal Paillier invented the Paillier encryption scheme used in billions of smart cards and received the prestigious IACR Fellowship in 2025. CEO Rand Hindi previously founded Snips, an AI voice platform acquired by Sonos. This concentration of cryptographic talent creates substantial intellectual property moats—Paillier holds approximately 25 patent families protecting core innovations.


Competitive positioning as the picks-and-shovels play for blockchain privacy

The privacy solution landscape divides into three fundamental approaches, each with distinct trade-offs. Trusted Execution Environments (TEEs), used by Secret Network and Oasis Network, offer near-native performance but rely on hardware security with a trust threshold of one—if the enclave is compromised, all privacy breaks. The October 2022 disclosure of TEE vulnerabilities affecting Secret Network underscored these risks. Zero-Knowledge proofs, employed by Aztec Protocol ($100M Series B from a16z), prove computation correctness without revealing inputs but cannot compute on encrypted data from multiple parties—limiting their applicability for shared state applications like lending pools.

FHE occupies a unique position: mathematically guaranteed privacy with configurable trust thresholds, no hardware dependencies, and the crucial ability to process encrypted data from multiple sources. This enables use cases impossible with other approaches—confidential AMMs computing over encrypted reserves from liquidity providers, or lending protocols managing encrypted collateral positions.

Within FHE specifically, Zama operates as the infrastructure layer while others build chains on top. Fhenix ($22M raised) builds an optimistic rollup L2 using Zama's TFHE-rs via partnership, having deployed CoFHE coprocessor on Arbitrum as the first practical FHE implementation. Inco Network ($4.5M raised) provides confidentiality-as-a-service for existing chains using Zama's fhEVM, offering both TEE-based fast processing and FHE+MPC secure computation. Both projects depend on Zama's core technology—meaning Zama captures value regardless of which FHE chain gains dominance. This infrastructure positioning mirrors how OpenZeppelin profits from smart contract adoption without competing with Ethereum directly.


Use cases span DeFi, AI, RWAs, and compliant payments

In DeFi, FHE fundamentally solves MEV (Maximal Extractable Value). Because transaction parameters remain encrypted until block inclusion, front-running and sandwich attacks become mathematically impossible—there is simply no visible mempool data to exploit. The ZamaSwap reference implementation demonstrates encrypted AMM swaps with fully encrypted balances and pool reserves. Beyond MEV protection, confidential lending protocols can maintain encrypted collateral positions and liquidation thresholds, enabling on-chain credit scoring computed over private financial data.

For AI and machine learning, Concrete ML enables privacy-preserving computation across healthcare (encrypted medical diagnosis), finance (fraud detection on encrypted transactions), and biometrics (authentication without revealing identity). The framework supports encrypted LLM fine-tuning—training language models on sensitive data that never leaves encrypted form. As AI agents proliferate across Web3 infrastructure, FHE provides the confidential computation layer ensuring data privacy without sacrificing utility.

Real-World Asset tokenization represents perhaps the largest opportunity. The JP Morgan Kinexys Project EPIC proof-of-concept demonstrated institutional asset tokenization with encrypted bid amounts, hidden investor holdings, and KYC/AML checks on encrypted data—maintaining full regulatory compliance. This addresses the fundamental barrier preventing traditional finance from using public blockchains: the inability to hide trading strategies and positions from competitors. With tokenized RWAs projected as a $100+ trillion addressable market, FHE unlocks institutional participation that private blockchains cannot serve.

Payment and stablecoin privacy completes the picture. The December 2025 mainnet launch included the first confidential stablecoin transfer using cUSDT. Unlike mixing-based approaches (Tornado Cash), FHE enables programmable compliance—developers define access control rules determining who can decrypt what, enabling regulatory-compliant privacy rather than absolute anonymity. Authorized auditors and regulators receive appropriate access without compromising general transaction privacy.


Regulatory landscape creates tailwinds for compliant privacy

The EU's MiCA framework, fully effective since December 30, 2024, creates strong demand for privacy solutions that maintain compliance. The Travel Rule requires crypto asset service providers to share originator and beneficiary data for all transfers, with no de minimis threshold—making privacy-by-default approaches like mixing impractical. FHE's selective disclosure mechanisms align precisely with this requirement: transactions remain encrypted from general observation while authorized parties access necessary information.

In the United States, the July 2025 signing of the GENIUS Act established the first comprehensive federal stablecoin framework, signaling regulatory maturation that favors compliant privacy solutions over regulatory evasion. The Asia-Pacific region continues advancing progressive frameworks, with Hong Kong's stablecoin regulatory regime effective August 2025 and Singapore maintaining leadership in crypto licensing. Across jurisdictions, the pattern favors solutions enabling both privacy and regulatory compliance—precisely Zama's value proposition.

The 2025 enforcement shift from reactive prosecution to proactive frameworks creates opportunity for FHE adoption. Projects building with compliant privacy architectures from inception—rather than retrofitting privacy-first designs for compliance—will find easier paths to institutional adoption and regulatory approval.


Technical and market challenges require careful navigation

Performance remains the primary barrier, though the trajectory is clear. FHE operations currently run approximately 100x slower than plaintext equivalents—acceptable for low-frequency high-value transactions but constraining for high-throughput applications. The scaling roadmap depends on hardware acceleration: GPU migration in 2026, FPGA optimization, and ultimately purpose-built ASICs. The DARPA DPRIVE program funding Intel, Duality, SRI, and Niobium for FHE accelerator development represents significant government investment accelerating this timeline.

Key management introduces its own complexities. The current 13-node MPC committee for threshold decryption requires honest majority assumptions—collusion among threshold nodes could enable "silent attacks" undetectable by other participants. The roadmap targets expansion to 100+ nodes with HSM integration and post-quantum ZK proofs, strengthening these guarantees.

Competition from TEE and ZK alternatives should not be dismissed. Secret Network and Oasis offer production-ready confidential computing with substantially better current performance. Aztec's $100M backing and team that invented PLONK—the dominant ZK-SNARK construction—means formidable competition in privacy-preserving rollups. The TEE performance advantage may persist if hardware security improves faster than FHE acceleration, though hardware trust assumptions create a fundamental ceiling ZK and FHE solutions don't share.


Conclusion: Infrastructure positioning captures value across ecosystem growth

Zama's strategic genius lies in its positioning as infrastructure rather than competing chain. Both Fhenix and Inco—the leading FHE blockchain implementations—build on Zama's TFHE-rs and fhEVM technology, meaning Zama captures licensing revenue regardless of which protocol gains adoption. The dual licensing model ensures open-source developer adoption drives commercial enterprise demand, while the $ZAMA token launching in January 2026 creates on-chain economics aligning operator incentives with network growth.

Three factors will determine Zama's ultimate success: execution on the performance roadmap from 20 TPS today to 100,000+ TPS with ASICs; institutional adoption following the JP Morgan validation; and developer ecosystem growth beyond current 5,000 developers to mainstream Web3 penetration. The regulatory environment has shifted decisively in favor of compliant privacy, and FHE's unique capability for encrypted multi-party computation addresses use cases neither ZK nor TEE can serve.

For Web3 researchers and investors, Zama represents the canonical "picks and shovels" opportunity in blockchain privacy—infrastructure that captures value as the confidential computing layer matures across DeFi, AI, RWAs, and institutional adoption. The $1 billion valuation prices significant execution risk, but successful delivery of the technical roadmap could position Zama as essential infrastructure for the next decade of blockchain development.

Formal Verification of Smart Contracts and AI-Assisted Auditing

· 39 min read
Dora Noda
Software Engineer

Formal Verification in Smart Contract Auditing

Formal verification refers to the use of mathematical and logic-based techniques to prove the correctness and security of smart contracts. In practice, this encompasses a spectrum of methodologies: from property-based fuzz testing and symbolic execution, to rigorous theorem proving and model checking. The goal is to ensure a contract meets its specifications and contains no exploitable bugs across all possible inputs and states. Given the high stakes (billions of dollars are locked in DeFi protocols), formal methods have become increasingly important for Ethereum and other blockchain platforms.

Traditional Approaches: Early formal methods for Ethereum included symbolic execution tools like Oyente and Mythril, and static analyzers like Slither and Securify. Symbolic execution explores program paths with symbolic inputs to detect issues (e.g. reentrancy, integer overflow), while static analysis uses rule-based pattern matching. These tools have had success but also limitations: for example, Oyente suffered many false alarms even on simple contracts, and Slither’s pattern-based detectors can produce several false positives. Moreover, a 2023 study found that over 80% of exploitable contract bugs (especially complex “business logic” bugs) were missed by current tools, underscoring the need for more robust verification techniques.

The Promise and Challenge of Full Verification: In theory, formal verification can prove the absence of bugs by exhaustively checking invariants for all states. Tools like the Certora Prover or Runtime Verification’s KEVM framework aim to mathematically verify smart contracts against a formal specification. For example, Certora’s “automated mathematical auditor” uses a specification language (CVL) to prove or refute user-defined rules. In practice, however, fully proving properties on real-world contracts is often unattainable or very labor-intensive. Code may need to be rewritten into simplified forms for verification, custom specs must be written, loops and complex arithmetic might require manual bounds or abstractions, and SMT solvers frequently time out on complex logic. As Trail of Bits engineers noted, “proving the absence of bugs is typically unattainable” on non-trivial codebases, and achieving it often requires heavy user intervention and expertise. Because of this, formal verification tools have traditionally been used sparingly for critical pieces of code (e.g. verifying a token’s invariant or a consensus algorithm), rather than entire contracts end-to-end.

Foundry’s Fuzz Testing and Invariant Testing

In recent years, property-based testing has emerged as a practical alternative to full formal proofs. Foundry, a popular Ethereum development framework, has built-in support for fuzz testing and invariant testing – techniques that greatly enhance test coverage and can be seen as lightweight formal verification. Foundry’s fuzz testing automatically generates large numbers of inputs to try to violate specified properties, and invariant testing extends this to sequences of state-changing operations:

  • Fuzz Testing: Instead of writing unit tests for specific inputs, the developer specifies properties or invariants that should hold for any input. Foundry then generates hundreds or thousands of random inputs to test the function and checks that the property always holds. This helps catch edge cases that a developer might not manually think to test. For example, a fuzz test might assert that a function’s return value is always non-negative or that a certain post-condition is true regardless of inputs. Foundry’s engine uses intelligent heuristics – it analyzes function signatures and introduces edge-case values (0, max uint, etc.) – to hit corner cases likely to break the property. If an assertion fails, Foundry reports a counterexample input that violates the property.

  • Invariant Testing: Foundry’s invariant testing (also called stateful fuzzing) goes further by exercising multiple function calls and state transitions in sequence. The developer writes invariant functions that should hold true throughout the contract’s lifetime (e.g. total assets = sum of user balances). Foundry then randomly generates sequences of calls (with random inputs) to simulate many possible usage scenarios, periodically checking that the invariant conditions remain true. This can uncover complex bugs that manifest only after a particular sequence of operations. Essentially, invariant testing explores the contract’s state space more thoroughly, ensuring that no sequence of valid transactions can violate the stated properties. A minimal sketch of this pattern appears below.
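
Here is a minimal Foundry sketch of the invariant pattern from the bullet above. The Vault contract is hypothetical; targetContract and the invariant_ naming convention are Foundry's standard invariant-testing harness:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import "forge-std/Test.sol";

// Hypothetical contract under test: users deposit and withdraw ETH,
// and the contract tracks the running sum of deposits.
contract Vault {
    mapping(address => uint256) public deposits;
    uint256 public totalDeposits;

    function deposit() external payable {
        deposits[msg.sender] += msg.value;
        totalDeposits += msg.value;
    }

    function withdraw(uint256 amount) external {
        deposits[msg.sender] -= amount;
        totalDeposits -= amount;
        payable(msg.sender).transfer(amount);
    }
}

contract VaultInvariantTest is Test {
    Vault internal vault;

    function setUp() public {
        vault = new Vault();
        // Every public function of `vault` becomes a fuzz target.
        targetContract(address(vault));
    }

    // Checked after every randomly generated call sequence: the vault's
    // ETH balance must always cover the deposits it has recorded.
    function invariant_solvency() public view {
        assertGe(address(vault).balance, vault.totalDeposits());
    }
}
```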

Why Foundry Matters: Foundry has made these advanced testing techniques accessible. The fuzzing and invariant features are native to the developer workflow – no special harness or external tool is needed, and tests are written in Solidity alongside unit tests. Thanks to a Rust-based engine, Foundry can execute thousands of tests quickly (parallelizing them) and provide detailed failure traces for any invariant violation. Developers report that Foundry’s fuzzer is easy to use and highly performant, requiring only minor configuration (e.g. setting the number of iterations or adding assumptions to constrain inputs). A simple example from Foundry’s documentation is a fuzz test for a divide(a,b) function, which uses vm.assume(b != 0) to avoid trivial invalid inputs and then asserts mathematical post-conditions like result * b <= a. By running such a test with thousands of random (a,b) pairs, Foundry might quickly discover edge cases (like overflow boundaries) that would be hard to find by manual reasoning.
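
Reconstructed as runnable code, that documentation example might look like the following sketch (the Divider contract is illustrative; vm.assume and parameterized test functions are standard Foundry features):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import "forge-std/Test.sol";

// Illustrative contract under test.
contract Divider {
    function divide(uint256 a, uint256 b) public pure returns (uint256) {
        return a / b;
    }
}

contract DividerTest is Test {
    Divider internal d = new Divider();

    // Any test function with parameters is run as a fuzz test: Foundry
    // generates hundreds of random (a, b) pairs per invocation.
    function testFuzz_Divide(uint256 a, uint256 b) public {
        vm.assume(b != 0); // discard the trivially invalid division-by-zero case
        uint256 result = d.divide(a, b);
        // Post-condition: the truncated quotient times the divisor never
        // exceeds the dividend (no overflow possible, since result * b <= a).
        assertLe(result * b, a);
    }
}
```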

Comparisons: Foundry’s approach builds on prior work in the community. Trail of Bits’ Echidna fuzzer was an earlier property-based testing tool for Ethereum. Echidna similarly generates random transactions to find violations of invariants expressed as Solidity functions returning a boolean. It’s known for “intelligent” input generation (incorporating coverage-guided fuzzing) and has been used to find many bugs. In fact, security researchers note that Echidna’s engine is extremely effective – “the Trail of Bits Echidna is the best fuzzer out there due to its intelligent random number selection” – though Foundry’s integrated workflow makes writing tests simpler for developers. In practice, Foundry’s fuzz testing is often regarded as the new “bare minimum” for secure Solidity development, complementing traditional unit tests. It cannot prove the absence of bugs (since it’s randomized and not exhaustive), but it greatly increases confidence by covering a vast range of inputs and state combinations.
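
For comparison, an Echidna property is just a zero-argument Solidity function prefixed with echidna_ that returns a bool. A minimal sketch (the fixed-supply Token and its invariant are illustrative):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Minimal fixed-supply token whose invariant Echidna will try to break.
contract Token {
    uint256 public constant INITIAL_SUPPLY = 1_000_000e18;
    uint256 public totalSupply = INITIAL_SUPPLY;
    mapping(address => uint256) public balanceOf;

    constructor() {
        balanceOf[msg.sender] = totalSupply;
    }

    function transfer(address to, uint256 amount) external {
        balanceOf[msg.sender] -= amount;
        balanceOf[to] += amount;
    }
}

// Echidna treats zero-argument "echidna_" functions returning bool as
// properties: it fuzzes the other functions with random call sequences
// and reports any sequence that drives a property to false.
contract TestToken is Token {
    function echidna_supply_constant() public view returns (bool) {
        return totalSupply == INITIAL_SUPPLY;
    }
}
```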

Beyond Fuzzing: Formal Proofs and Advanced Tools

While fuzzing and invariants catch many issues, there are cases where stronger formal methods are used. Model checking and theorem proving involve specifying desired properties in a formal logic and using automated provers to check them against the contract code. Certora Prover (recently open-sourced) is a prominent tool in this category. It allows developers to write rules in a domain-specific language (CVL) and then automatically checks the contract for violations of those rules. Certora has been used to verify critical invariants in protocols like MakerDAO and Compound; for instance, it identified the “DAI debt invariant” bug in MakerDAO (a subtle accounting inconsistency) that went unnoticed for four years. Notably, Certora’s engine now supports multiple platforms (EVM, Solana’s VM, and eWASM), and by open-sourcing it in 2025, it made industrial-grade formal verification freely available on Ethereum, Solana, and Stellar. This move recognizes that formal proofs should not be a luxury for only well-funded teams.

Other formal tools include runtime verification approaches (e.g. the KEVM semantics of Ethereum, or the Move Prover for Move-based chains). The Move Prover in particular is built into the Move language (used by Aptos and Sui blockchains). It allows writing formal specifications alongside the code and can automatically prove certain properties with a user experience similar to a linter or type-checker. This tight integration lowers the barrier for developers on those platforms to use formal verification as part of development.

Summary: Today’s smart contract auditing blends these methodologies. Fuzzing and invariant testing (exemplified by Foundry and Echidna) are widely adopted for their ease of use and effectiveness in catching common bugs. Symbolic execution and static analyzers still serve to quickly scan for known vulnerability patterns (with tools like Slither often integrated into CI pipelines). Meanwhile, formal verification tools are maturing and expanding across chains, but they are typically reserved for specific critical properties or used by specialized auditing firms due to their complexity. In practice, many audits combine these approaches: e.g. using fuzzers to find runtime errors, static tools to flag obvious mistakes, and formal spec checks for key invariants like “no token balance exceeds total supply”.

AI-Assisted Auditing with Large Language Models

The advent of large language models (LLMs) like OpenAI’s GPT-3/4 (ChatGPT) and Codex has introduced a new paradigm for smart contract analysis. These models, trained on vast amounts of code and natural language, can reason about program behavior, explain code, and even detect certain vulnerabilities by pattern recognition and “common sense” knowledge. The question is: can AI augment or even automate smart contract auditing?

LLM-Based Vulnerability Analysis Tools

Several research efforts and prototype tools have emerged that apply LLMs to smart contract auditing:

  • OpenAI Codex / ChatGPT (general models): Early experiments simply involved prompting GPT-3 or Codex with contract code and asking for potential bugs. Developers found that ChatGPT could identify some well-known vulnerability patterns and even suggest fixes. However, responses were hit-or-miss and not reliably comprehensive. A recent academic evaluation noted that naive prompting of ChatGPT for vulnerability detection “did not yield significantly better outcomes compared to random prediction” on a benchmark dataset – essentially, if the model is not guided properly, it may hallucinate issues that aren’t there or miss real vulnerabilities. This highlighted the need for carefully engineered prompts or fine-tuning to get useful results.

  • AuditGPT (2024): This is an academic tool that leveraged ChatGPT (GPT-3.5/4) specifically to check ERC standard compliance in Ethereum contracts. The researchers observed that many ERC20/ERC721 token contracts violate subtle rules of the standard (which can lead to security or compatibility issues). AuditGPT works by breaking down the audit into small tasks and designing specialized prompts for each rule type. The result was impressive: in tests on real contracts, AuditGPT “successfully pinpoints 418 ERC rule violations and only reports 18 false positives”, demonstrating high accuracy. In fact, the paper claims AuditGPT outperformed a professional auditing service in finding ERC compliance bugs, at lower cost. This suggests that for well-defined, narrow domains (like enforcing a list of standard rules), an LLM with good prompts can be remarkably effective and precise.

  • LLM-SmartAudit (2024): Another research project, LLM-SmartAudit, takes a broader approach by using a multi-agent conversation system to audit Solidity code. It sets up multiple specialized GPT-3.5/GPT-4 agents (e.g. one “Auditor” focusing on correctness, one “Attacker” thinking of how to exploit) that talk to each other to analyze a contract. In their evaluation, LLM-SmartAudit was able to detect a wide range of vulnerabilities. Notably, in a comparative test against conventional tools, the GPT-3.5 based system achieved the highest overall recall (74%), outperforming all the traditional static and symbolic analyzers they tested (the next best was Mythril at 54% recall). It also was able to find all of the 10 types of vulnerabilities in their test suite (whereas each traditional tool struggled with some categories). Moreover, by switching to GPT-4 and focusing the analysis (what they call Targeted Analysis mode), the system could detect complex logical flaws that tools like Slither and Mythril completely missed. For instance, on a set of real-world DeFi contracts, the LLM-based approach found hundreds of logic bugs whereas the static tools “demonstrated no effectiveness in detecting” those issues. These results showcase the potential of LLMs to catch subtle bugs that are beyond the pattern-matching scope of traditional analyzers.

  • Prometheus (PromFuzz) (2023): A hybrid approach is to use LLMs to guide other techniques. One example is PromFuzz, which uses a GPT-based “auditor agent” to identify suspect areas in the code, then automatically generates invariant checkers and feeds them into a fuzzing engine. Essentially, the LLM does a first-pass analysis (both from a benign and attacker perspective) to focus the fuzz testing on likely vulnerabilities. In evaluations, this approach achieved very high bug-finding rates – e.g. over 86% recall with zero false positives in certain benchmark sets – significantly outperforming standalone fuzzers or previous tools. This is a promising direction: using AI to orchestrate and enhance classical techniques like fuzzing, combining the strengths of both.

  • Other AI Tools: The community has seen various other AI-assisted auditing concepts. For example, Trail of Bits’ “Toucan” project integrated OpenAI Codex into their audit workflow (more on its findings below). Some security startups are also advertising AI auditors (e.g. “ChainGPT Audit” services), typically wrapping GPT-4 with custom prompts to review contracts. Open-source enthusiasts have created ChatGPT-based audit bots on forums that will take a Solidity snippet and output potential issues. While many of these are experimental, the general trend is clear: AI is being explored at every level of the security review process, from automated code explanation and documentation generation to vulnerability detection and even suggesting fixes.

Capabilities and Limitations of LLM Auditors

LLMs have demonstrated notable capabilities in smart contract auditing:

  • Broad Knowledge: An LLM like GPT-4 has been trained on a vast corpus of code and vulnerability reports. It “knows” about common security pitfalls (reentrancy, integer overflow, etc.) and even some historical exploits. This allows it to recognize patterns that might indicate a bug, and to recall best practices from documentation. For example, it might remember that ERC-20 transferFrom should check allowances (and flag the absence of such a check as a violation). Unlike a static tool that only flags known patterns, an LLM can apply reasoning to novel code and infer problems that weren’t explicitly coded into it by a tool developer.

  • Natural Explanations: AI auditors can provide human-readable explanations of potential issues. This is extremely valuable for developer experience. Traditional tools often output cryptic warnings that require expertise to interpret, whereas a GPT-based tool can generate a paragraph explaining the bug in plain English and even suggest a remediation. AuditGPT, for instance, not only flagged ERC rule violations but described why the code violated the rule and what the implications were. This helps in onboarding less-experienced developers to security concepts.

  • Flexibility: With prompt engineering, LLMs can be directed to focus on different aspects or follow custom security policies. They are not limited by a fixed set of rules – if you can describe a property or pattern in words, the LLM can attempt to check it. This makes them attractive for auditing new protocols that might have unique invariants or logic (where writing a custom static analysis from scratch would be infeasible).

However, significant challenges and reliability issues have been observed:

  • Reasoning Limitations: Current LLMs sometimes struggle with the complex logical reasoning required for security analysis. Trail of Bits reported that “the models are not able to reason well about certain higher-level concepts, such as ownership of contracts, re-entrancy, and inter-function relationships”. For example, GPT-3.5/4 might understand what reentrancy is in theory (and even explain it), but it may fail to recognize a reentrancy vulnerability if it requires understanding a cross-function sequence of calls and state changes. The model might also miss vulnerabilities that involve multi-contract interactions or time-dependent logic, because those go beyond the scope of a single code snippet analysis.

  • False Positives and Hallucinations: A major concern is that LLMs can produce confident-sounding yet incorrect conclusions. In auditing, a “hallucination” might be flagging a vulnerability that isn’t actually there, or misinterpreting the code. Trail of Bits’ experiment with Codex (GPT) found that as they scaled to larger real-world contracts, “the false positive and hallucination rates [became] too high,” to the point that it would slow down auditors with too many spurious reports. This unpredictability is problematic – a tool that cries wolf too often will not be trusted by developers. AuditGPT’s success in keeping false positives low (only 18 out of hundreds of findings) is encouraging, but that was in a constrained domain. In general-purpose use, careful prompt design and maybe human review loops are needed to filter AI findings.

  • Context Limitations: LLMs have a context window limitation, meaning they can only “see” a certain amount of code at once. Complex contracts often span multiple files and thousands of lines. If an AI can’t ingest the whole codebase, it might miss important connections. Techniques like code slicing (feeding the contract in chunks) are used, but they risk losing the broader picture. The LLM-SmartAudit team noted that with GPT-3.5’s 4k token limit, they could not analyze some large real-world contracts until they switched to GPT-4 with a larger context. Even then, dividing analysis into parts can cause it to overlook bugs that manifest only when considering the system as a whole. This makes AI analysis of entire protocols (with multiple interacting contracts) a real challenge as of now.

  • Integration and Tooling: There is a lack of robust developer tooling around AI auditors. Running an LLM analysis is not as straightforward as running a linter. It involves API calls to a model, handling prompt engineering, rate limits, and parsing the AI’s responses. “The software ecosystem around integrating LLMs with traditional software is too crude and everything is cumbersome”, as one auditing team put it. There are virtually no off-the-shelf frameworks for continuously deploying an AI auditor in CI pipelines while managing its uncertainties. This is slowly improving (some projects are exploring CI bots that use GPT-4 for code review), but it’s early. Moreover, debugging why an AI gave a certain result is difficult – unlike deterministic tools, you can’t easily trace the logic that led the model to flag or miss something.

  • Cost and Performance: Using large models like GPT-4 is expensive and can be slow. If you want to incorporate an AI audit into a CI/CD pipeline, it might add several minutes per contract and incur significant API costs for large code. Fine-tuned models or open-source LLMs could mitigate cost, but they tend to be less capable than GPT-4. There’s ongoing research into smaller, specialized models for code security, but at present the top results have come from the big proprietary models.

In summary, LLM-assisted auditing is promising but not a silver bullet. We are seeing hybrid approaches where AI helps generate tests or analyze specific aspects, rather than doing the entire audit end-to-end. For instance, an AI might suggest potential invariants or risky areas, which a human or another tool then investigates. As one security engineer remarked, the technology is not yet ready to replace human auditors for critical tasks, given the reasoning gaps and integration hurdles. However, it can already be a useful assistant – “something imperfect may be much better than nothing” in cases where traditional tools fall short.

Accuracy and Reliability of Different Toolchains

It is instructive to compare the accuracy, coverage, and reliability of the various auditing approaches discussed. Below is a summary of findings from research and industry evaluations:

  • Static Analysis Tools: Static analyzers like Slither are valued for quick feedback and ease of use. They typically have high precision but moderate recall – meaning that most of the issues they report are true problems (few false positives by design), but they only detect certain classes of vulnerabilities. A 2024 benchmarking study (LLM-SmartAudit’s RQ1) found Slither caught about 46% of the known vulnerabilities in a test suite. Mythril (symbolic execution) did slightly better at 54% recall. Each tool had strengths in particular bug types (e.g. Slither is very good at spotting arithmetic issues or usage of tx.origin, while symbolic tools excel at finding simple reentrancy scenarios), but none had comprehensive coverage. False positives for mature tools like Slither are relatively low – its developers advertise “minimal false alarms and speedy execution (under 1s per contract)”, making it suitable for CI use. Nonetheless, static tools can misfire if code uses complex patterns; they might flag edge cases that are actually handled or miss deep logic bugs that don’t match any known anti-pattern.

  • Fuzzing and Property Testing: Fuzzers like Foundry’s fuzz/invariant tests or Trail of Bits’ Echidna have proven very effective at finding runtime errors and invariant violations. These tools tend to have near zero false positives in the sense that if a bug is reported (an assertion failed), it’s a real counterexample execution. The trade-off is in false negatives – if a bug doesn’t manifest within the tested input space or number of runs, it can slip by. Coverage depends on how well the fuzzer explores the state space. With enough time and good heuristics, fuzzers have found many high-severity bugs that static analysis missed. For example, Echidna was able to quickly reproduce the MakerDAO and Compound bugs that took formal verification efforts to find. However, fuzzing is not guaranteed to find every logic mistake. As contracts get more complex, fuzzers might require writing additional invariants or using smarter search strategies to hit deeper states. In terms of metrics, fuzzing doesn’t have a single “recall” number, but anecdotal evidence shows that important invariants can usually be broken by a fuzzer if they are breakable. The reliability is high for what it finds (no manual triage needed for false reports), but one must remember a passed fuzz test is not a proof of correctness – just an increase in confidence.

  • Formal Verification Tools: When applicable, formal verification provides the highest assurance – a successful proof means 100% of states satisfy the property. In terms of accuracy, it’s effectively perfect (sound and complete) for the properties it can prove. The biggest issue here is not the accuracy of results but the difficulty of use and narrow scope. Formal tools can also have false negatives in practice: they might simply be unable to prove a true property due to complexity (yielding no result or a timeout, which isn’t a false positive, but it means we fail to verify something that is actually safe). They can also have specification errors, where the tool “proves” something that wasn’t the property you intended (user error). In real audits, formal methods have caught some critical bugs (Certora’s successes include catching a subtle SushiSwap bug and a PRBMath library issue before deployment). But their track record is limited by how rarely they are comprehensively applied – as Trail of Bits noted, it was “difficult to find public issues discovered solely through formal verification, in contrast with the many bugs found by fuzzing”. So, while formal verification is extremely reliable when it succeeds, its impact on overall toolchain coverage is constrained by the effort and expertise required.

  • LLM-Based Analysis: The accuracy of AI auditors is currently a moving target, as new techniques (and newer models) are rapidly pushing the envelope. We can glean a few data points:

    • The AuditGPT system, focused on ERC rules, achieved very high precision (≈96%: 418 true findings against 18 false positives) and found hundreds of issues that static tools or humans overlooked. But this was in a narrow domain with structured prompts. We should not generalize that ChatGPT will be 96% accurate on arbitrary vulnerability hunting – outside of a controlled setup, its performance is lower.
    • In broader vulnerability detection, LLM-SmartAudit (GPT-3.5) showed ~74% recall on a benchmark with a moderate false positive rate, which is better than any single traditional tool. When upgraded to GPT-4 with specialized prompting (TA mode), it significantly improved – for example, on a set of 1,400 real-world vulnerabilities, the GPT-4 agent found ~48% of the specific issues and ~47% of the complex logic issues, whereas Slither, Mythril, and Conkas found none of those particular complex issues. This demonstrates that LLMs can dramatically expand coverage to types of bugs that static analysis completely misses. On the flip side, the LLM missed more than half of the issues (so it also has substantial false negatives), and it’s not clear how many false positives were among the ones it reported – the study focused on recall more than precision.
    • Trail of Bits’ Codex/GPT-4 “Toucan” experiment is illustrative of reliability problems. Initially, on small examples, Codex could identify known vulnerabilities (ownership issues, reentrancy) with careful prompting. But as soon as they tried scaling up, they encountered inconsistent and incorrect results. They reported “the number of failures was high even in average-sized code” and difficult to predict. Ultimately, they concluded that GPT-4 (as of early 2023) was only an incremental improvement and still “missing key features” like reasoning about cross-function flows. The outcome was that the AI did not materially reduce false positives from their static tools, nor did it reliably speed up their audit workflow. In other words, the current reliability of a general LLM as an auditor was deemed insufficient by professional auditors in that trial.

To sum up, each toolchain has different strengths:

  • Static tools: Reliable for quick detection of known issues; low noise, but limited bug types (medium recall ~40–50% on benchmarks).
  • Fuzz/invariant testing: Very high precision (almost no false alerts) and strong at finding functional and state-dependent bugs; coverage can be broad but not guaranteed (no simple metric, depends on time and invariant quality). Often finds the same bugs formal proofs would if given enough guidance.
  • Formal verification: Gold standard for absolute correctness on specific properties; essentially 100% recall/precision for those properties if successfully applied. But practically limited to narrow problems and requires significant effort (not a one-button solution for full audits yet).
  • AI (LLM) analysis: Potentially high coverage – can find bugs across categories including those missed by other tools – but variable precision. With specialized setups, it can be both precise and broad (as AuditGPT showed for ERC checks). Without careful constraints, it may cast a wide net and require human vetting of results. The “average” accuracy of ChatGPT on vulnerability detection is not spectacular (close to guessing, in one study), but the best-case engineered systems using LLMs are pushing performance beyond traditional tools. There’s active research to make AI outputs more trustworthy (e.g. by having multiple agents cross-verify, or combining LLM with static reasoning to double-check AI conclusions).

It’s worth noting that combining approaches yields the best results. For example, one might run Slither (to catch low-hanging fruit with no noise), then use Foundry/Echidna to fuzz deeper behaviors, and perhaps use an LLM-based tool to scan for any patterns or invariants not considered. Each will cover different blind spots of the others.
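To make that layered workflow concrete, here is a minimal Foundry sketch – the ToyVault and all names are hypothetical, for illustration only – pairing a stateless fuzz test with the kind of stateful invariant that Echidna or Foundry’s invariant runner would exercise after Slither’s quick pass:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";

// Toy vault used only to illustrate the testing pattern.
contract ToyVault {
    mapping(address => uint256) public shares;
    uint256 public totalShares;
    uint256 public totalAssets;

    function deposit() external payable {
        shares[msg.sender] += msg.value;
        totalShares += msg.value;
        totalAssets += msg.value;
    }

    function withdraw(uint256 amount) external {
        shares[msg.sender] -= amount; // underflow reverts in ^0.8
        totalShares -= amount;
        totalAssets -= amount;
        payable(msg.sender).transfer(amount);
    }
}

contract ToyVaultTest is Test {
    ToyVault vault;

    function setUp() public {
        vault = new ToyVault();
        targetContract(address(vault)); // let the invariant fuzzer call the vault
    }

    // Stateless fuzz: forge generates random values for `amount`.
    function testFuzz_depositWithdrawRoundTrip(uint96 amount) public {
        vm.deal(address(this), amount);
        vault.deposit{value: amount}();
        vault.withdraw(amount);
        assertEq(vault.shares(address(this)), 0);
    }

    // Stateful invariant: checked after every randomized call sequence.
    function invariant_assetsBackShares() public view {
        assertEq(vault.totalAssets(), vault.totalShares());
    }

    receive() external payable {}
}

When either check fails, forge reports the offending inputs or call sequence, which can be kept as a regression test alongside the suite.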

Real-World Adoption Challenges and Limitations

In practice, adopting formal verification or AI tools in a development workflow comes with pragmatic challenges. Some key issues include:

  • Developer Experience and Expertise: Traditional formal verification has a steep learning curve. Writing formal specs (in CVL, Coq, Move’s specification language, etc.) is more like writing mathematical proofs than writing code. Many developers lack this background, and formal methods experts are in short supply. By contrast, fuzzing with Foundry or writing invariants in Solidity is much more accessible – it feels like writing additional tests. This is a big reason fuzz testing has seen faster uptake than formal proofs in the Ethereum community. The Trail of Bits team explicitly noted that fuzzing “produces similar results but requires significantly less skill and time” than formal methods in many cases. Thus, even though formal verification can catch different bugs, many teams opt for the easier approach that gets good enough results with lower effort.

  • Integration into Development Workflow: For a tool to be widely adopted, it needs to fit into CI/CD pipelines and everyday coding. Tools like Slither shine here – it “easily integrates into CI/CD setups, streamlining automation and aiding developers.” A developer can add Slither or Mythril to a GitHub Action and have it fail the build if issues are found. Foundry’s fuzz tests can be run as part of forge test every time. Invariant tests can even be run continuously in the cloud with tools like CloudExec, and any failure can automatically be converted into a unit test using fuzz-utils. These integrations mean developers get quick feedback and can iterate. In contrast, something like the Certora Prover historically was run as a separate process (or by an external auditing team) and might take hours to produce a result – not something you’d run on every commit. AI-based tools face integration hurdles too: calling an API and handling its output deterministically in CI is non-trivial. There’s also the matter of security and privacy – many organizations are uneasy about sending proprietary contract code to a third-party AI service for analysis. Self-hosted LLM solutions are not yet as powerful as the big cloud APIs, so this is a sticking point for CI adoption of AI auditors.

  • False Positives and Noise: A tool that reports many false positives or low-severity findings can reduce developer trust. Static analyzers have struggled with this in the past – e.g., early versions of some tools would flood users with warnings, many of which were irrelevant. The balance between signal and noise is crucial. Slither is praised for minimal false alarms, whereas a tool like Securify (in its research form) often produced many warnings that required manual filtering. LLMs, as discussed, can generate noise if not properly targeted. This is why AI suggestions are currently treated as advisory, not authoritative. Teams are more likely to adopt a noisy tool if it’s run by a separate security team or in an audit context, but for day-to-day use, developers prefer tools that give clear, actionable results with low noise. The ideal is to “fail the build” only on definite bugs, not on hypotheticals. Achieving that reliability is an ongoing challenge, especially for AI tools.

  • Scalability and Performance: Formal verification can be computationally intensive. As noted, solvers might time out on complex contracts. This makes it hard to scale to large systems. If verifying one property takes hours, you won’t be checking dozens of properties on each code change. Fuzz testing also faces scalability limits – exploring a huge state space or a contract with many methods combinatorially can be slow (though in practice fuzzers can run in the background or overnight to deepen their search). AI models have fixed context sizes and increasing a model’s capacity is expensive. While GPT-4’s 128k-token context is a boon, feeding it hundreds of kilobytes of contract code is costly and still not enough for very large projects (imagine a complex protocol with many contracts – you might exceed that). There’s also multi-chain scaling: if your project involves interactions between contracts on different chains (Ethereum ↔ another chain), verifying or analyzing that cross-chain logic is even more complex and likely beyond current tooling.

  • Human Oversight and Verification: At the end of the day, most teams still rely on expert human auditors for final sign-off. Automated tools are aids, not replacements. One limitation of all these tools is that they operate within the bounds of known properties or patterns. They might miss a totally novel type of bug or a subtle economic flaw in a DeFi protocol’s design. Human auditors use intuition and experience to consider how an attacker might approach the system, sometimes in ways no tool is explicitly programmed to do. There have been cases where contracts passed all automated checks but a human later found a vulnerability in the business logic or a creative attack vector. Thus, a challenge is avoiding a false sense of security – developers must interpret the output of formal tools and AI correctly and not assume “no issues found” means the code is 100% safe.

  • Maintaining Specifications and Tests: For formal verification in particular, one practical issue is spec drift. The spec (invariants, rules, etc.) might become outdated as the code evolves. Ensuring the code and its formal spec remain in sync is a non-trivial management task. If developers are not vigilant, the proofs might “pass” simply because they’re proving something no longer relevant to the code’s actual requirements. Similarly, invariant tests must be updated as the system’s expected behavior changes. This requires a culture of invariant-driven development that not all teams have (though there is a push to encourage it).

In summary, the main limitations are people and process, rather than the raw capability of the technology. Formal and AI-assisted methods can greatly improve security, but they must be deployed in a way that fits developers’ workflows and skill sets. Encouragingly, trends like invariant-driven development (writing down critical invariants from day one) are gaining traction, and tooling is slowly improving to make advanced analyses more plug-and-play. The open-sourcing of major tools (e.g. Certora Prover) and the integration of fuzzing into popular frameworks (Foundry) are lowering barriers. Over time, we expect the gap between what “an average developer” can do and what “a PhD researcher” can do will narrow, in terms of using these powerful verification techniques.

Open-Source vs Commercial Auditing Tools

The landscape of smart contract security tools includes both open-source projects and commercial services. Each has its role, and often they complement each other:

  • Open-Source Tools: The majority of widely-used auditing tools in the Ethereum ecosystem are open-source. This includes Slither (static analyzer), Mythril (symbolic execution), Echidna (fuzzer), Manticore (symbolic/concolic execution), Securify (analyzer from ETH Zurich), Solhint/Ethlint (linters), and of course Foundry itself. Open-source tools are favored for a few reasons: (1) Transparency – developers can see how the tool works and trust that nothing malicious or hidden is occurring (important in an open ecosystem). (2) Community Contribution – rules and features get added by a broad community (Slither, for example, has many community-contributed detectors). (3) Cost – they are free to use, which is important for the many small projects/startups in Web3. (4) Integration – open tools can usually be run locally or in CI without legal hurdles, and often they can be customized or scripted for project-specific needs.

    Open-source tools have rapidly evolved. For instance, Slither’s support for new Solidity versions and patterns is continuously updated by Trail of Bits. Mythril, developed by ConsenSys, can analyze not just Ethereum but any EVM-compatible chain by working on bytecode – showing how open tools can be repurposed across chains easily. The downside of open tools is that they often come with “use at your own risk” – no official support or guarantees. They might not scale or have the polish of a commercial product’s UI. But in blockchain, even many companies use open-source as their core tools internally (sometimes with slight custom modifications).

  • Commercial Services and Tools: A few companies have offered security analysis as a product. Examples include MythX (a cloud-based scanning API by ConsenSys Diligence), Certora (which offered its prover as a service before open-sourcing it), CertiK and SlowMist (firms that have internal scanners and AI that they use in audits or offer via dashboards), and some newer entrants like Securify from ChainSecurity (the company was acquired and its tool used internally) or SolidityScan (a cloud scanner that outputs an audit report). Commercial tools often aim to provide a more user-friendly experience or integrated service. For example, MythX integrated with IDE extensions and CI plugins so that developers could send their contracts to MythX and get back a detailed report, including a vulnerability score and details to fix issues. These services typically run a combination of analysis techniques under the hood (pattern matching, symbolic execution, etc.) tuned to minimize false positives.

    The value proposition of commercial tools is usually convenience and support. They may maintain a continuously updated knowledge base of vulnerabilities and provide customer support in interpreting results. They might also handle heavy computation in the cloud (so you don’t need to run a solver locally). However, cost is a factor – many projects opt not to pay for these services, given the availability of free alternatives. Additionally, in the spirit of decentralization, some developers are hesitant to rely on closed-source services for security (the “verify, don’t trust” ethos). The open-sourcing of the Certora Prover in 2025 is a notable event: it turned what was a high-end commercial tool into a community resource. This move is expected to accelerate adoption, as now anyone can attempt to formally verify their contracts without a paid license, and the code openness will allow researchers to improve the tool or adapt it to other chains.

  • Human Audit Services: It’s worth mentioning that beyond software tools, many audits are done by professional firms (Trail of Bits, OpenZeppelin, CertiK, PeckShield, etc.). These firms use a mix of the above tools (mostly open-source) and proprietary scripts. The outputs of these efforts are often kept private or only summarized in audit reports. There isn’t exactly an “open vs commercial” dichotomy here, since even commercial audit firms rely heavily on open-source tools. The differentiation is more in expertise and manual effort. That said, some firms are developing proprietary AI-assisted auditing platforms to give themselves an edge (for example, there were reports of “ChainGPT” being used for internal audits, or CertiK developing an AI called Skynet for on-chain monitoring). Those are not public tools per se, so their accuracy and methods are not widely documented.

In practice, a common pattern is open-source first, commercial optional. Teams will use open tools during development and testing (because they can integrate them easily and run as often as needed). Then, they might use a commercial service or two as an additional check before deployment – for instance, running a MythX scan to get a “second opinion” or hiring a firm that uses advanced tools like Certora to formally verify a critical component. With the lines blurring (Certora open source, MythX results often overlapping with open tools), the distinction is less about capability and more about support and convenience.

One notable cross-over is Certora’s multi-chain support – by supporting Solana and Stellar formally, they address platforms where fewer open alternatives exist (e.g. Ethereum has many open tools, Solana had almost none until recently). As security tools expand to new ecosystems, we may see more commercial offerings fill gaps, at least until open-source catches up.

Finally, it’s worth noting that open vs commercial is not adversarial here; they often learn from each other. For example, techniques pioneered in academic/commercial tools (like abstract interpretation used in Securify) eventually find their way into open tools, and conversely, data from open tool usage can guide commercial tool improvements. The end result sought by both sides is better security for the entire ecosystem.

Cross-Chain Applicability

While Ethereum has been the focal point for most of these tools (given its dominance in smart contract activity), the concepts of formal verification and AI-assisted auditing are applicable to other blockchain platforms as well. Here’s how they translate:

  • EVM-Compatible Chains: Blockchains like BSC, Polygon, Avalanche C-Chain, etc., which use the Ethereum Virtual Machine, can directly use all the same tools. A fuzz test or static analysis doesn’t care if your contract will deploy on Ethereum mainnet or on Polygon – the bytecode and source language (Solidity/Vyper) are the same. Indeed, Mythril and Slither can analyze contracts from any EVM chain by pulling the bytecode from an address (Mythril just needs an RPC endpoint). Many DeFi projects on these chains run Slither and Echidna just as an Ethereum project would. The audits of protocols on BSC or Avalanche typically use the identical toolkit as Ethereum audits. So, cross-chain in the EVM context mostly means reusing Ethereum’s toolchain.

  • Solana: Solana’s smart contracts are written in Rust (usually via the Anchor framework) and run on the BPF virtual machine. This is a very different environment from Ethereum, so Ethereum-specific tools don’t work out of the box. However, the same principles apply. For instance, one can do fuzz testing on Solana programs using Rust’s native fuzzing libraries or tools like cargo-fuzz. Formal verification on Solana had been nearly non-existent until recently. The collaboration between Certora and Solana engineers yielded an in-house verification tool for Solana programs that can prove Rust/Anchor contracts against specifications. As noted, Certora extended their prover to Solana’s VM, meaning developers can write rules about Solana program behavior and check them just like they would for Solidity. This is significant because Solana’s move-fast approach meant many contracts were launched without the kind of rigorous testing seen in Ethereum; formal tools could improve that. AI auditing for Solana is also plausible – an LLM that understands Rust could be prompted to check a Solana program for vulnerabilities (like missing ownership checks or arithmetic errors). It might require fine-tuning on Solana-specific patterns, but given Rust’s popularity, GPT-4 is fairly proficient at reading Rust code. We may soon see “ChatGPT for Anchor” style tools emerging as well.

  • Polkadot and Substrate-based Chains: Polkadot’s smart contracts can be written in Rust (compiled to WebAssembly) using the ink! framework, or you can run an EVM pallet (like Moonbeam does) which again allows Solidity. In the ink!/Wasm case, the verification tools are still nascent. One could attempt to formally verify a Wasm contract’s properties using general Wasm verification frameworks (for example, Microsoft’s Project Verona or others have looked at Wasm safety). Certora’s open-source prover also mentions support for Stellar’s WASM smart contracts, which are similar in concept to any Wasm-based chain. So it’s likely applicable to Polkadot’s Wasm contracts too. Fuzz testing ink! contracts can be done by writing Rust tests (property tests in Rust can serve a similar role). AI auditing of ink! contracts would entail analyzing Rust code as well, which again an LLM can do with the right context (though it might not know about the specific ink! macros or traits without some hints).

    Additionally, Polkadot is exploring the Move language for parallel smart contract development (as hinted in some community forums). If Move becomes used on Polkadot parachains, the Move Prover comes with it, bringing formal verification capabilities by design. Move’s emphasis on safety (resource oriented programming) and built-in prover shows a cross-chain propagation of formal methods from the start.

  • Other Blockchains: Platforms like Tezos (Michelson smart contracts) and Algorand (TEAL programs) each have had formal verification efforts. Tezos, for example, has a tool called Mi-Cho-Coq that provides a formal semantics of Michelson and allows proving properties. These are more on the academic side but show that any blockchain with a well-defined contract semantics can be subjected to formal verification. AI auditing could, in principle, be applied to any programming language – it’s a matter of training or prompting the LLM appropriately. For less common languages, an LLM might need some fine-tuning to be effective, as it may not have been pretrained on enough examples.

  • Cross-Chain Interactions: A newer challenge is verifying interactions across chains (like bridges or inter-chain messaging). Formal verification here might involve modeling multiple chains’ state and the communication protocol. This is very complex and currently beyond most tools, though specific bridge protocols have been manually analyzed for security. AI might help in reviewing cross-chain code (for instance, reviewing a Solidity contract that interacts with an IBC protocol on Cosmos), but there’s no out-of-the-box solution yet.

In essence, Ethereum’s tooling has paved the way for other chains. Many of the open-source tools are being adapted: e.g., there are efforts to create a Slither-like static analyzer for Solana (Rust), and the concepts of invariant testing are language-agnostic (property-based testing exists in many languages). The open-sourcing of powerful engines (like Certora’s for multiple VMs) is particularly promising for cross-chain security – the same platform could be used to verify a Solidity contract, a Solana program, and a Move contract, provided each has a proper specification written. This encourages a more uniform security posture across the industry.

It’s also worth noting that AI-assisted auditing will benefit all chains, since the model can be taught about vulnerabilities in any context. As long as the AI is provided with the right information (language specs, known bug patterns in that ecosystem), it can apply similar reasoning. For example, ChatGPT could be asked to audit a Solidity contract or a Move module with the appropriate prompt, and it will do both – it might even catch something like “this Move module might violate resource conservation” if it has that concept. The limitation is whether the AI was exposed to enough training data for that chain. Ethereum, being most popular, has likely biased the models to be best at Solidity. As other chains grow, future LLMs or fine-tuned derivatives could cover those as well.

Conclusion

Smart contract formal verification and AI-assisted auditing is a rapidly evolving field. We now have a rich toolkit: from deterministic static analyzers and fuzzing frameworks that improve code reliability, to cutting-edge AI that can reason about code in human-like ways. Formal verification, once a niche academic pursuit, is becoming more practical through better tools and integration. AI, despite its current limitations, has shown glimpses of game-changing capabilities in automating security analysis.

There is no one-size-fits-all solution yet – real-world auditing combines multiple techniques to achieve defense in depth. Foundry’s fuzz and invariant testing are already raising the bar for what is expected before deployment (catching many errors that would slip past basic tests). AI-assisted auditing, when used carefully, can act as a force multiplier for auditors, highlighting issues and verifying compliance at a scale and speed that manual review alone cannot match. However, human expertise remains crucial to drive these tools, interpret their findings, and define the right properties to check.

Moving forward, we can expect greater convergence of these approaches. For example, AI might help write formal specifications or suggest invariants (“AI, what security properties should hold for this DeFi contract?”). Fuzzing tools might incorporate AI to guide input generation more intelligently (as PromFuzz does). Formal verification engines might use machine learning to decide which lemmas or heuristics to apply. All of this will contribute to more secure smart contracts across not just Ethereum, but all blockchain platforms. The ultimate vision is a future where critical smart contracts can be deployed with high confidence in their correctness – a goal that will likely be achieved by the synergistic use of formal methods and AI assistance, rather than either alone.

Sources:

  • Foundry fuzzing and invariant testing overview
  • Trail of Bits on fuzzing vs formal verification
  • Trail of Bits on formal verification limitations
  • Patrick Collins on fuzz/invariant vs formal methods
  • Trail of Bits on invariants in audits
  • Medium (BuildBear) on Slither and Mythril usage
  • AuditGPT results (ERC compliance)
  • Trail of Bits on LLM (Codex/GPT-4) limitations
  • LLM-SmartAudit performance vs traditional tools
  • “Detection Made Easy” study on ChatGPT accuracy
  • PromFuzz (LLM+fuzz) performance
  • Certora open-source announcement (multi-chain)
  • Move Prover description (Aptos)
  • Static analysis background (Smart contract security literature)

EIP-7702 After Pectra: A Practical Playbook for Ethereum App Developers

· 9 min read
Dora Noda
Software Engineer

On May 7, 2025, Ethereum’s Pectra upgrade (Prague + Electra) hit mainnet. Among its most developer-visible changes is EIP-7702, which lets an externally owned account (EOA) "mount" smart-contract logic—without migrating funds or changing addresses. If you build wallets, dapps, or relayers, this unlocks a simpler path to smart-account UX.

Below is a concise, implementation-first guide: what actually shipped, how 7702 works, when to choose it over pure ERC-4337, and a cut-and-paste scaffold you can adapt today.


What Actually Shipped

  • EIP-7702 is in Pectra’s final scope. The meta-EIP for the Pectra hard fork officially lists 7702 among the included changes.
  • Activation details: Pectra activated on mainnet at epoch 364032 on May 7, 2025, following successful activations on all major testnets.
  • Toolchain note: Solidity v0.8.30 updated its default EVM target to prague for Pectra compatibility. You'll need to upgrade your compilers and CI pipelines, especially if you pin specific versions.

EIP-7702—How It Works (Nuts & Bolts)

EIP-7702 introduces a new transaction type and a mechanism for an EOA to delegate its execution logic to a smart contract.

  • New Transaction Type (0x04): A Type-4 transaction includes a new field called an authorization_list. This list contains one or more authorization tuples—(chain_id, address, nonce, y_parity, r, s)—each signed by the EOA's private key. When this transaction is processed, the protocol writes a delegation indicator to the EOA’s code field: 0xef0100 || address. From that point forward, any calls to the EOA are proxied to the specified address (the implementation), but they execute within the EOA’s storage and balance context. This delegation remains active until it's explicitly changed.
  • Chain Scope: An authorization can be chain-specific by providing a chain_id, or it can apply to all chains if chain_id is set to 0. This allows you to deploy the same implementation contract across multiple networks without requiring users to sign a new authorization for each one.
  • Revocation: To revert an EOA back to its original, non-programmable behavior, you simply send another 7702 transaction where the implementation address is set to the zero address. This clears the delegation indicator.
  • Self-Sponsored vs. Relayed: An EOA can submit the Type-4 transaction itself, or a third-party relayer can submit it on the EOA's behalf. The latter is common for creating a gasless user experience. Nonce handling differs slightly depending on the method, so it's important to use libraries that correctly manage this distinction.

Security Model Shift: Because the original EOA private key still exists, it can always override any smart contract rules (like social recovery or spending limits) by submitting a new 7702 transaction to change the delegation. This is a fundamental shift. Contracts that rely on tx.origin to verify that a call is from an EOA must be re-audited, as 7702 can break these assumptions. Audit your flows accordingly.
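
A minimal sketch of the pattern at risk (illustrative only, not taken from the EIP):

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Illustrative only: a pre-Pectra "EOA only" guard that EIP-7702 weakens.
contract GuardedFaucet {
    mapping(address => bool) public claimed;

    function claim() external {
        // Pre-7702, this check implied the caller had no code, blocking
        // contract-driven automation. Post-7702, a delegated EOA still passes
        // it while its implementation logic composes this call with other
        // actions atomically in the same transaction.
        require(msg.sender == tx.origin, "contracts not allowed");
        require(!claimed[msg.sender], "already claimed");
        claimed[msg.sender] = true;
        // ... send funds ...
    }
}

Replacements such as EIP-712 signatures or explicit allowlists (discussed in the threat-model section below) avoid the broken assumption.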


7702 or ERC-4337? (And When to Combine)

Both EIP-7702 and ERC-4337 enable account abstraction, but they serve different needs.

  • Choose EIP-7702 when…
    • You want to provide instant smart-account UX for existing EOAs without forcing users to migrate funds or change addresses.
    • You need consistent addresses across chains that can be progressively upgraded with new features.
    • You want to stage your transition to account abstraction, starting with simple features and adding complexity over time.
  • Choose pure ERC-4337 when…
    • Your product requires full programmability and complex policy engines (e.g., multi-sig, advanced recovery) from day one.
    • You are building for new users who don't have existing EOAs, making new smart-account addresses and the associated setup acceptable.
  • Combine them: The most powerful pattern is to use both. An EOA can use a 7702 transaction to designate an ERC-4337 wallet implementation as its logic. This makes the EOA behave like a 4337 account, allowing it to be bundled, sponsored by paymasters, and processed by the existing 4337 infrastructure—all without the user needing a new address. This is a forward-compatible path explicitly encouraged by the EIP's authors.

Minimal 7702 Scaffold You Can Adapt

Here’s a practical example of an implementation contract and the client-side code to activate it.

1. A Tiny, Auditable Implementation Contract

This contract code will execute in the EOA’s context once designated. Keep it small, auditable, and consider adding an upgrade mechanism.

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// @notice Executes calls from the EOA context when designated via EIP-7702.
contract DelegatedAccount {
    // Unique storage slot to avoid collisions with other contracts.
    bytes32 private constant INIT_SLOT =
        0x3fb93b3d3dcd1d1f4b4a1a8db6f4c5d55a1b7f9ac01dfe8e53b1b0f35f0c1a01;

    event Initialized(address indexed account);
    event Executed(address indexed to, uint256 value, bytes data, bytes result);

    modifier onlyEOA() {
        // Optional: add checks to restrict who can call certain functions.
        _;
    }

    function initialize() external payable onlyEOA {
        // Set a simple one-time init flag in the EOA's storage.
        bytes32 slot = INIT_SLOT;
        assembly {
            if iszero(iszero(sload(slot))) { revert(0, 0) } // Revert if already initialized
            sstore(slot, 1)
        }
        emit Initialized(address(this));
    }

    function execute(address to, uint256 value, bytes calldata data)
        external
        payable
        onlyEOA
        returns (bytes memory result)
    {
        (bool ok, bytes memory ret) = to.call{value: value}(data);
        require(ok, "CALL_FAILED");
        emit Executed(to, value, data, ret);
        return ret;
    }

    function executeBatch(address[] calldata to, uint256[] calldata value, bytes[] calldata data)
        external
        payable
        onlyEOA
    {
        uint256 n = to.length;
        require(n == value.length && n == data.length, "LENGTH_MISMATCH");
        for (uint256 i = 0; i < n; i++) {
            (bool ok, ) = to[i].call{value: value[i]}(data[i]);
            require(ok, "CALL_FAILED");
        }
    }
}

2. Designate the Contract on an EOA (Type-4 tx) with viem

Modern clients like viem have built-in helpers to sign authorizations and send Type-4 transactions. In this example, a relayer account pays the gas to upgrade an eoa.

import { createWalletClient, http, encodeFunctionData } from "viem";
import { sepolia } from "viem/chains";
import { privateKeyToAccount } from "viem/accounts";
import { abi, implementationAddress } from "./DelegatedAccountABI";

// 1. Define the relayer (sponsors gas) and the EOA to be upgraded
const relayer = privateKeyToAccount(process.env.RELAYER_PK as `0x${string}`);
const eoa = privateKeyToAccount(process.env.EOA_PK as `0x${string}`);

const client = createWalletClient({
  account: relayer,
  chain: sepolia,
  transport: http(),
});

// 2. The EOA signs the authorization pointing to the implementation contract
const authorization = await client.signAuthorization({
  account: eoa,
  contractAddress: implementationAddress,
  // If the EOA itself were sending this, you would add: executor: 'self'
});

// 3. The relayer sends a Type-4 transaction to set the EOA's code and call initialize()
const hash = await client.sendTransaction({
  to: eoa.address, // The destination is the EOA itself
  authorizationList: [authorization], // The new EIP-7702 field
  data: encodeFunctionData({ abi, functionName: "initialize" }),
});

// 4. Now, the EOA can be controlled via its new logic without further authorizations
// For example, to execute a transaction:
// await client.sendTransaction({
//   to: eoa.address,
//   data: encodeFunctionData({ abi, functionName: 'execute', args: [...] })
// });

3. Revoke Delegation (Back to Plain EOA)

To undo the upgrade, have the EOA sign an authorization that designates the zero address as the implementation and send another Type-4 transaction. Afterward, a call to eth_getCode(eoa.address) should return empty bytes.


Integration Patterns That Work in Production

  • Upgrade in Place for Existing Users: In your dapp, detect if the user is on a Pectra-compatible network. If so, display an optional "Upgrade Account" button that triggers the one-time authorization signature. Maintain fallback paths (e.g., classic approve + swap) for users with older wallets.
  • Gasless Onboarding: Use a relayer (either your backend or a service) to sponsor the initial Type-4 transaction. For ongoing gasless transactions, route user operations through an ERC-4337 bundler to leverage existing paymasters and public mempools.
  • Cross-Chain Rollouts: Use a chain_id = 0 authorization to designate the same implementation contract across all chains. You can then enable or disable features on a per-chain basis within your application logic.
  • Observability: Your backend should index Type-4 transactions and parse the authorization_list to track which EOAs have been upgraded. After a transaction, verify the change by calling eth_getCode and confirming the EOA's code now matches the delegation indicator (0xef0100 || implementationAddress), as in the on-chain sketch below.
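
The same verification can also be done from a contract. Below is a minimal, unaudited sketch (the library name is ours) that reads the 23-byte designator directly, relying on the EIP-7702 rule that code-reading opcodes return the designator for delegated accounts:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Unaudited sketch: detects an EIP-7702 delegation indicator on an account.
library DelegationCheck {
    /// Returns (true, impl) if `account`'s code is exactly 0xef0100 || impl.
    function delegationOf(address account)
        internal
        view
        returns (bool delegated, address impl)
    {
        bytes memory code = account.code; // designator for delegated EOAs
        if (code.length != 23) return (false, address(0));
        if (code[0] != 0xef || code[1] != 0x01 || code[2] != 0x00) {
            return (false, address(0));
        }
        assembly {
            // Skip the 32-byte length word plus the 3-byte prefix, then keep
            // the top 20 bytes of the loaded word as the implementation address.
            impl := shr(96, mload(add(code, 35)))
        }
        delegated = true;
    }
}

Empty code (length zero) after a revocation transaction confirms the account is back to a plain EOA.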

Threat Model & Gotchas (Don’t Skip This)

  • Delegation is Persistent: Treat changes to an EOA's implementation contract with the same gravity as a standard smart contract upgrade. This requires audits, clear user communication, and ideally, an opt-in flow. Never push new logic to users silently.
  • tx.origin Landmines: Any logic that used msg.sender == tx.origin to ensure a call came directly from an EOA is now potentially vulnerable. This pattern must be replaced with more robust checks, like EIP-712 signatures or explicit allowlists.
  • Nonce Math: When an EOA sponsors its own 7702 transaction (executor: 'self'), its authorization nonce and transaction nonce interact in a specific way. Always use a library that correctly handles this to avoid replay issues.
  • Wallet UX Responsibility: The EIP-7702 specification warns that dapps should not ask users to sign arbitrary designations. It is the wallet's responsibility to vet proposed implementations and ensure they are safe. Design your UX to align with this principle of wallet-mediated security.

When 7702 is a Clear Win

  • DEX Flows: A multi-step approve and swap can be combined into a single click using the executeBatch function (see the sketch after this list).
  • Games & Sessions: Grant session-key-like privileges for a limited time or scope without requiring the user to create and fund a new wallet.
  • Enterprise & Fintech: Enable sponsored transactions and apply custom spending policies while keeping the same corporate address on every chain for accounting and identity.
  • L2 Bridges & Intents: Create smoother meta-transaction flows with a consistent EOA identity across different networks.

These use cases represent the same core benefits promised by ERC-4337, but are now available to every existing EOA with just a single authorization.
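
To ground the DEX example above, here is a small sketch (the router interface is hypothetical; any single-call DEX works the same way) that packs an approve plus a swap into the (to, value, data) arrays that DelegatedAccount.executeBatch expects:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

interface IERC20 {
    function approve(address spender, uint256 amount) external returns (bool);
}

/// Hypothetical router interface, for illustration only.
interface IRouter {
    function swapExactTokensForTokens(uint256 amountIn, uint256 minOut, address to) external;
}

/// Sketch: builds the batch an upgraded EOA executes as one transaction.
library BatchBuilder {
    function approveAndSwap(
        address token,
        address router,
        uint256 amountIn,
        uint256 minOut,
        address recipient
    ) internal pure returns (address[] memory to, uint256[] memory value, bytes[] memory data) {
        to = new address[](2);
        value = new uint256[](2); // no ETH attached to either call
        data = new bytes[](2);

        to[0] = token;
        data[0] = abi.encodeCall(IERC20.approve, (router, amountIn));

        to[1] = router;
        data[1] = abi.encodeCall(IRouter.swapExactTokensForTokens, (amountIn, minOut, recipient));
    }
}

A dapp would typically build the same arrays off-chain (e.g. with encodeFunctionData) and submit a single transaction to the upgraded EOA.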


Ship Checklist

Protocol

  • Ensure nodes, SDKs, and infrastructure providers support Type-4 transactions and Pectra's "prague" EVM.
  • Update indexers and analytics tools to parse the authorization_list field in new transactions.

Contracts

  • Develop a minimal, audited implementation contract with essential features (e.g., batching, revocation).
  • Thoroughly test revoke and re-designate flows on testnets before deploying to mainnet.

Clients

  • Upgrade client-side libraries (viem, ethers, etc.) and test the signAuthorization and sendTransaction functions.
  • Verify that both self-sponsored and relayed transaction paths handle nonces and replays correctly.

Security

  • Remove all assumptions based on tx.origin from your contracts and replace them with safer alternatives.
  • Implement post-deployment monitoring to detect unexpected code changes at user addresses and alert on suspicious activity.

Bottom line: EIP-7702 provides a low-friction on-ramp to smart-account UX for the millions of EOAs already in use. Start with a tiny, audited implementation, use a relayed path for gasless setup, make revocation clear and easy, and you can deliver 90% of the benefits of full account abstraction—without the pain of address churn and asset migration.