
8 posts tagged with "Security"

Cybersecurity, smart contract audits, and best practices


The WaaS Infrastructure Revolution: How Embedded Wallets Are Reshaping Web3 Adoption

· 35 min read
Dora Noda
Software Engineer

Wallet-as-a-Service has emerged as the critical missing infrastructure layer enabling mainstream Web3 adoption. The market is experiencing explosive 30% compound annual growth toward $50 billion by 2033, driven by three converging forces: account abstraction eliminating seed phrases, multi-party computation solving the custody trilemma, and social login patterns bridging Web2 to Web3. With 103 million smart account operations executed in 2024—a 1,140% surge from 2023—and major acquisitions including Stripe's purchase of Privy and Fireblocks' $90 million Dynamic acquisition, the infrastructure landscape has reached an inflection point. WaaS now powers everything from Axie Infinity's play-to-earn economy (serving millions in the Philippines) to NBA Top Shot's $500 million marketplace, while institutional players like Fireblocks secure over $10 trillion in digital asset transfers annually. This research provides actionable intelligence for builders navigating the complex landscape of security models, regulatory frameworks, blockchain support, and emerging innovations reshaping digital asset infrastructure.

Security architecture: MPC and TEE emerge as the gold standard

The technical foundation of modern WaaS revolves around three architectural paradigms, with multi-party computation combined with trusted execution environments representing the current security apex. Fireblocks' MPC-CMP algorithm delivers 8x speed improvements over traditional approaches while distributing key shares across multiple parties—the complete private key never exists at any point during generation, storage, or signing. Turnkey's fully TEE-based architecture using AWS Nitro Enclaves pushes this further, with five specialized enclave applications, written entirely in Rust, operating under a zero-trust model where even the database is considered untrusted.

The performance metrics validate this approach. Modern MPC protocols achieve 100-500 millisecond signing latency for 2-of-3 threshold signatures, enabling consumer-grade experiences while maintaining institutional security. Fireblocks processes millions of operations daily, while Turnkey guarantees 99.9% uptime with sub-second transaction signing. This represents a quantum leap from traditional HSM-only approaches, which create single points of failure despite hardware-level protection.
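To build intuition for why threshold signing removes the single point of failure, here is a deliberately simplified TypeScript toy using additive key shares. This is a linear sketch for illustration only—real MPC-TSS protocols such as GG20 or Fireblocks' MPC-CMP add interactive rounds, commitments, and zero-knowledge proofs—but it shows the core property: each party applies its share independently, and the combined result matches what the full key would have produced without the full key ever existing in one place.

```typescript
// Toy additive key-sharing: the implicit key is (share1 + share2) mod N,
// but neither party ever computes it.
const N = 2n ** 61n - 1n;                      // toy group order
const mod = (a: bigint) => ((a % N) + N) % N;

const share1 = 1234567890123456789n % N;       // held by party 1 (e.g. server enclave)
const share2 = 9876543219876543219n % N;       // held by party 2 (e.g. user device)

const h = 424242n;                             // stand-in for a message digest
const partial1 = mod(share1 * h);              // computed by party 1 alone
const partial2 = mod(share2 * h);              // computed by party 2 alone

const combined = mod(partial1 + partial2);
const fullKeyResult = mod(mod(share1 + share2) * h); // never done in practice
console.log(combined === fullKeyResult);       // true: shares compose without meeting
```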

Smart contract wallets via ERC-4337 present a complementary paradigm focused on programmability over distributed key management. The 103 million UserOperations executed in 2024 demonstrate real traction, with 87% utilizing Paymasters to sponsor gas fees—directly addressing the onboarding friction that has plagued Web3. Alchemy deployed 58% of new smart accounts, while Coinbase processed over 30 million UserOps, primarily on Base. The August 2024 peak of 18.4 million monthly operations signals growing mainstream readiness, though the 4.3 million repeat users indicate retention challenges remain.

Each architecture presents distinct trade-offs. MPC wallets deliver universal blockchain support through curve-based signing, appearing as standard single signatures on-chain with minimal gas overhead. Smart contract wallets enable sophisticated features like social recovery, session keys, and batch transactions but incur higher gas costs and require chain-specific implementations. Traditional HSM approaches like Magic's AWS KMS integration provide battle-tested security infrastructure but introduce centralized trust assumptions incompatible with true self-custody requirements.

The security model comparison reveals why enterprises favor MPC-TSS combined with TEE protection. Turnkey's architecture with cryptographic attestation for all enclave code ensures verifiable security properties impossible with traditional cloud deployments. Web3Auth's distributed network approach splits keys across Torus Network nodes plus user devices, achieving non-custodial security through distributed trust rather than hardware isolation. Dynamic's TSS-MPC with flexible threshold configurations allows dynamic adjustment from 2-of-3 to 3-of-5 without address changes, providing operational flexibility enterprises require.

Key recovery mechanisms have evolved beyond seed phrases into sophisticated social recovery and automated backup systems. Safe's RecoveryHub implements smart contract-based guardian recovery with configurable time delays, supporting self-custodial configurations with hardware wallets or institutional third-party recovery through partners like Coincover and Sygnum. Web3Auth's off-chain social recovery avoids gas costs entirely while enabling device share plus guardian share reconstruction. Coinbase's publicly verifiable backups use cryptographic proofs ensuring backup integrity before enabling transactions, preventing the catastrophic loss scenarios that plagued early custody solutions.
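The "device share plus guardian share" pattern is typically built on Shamir-style secret sharing. The following is a toy TypeScript sketch over a small prime field, assuming a 2-of-3 split (device, guardian, backup); production systems use vetted libraries, secure randomness, and large fields such as the secp256k1 group order.

```typescript
// Toy Shamir secret sharing: any 2 of 3 shares reconstruct the secret.
const P = 2n ** 127n - 1n; // a Mersenne prime, fine for a demo
const mod = (a: bigint): bigint => ((a % P) + P) % P;

// Modular inverse via Fermat's little theorem (P is prime).
function inv(a: bigint): bigint {
  let result = 1n, base = mod(a), e = P - 2n;
  while (e > 0n) {
    if (e & 1n) result = mod(result * base);
    base = mod(base * base);
    e >>= 1n;
  }
  return result;
}

// Split `secret` into n shares; any k reconstruct it.
function split(secret: bigint, k: number, n: number): [bigint, bigint][] {
  const coeffs = [secret]; // degree k-1 polynomial, constant term = secret
  for (let i = 1; i < k; i++) {
    coeffs.push(BigInt(Math.floor(Math.random() * 1e15))); // toy randomness only
  }
  return Array.from({ length: n }, (_, idx) => {
    const x = BigInt(idx + 1);
    let y = 0n;
    for (let j = coeffs.length - 1; j >= 0; j--) y = mod(y * x + coeffs[j]);
    return [x, y] as [bigint, bigint];
  });
}

// Lagrange interpolation at x = 0 recovers the secret.
function reconstruct(shares: [bigint, bigint][]): bigint {
  let secret = 0n;
  for (const [xi, yi] of shares) {
    let num = 1n, den = 1n;
    for (const [xj] of shares) {
      if (xj === xi) continue;
      num = mod(num * -xj);
      den = mod(den * (xi - xj));
    }
    secret = mod(secret + yi * num * inv(den));
  }
  return secret;
}

const key = 123456789123456789n;                 // stand-in for a wallet key
const [device, guardian, backup] = split(key, 2, 3);
console.log(reconstruct([device, guardian]) === key); // true: 2 of 3 suffice
```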

Security vulnerabilities in the 2024 threat landscape underscore why defense-in-depth approaches are non-negotiable. With 44,077 CVEs disclosed in 2024—a 33% increase from 2023—and average exploitation occurring just 5 days after disclosure, WaaS infrastructure must anticipate constant adversary evolution. Frontend compromise attacks like the BadgerDAO $120 million theft via malicious script injection demonstrate why Turnkey's TEE-based authentication eliminates trust in the web application layer entirely. The WalletConnect fake app stealing $70,000 through Google Play impersonation highlights protocol-level verification requirements, now standard in leading implementations.

Market landscape: Consolidation accelerates as Web2 giants enter

The WaaS provider ecosystem has crystallized around distinct positioning strategies, with Stripe's Privy acquisition and Fireblocks' $90 million Dynamic purchase signaling the maturation phase where strategic buyers consolidate capabilities. The market now segments cleanly between institutional-focused providers emphasizing security and compliance, versus consumer-facing solutions optimizing for seamless onboarding and Web2 integration patterns.

Fireblocks dominates the institutional segment with an $8 billion valuation and over $1 trillion in secured assets annually, serving 500+ institutional customers including banks, exchanges, and hedge funds. The company's acquisition of Dynamic represents vertical integration from custody infrastructure into consumer-facing embedded wallets, creating a full-stack solution spanning enterprise treasury management to retail applications. Fireblocks' MPC-CMP technology secures 130+ million wallets with SOC 2 Type II certification and insurance policies covering assets in storage and transit—critical requirements for regulated financial institutions.

Privy's trajectory from $40 million in funding to Stripe acquisition exemplifies the consumer wallet path. Supporting 75 million wallets across 1,000+ developer teams before acquisition, Privy excelled at React-focused integration with email and social login patterns familiar to Web2 developers. The Privy deal follows Stripe's $1.1 billion Bridge acquisition for stablecoin infrastructure, signaling a comprehensive crypto payments stack combining fiat on-ramps, stablecoins, and embedded wallets. This vertical integration mirrors Coinbase's strategy with their Base L2 plus embedded wallet infrastructure targeting "hundreds of millions of users."

Turnkey carved out differentiation through developer-first, open-source infrastructure with AWS Nitro Enclave security. Raising $50+ million including a $30 million Series B from Bain Capital Crypto, Turnkey powers Polymarket, Magic Eden, Alchemy, and Worldcoin with sub-second signing and 99.9% uptime guarantees. The open-source QuorumOS and comprehensive SDK suite appeal to developers building custom experiences requiring infrastructure-level control rather than opinionated UI components.

Web3Auth achieves remarkable scale with 20+ million monthly active users across 10,000+ applications, leveraging blockchain-agnostic architecture supporting 19+ social login providers. The distributed MPC approach with keys split across Torus Network nodes plus user devices enables true non-custodial wallets while maintaining Web2 UX patterns. At $69 monthly for the Growth plan versus Magic's $499 for comparable features, Web3Auth targets developer-led adoption through aggressive pricing and comprehensive platform support including Unity and Unreal Engine for gaming.

Dfns represents the fintech specialization strategy, partnering with Fidelity International, Standard Chartered's Zodia Custody, and ADQ's Tungsten Custody. Their $16 million Series A in January 2025 from Further Ventures/ADQ validates the institutional banking focus, with EU DORA and US FISMA regulatory alignment plus SOC-2 Type II certification. Supporting 40+ blockchains including Cosmos ecosystem chains, Dfns processes over $1 billion in monthly transaction volume with 300% year-over-year growth since 2021.

Particle Network's full-stack chain abstraction approach differentiates through Universal Accounts providing a single address across 65+ blockchains with automatic cross-chain liquidity routing. The modular L1 blockchain (Particle Chain) coordinates multi-chain operations, enabling users to spend assets on any chain without manual bridging. BTC Connect launched as the first Bitcoin account abstraction implementation, demonstrating technical innovation beyond Ethereum-centric solutions.

The funding landscape reveals investor conviction in WaaS infrastructure as foundational Web3 building blocks. Fireblocks raised $1.04 billion over six rounds including a $550 million Series E at $8 billion valuation, backed by Sequoia Capital, Paradigm, and D1 Capital Partners. Turnkey, Privy, Dynamic, Portal, and Dfns collectively raised over $150 million in 2024-2025, with top-tier investors including a16z crypto, Bain Capital Crypto, Ribbit Capital, and Coinbase Ventures participating across multiple deals.

Partnership activity indicates ecosystem maturation. IBM's Digital Asset Haven partnership with Dfns targets transaction lifecycle management for banks and governments across 40 blockchains. McDonald's integration with Web3Auth for NFT collectibles (2,000 NFTs claimed in 15 minutes) demonstrates major Web2 brand adoption. Biconomy's support for Dynamic, Particle, Privy, Magic, Dfns, Capsule, Turnkey, and Web3Auth shows account abstraction infrastructure providers enabling interoperability across competing wallet solutions.

Developer experience: Integration time collapses from months to hours

The developer experience revolution in WaaS manifests through comprehensive SDK availability, with Web3Auth leading at 13+ framework support including JavaScript, React, Next.js, Vue, Angular, Android, iOS, React Native, Flutter, Unity, and Unreal Engine. This platform breadth enables identical wallet experiences across web, mobile native, and gaming environments—critical for applications spanning multiple surfaces. Privy focuses more narrowly on React ecosystem dominance with Next.js and Expo support, accepting framework limitations for deeper integration quality within that stack.

Integration time claims by major providers suggest the infrastructure has reached plug-and-play maturity. Web3Auth documents 15-minute basic integration with 4 lines of code, validated through integration builder tools generating ready-to-deploy code. Privy and Dynamic advertise similar timeframes for React-based applications, while Magic's npx make-magic scaffolding tool accelerates project setup. Only enterprise-focused Fireblocks and Turnkey quote days-to-weeks timelines, reflecting custom implementation requirements for institutional policy engines and compliance frameworks rather than SDK limitations.

API design converged around RESTful architectures rather than GraphQL, with webhook-based event notifications replacing persistent WebSocket connections across major providers. Turnkey's activity-based API model treats all actions as activities flowing through a policy engine, enabling granular permissions and comprehensive audit trails. Web3Auth's RESTful endpoints integrate with Auth0, AWS Cognito, and Firebase for federated identity, supporting custom JWT authentication for bring-your-own-auth scenarios. Dynamic's environment-based configuration through a developer dashboard balances ease-of-use with flexibility for multi-environment deployments.
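The webhook pattern these providers converged on is worth making concrete. The sketch below is a hypothetical Node/TypeScript receiver—the header name, secret, and event shape are assumptions, not any specific provider's contract—showing the one step that matters: verifying an HMAC signature over the raw body before trusting the payload.

```typescript
// Hypothetical WaaS webhook receiver with HMAC verification.
import { createHmac, timingSafeEqual } from "node:crypto";
import { createServer } from "node:http";

const WEBHOOK_SECRET = process.env.WEBHOOK_SECRET ?? "dev-secret";

createServer((req, res) => {
  let raw = "";
  req.on("data", (chunk) => (raw += chunk));
  req.on("end", () => {
    // Recompute the signature over the exact bytes received.
    const expected = createHmac("sha256", WEBHOOK_SECRET).update(raw).digest("hex");
    const given = String(req.headers["x-signature"] ?? ""); // assumed header name
    const ok =
      given.length === expected.length &&
      timingSafeEqual(Buffer.from(given), Buffer.from(expected));
    if (!ok) { res.writeHead(401).end(); return; }

    const event = JSON.parse(raw); // e.g. { type: "wallet.created", ... }
    console.log("verified event:", event.type);
    res.writeHead(200).end("ok");
  });
}).listen(8080);
```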

Documentation quality separates leading providers from competitors. Web3Auth's integration builder generates framework-specific starter code, reducing cognitive load for developers unfamiliar with Web3 patterns. Turnkey's AI-ready documentation structure optimizes for LLM ingestion, enabling developers using Cursor or GPT-4 to receive accurate implementation guidance. Dynamic's CodeSandbox demos and multiple framework examples provide working references. Privy's starter templates and demo applications accelerate React integration, though less comprehensive than blockchain-agnostic competitors.

Onboarding flow options reveal strategic positioning through authentication method emphasis. Web3Auth's 19+ social login providers including Google, Twitter, Discord, GitHub, Facebook, Apple, LinkedIn, and regional options like WeChat, Kakao, and Line position for global reach. Custom JWT authentication enables enterprises to integrate existing identity systems. Privy emphasizes email-first with magic links, treating social logins as secondary options. Magic pioneered the magic link approach but now competes with more flexible alternatives. Turnkey's passkey-first architecture using WebAuthn standards positions for the passwordless future, supporting biometric authentication via Face ID, Touch ID, and hardware security keys.

Security model trade-offs emerge through key management implementations. Web3Auth's distributed MPC with Torus Network nodes plus user devices achieves non-custodial security through cryptographic distribution rather than centralized trust. Turnkey's AWS Nitro Enclave isolation ensures keys never leave hardware-protected environments, with cryptographic attestation proving code integrity. Privy's Shamir Secret Sharing approach splits keys across device and authentication factors, reconstructing only in isolated iframes during transaction signing. Magic's AWS HSM storage with AES-256 encryption accepts centralized key management trade-offs for operational simplicity, suitable for enterprise Web2 brands prioritizing convenience over self-custody.

White-labeling capabilities determine applicability for branded applications. Web3Auth offers the most comprehensive customization at accessible pricing ($69 monthly Growth plan), enabling modal and non-modal SDK options with full UI control. Turnkey's pre-built Embedded Wallet Kit balances convenience with low-level API access for custom interfaces. Dynamic's dashboard-based design controls streamline appearance configuration without code changes. The customization depth directly impacts whether WaaS infrastructure remains visible to end users or disappears behind brand-specific interfaces.

Code complexity analysis reveals the abstraction achievements. Web3Auth's modal integration requires just four lines—import, initialize with client ID, call initModal, then connect. Privy's React Provider wrapper approach integrates naturally with React component trees while maintaining isolation. Turnkey's more verbose setup reflects flexibility prioritization, with explicit configuration of organization IDs, passkey clients, and policy parameters. This complexity spectrum enables developer choice between opinionated simplicity and low-level control depending on use case requirements.
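As a rough illustration of the four-line flow described above, a minimal sketch against Web3Auth's modal SDK might look like the following. Constructor options differ across @web3auth/modal versions (some require a chainConfig or privateKeyProvider), so treat this as indicative rather than copy-paste ready.

```typescript
// Minimal Web3Auth modal sketch: import, initialize, initModal, connect.
import { Web3Auth } from "@web3auth/modal";

const web3auth = new Web3Auth({
  clientId: "YOUR_WEB3AUTH_CLIENT_ID",   // issued by the Web3Auth dashboard
  web3AuthNetwork: "sapphire_mainnet",   // Web3Auth's production network
});

await web3auth.initModal();                // render the hosted login modal
const provider = await web3auth.connect(); // EIP-1193 provider after social login
```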

Community feedback through Stack Overflow, Reddit, and developer testimonials reveals consistent patterns. Web3Auth users occasionally encounter breaking changes during version updates, typical for rapidly-evolving infrastructure. Privy's React dependency limits adoption for non-React projects, though the company consciously acknowledges this trade-off. Dynamic receives praise for responsive support, with testimonials describing the team as partners rather than vendors. Turnkey's professional documentation and Slack community appeal to teams prioritizing infrastructure understanding over managed services.

Real-world adoption: Gaming, DeFi, and NFTs drive usage at scale

Gaming applications demonstrate WaaS removing blockchain complexity at massive scale. Axie Infinity's integration with Ramp Network collapsed onboarding from 2 hours and 60 steps to just 12 minutes and 19 steps—a 90% time reduction, with the step count cut to roughly a third—enabling millions of players, particularly in the Philippines where 28.3% of traffic originates. This transformation allowed play-to-earn economics to function, with participants earning meaningful income through gaming. NBA Top Shot leveraged Dapper Wallet to onboard 800,000+ accounts generating $500+ million in sales, with credit card purchases and email login eliminating crypto complexity. The Flow blockchain's custom design for consumer-scale NFT transactions enables 9,000 transactions per second with near-zero gas fees, demonstrating infrastructure purpose-built for gaming economics.

DeFi platforms integrate embedded wallets to reduce friction from external wallet requirements. Leading decentralized exchanges like Uniswap, lending protocols like Aave, and derivatives platforms increasingly embed wallet functionality directly into trading interfaces. Fireblocks' enterprise WaaS serves exchanges, lending desks, and hedge funds requiring institutional custody combined with trading desk operations. The account abstraction wave enables gas sponsorship for DeFi applications, with 87% of ERC-4337 UserOperations utilizing Paymasters to cover $3.4 million in gas fees during 2024. This gas abstraction removes the bootstrapping problem in which new users need tokens to pay for the very transactions that acquire their first tokens.

NFT marketplaces pioneered embedded wallet adoption to reduce checkout abandonment. Immutable X's integration with Magic wallet and MetaMask provides zero gas fees through Layer-2 scaling, processing thousands of NFT transactions per second for Gods Unchained and Illuvium. OpenSea's wallet connection flows support embedded options alongside external wallet connections, recognizing user preference diversity. The Dapper Wallet approach for NBA Top Shot and VIV3 demonstrates marketplace-specific embedded wallets can capture 95%+ of secondary market activity when UX optimization removes competing friction.

Enterprise adoption validates WaaS for financial institution use cases. Worldpay's Fireblocks integration delivered 50% faster payment processing with 24/7/365 T+0 settlements, diversifying revenue through blockchain payment rails while maintaining regulatory compliance. Coinbase WaaS targets household brands including partnerships with tokenproof, Floor, Moonray, and ENS Domains, positioning embedded wallets as infrastructure enabling Web2 companies to offer Web3 capabilities without blockchain engineering. Flipkart's integration with Fireblocks brings embedded wallets to India's massive e-commerce user base, while Grab in Singapore accepts crypto top-ups across Bitcoin, Ether, and stablecoins via Fireblocks infrastructure.

Consumer applications pursuing mainstream adoption rely on WaaS to abstract complexity. Starbucks Odyssey loyalty program uses custodial wallets with simplified UX for NFT-based rewards and token-gated experiences, demonstrating major retail brand Web3 experimentation. The Coinbase vision of "giving wallets to literally every human on the planet" through social media integration represents the ultimate mainstream play, with username/password onboarding and MPC key management replacing seed phrase requirements. This bridges the adoption chasm where technical complexity excludes non-technical users.

Geographic patterns reveal distinct regional adoption drivers. Asia-Pacific leads global growth with India receiving $338 billion in on-chain value during 2023-2024, driven by large diaspora remittances, young demographics, and existing UPI fintech infrastructure familiarity. Southeast Asia shows the fastest regional growth at 69% year-over-year to $2.36 trillion, with Vietnam, Indonesia, and the Philippines leveraging crypto for remittances, gaming, and savings. China's 956 million digital wallet users with 90%+ urban adult penetration demonstrate mobile payment infrastructure preparing populations for crypto integration. Latin America's 50% annual adoption increase stems from currency devaluation concerns and remittance needs, with Brazil and Mexico leading. Africa's 35% increase in active mobile money users positions the continent for leapfrogging traditional banking infrastructure through crypto wallets.

North America focuses on institutional and enterprise adoption with regulatory clarity emphasis. The US accounts for 36.92% of the global market, with 70% of online adults using digital payments, though fewer than 60% of small businesses accept digital wallets—an adoption gap WaaS providers target. Europe shows 52% of online shoppers favoring digital wallets over legacy payment methods, with MiCA regulations providing clarity enabling institutional adoption acceleration.

Adoption metrics validate market trajectory. Global digital wallet users reached 5.6 billion in 2025 with projections for 5.8 billion by 2029, representing 35% growth from 4.3 billion in 2024. Digital wallets now account for 49-56% of global e-commerce transaction value at $14-16 trillion annually. The Web3 wallet security market alone is projected to reach $68.8 billion by 2033 at 23.7% CAGR, with 820 million unique crypto addresses active in 2025. Leading providers support tens to hundreds of millions of wallets: Privy with 75 million, Dynamic with 50+ million, Web3Auth with 20+ million monthly active users, and Fireblocks securing 130+ million wallets.

Blockchain support: Universal EVM coverage with expanding non-EVM ecosystems

The blockchain ecosystem support landscape splits between providers pursuing universal coverage through curve-based architectures and those integrating chains individually. Turnkey and Web3Auth achieve blockchain-agnostic support through secp256k1 and ed25519 curve signing, automatically supporting any new blockchain utilizing these cryptographic primitives without provider intervention. This architecture future-proofs infrastructure as new chains launch—Berachain and Monad receive day-one Turnkey support through curve compatibility rather than explicit integration work.
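The curve-based argument can be seen in a few lines: the same provider-held key material serves different chains because address derivation is just a function of the curve. The sketch below uses ethers v6 and @solana/web3.js (library versions and exact import paths may differ), with throwaway key values.

```typescript
// One secp256k1 key yields an Ethereum account; one ed25519 seed yields a
// Solana account — no per-chain provider integration needed, only curve support.
import { Wallet } from "ethers";
import { Keypair } from "@solana/web3.js";

// Well-known throwaway test key, never use for real funds.
const secpKey = "0x59c6995e998f97a5a0044966f0945389dc9e86dae88c7a8412f4603b6b78690d";
console.log(new Wallet(secpKey).address);      // Ethereum address from secp256k1

const seed = new Uint8Array(32).fill(7);       // toy ed25519 seed
console.log(Keypair.fromSeed(seed).publicKey.toBase58()); // Solana address
```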

Fireblocks takes the opposite approach, maintaining explicit integrations across 80+ blockchains and adding new chains rapidly because its institutional clients require comprehensive feature support per chain. Recent additions include a Cosmos ecosystem expansion in May 2024 covering Osmosis, Celestia, dYdX, Axelar, Injective, Kava, and Thorchain. World Chain integration arrived in August 2024, and Unichain support landed immediately at its November 2024 launch. This velocity stems from a modular architecture and institutional client demand for comprehensive chain coverage including staking, DeFi protocols, and WalletConnect integration per chain.

EVM Layer-2 scaling solutions achieve universal support across major providers. Base, Arbitrum, and Optimism receive unanimous support from Magic, Web3Auth, Dynamic, Privy, Turnkey, Fireblocks, and Particle Network. Base's explosive growth as the highest-revenue Layer-2 by late 2024 validates Coinbase's infrastructure bet, with WaaS providers prioritizing integration given Base's institutional backing and developer momentum. Arbitrum maintains 40% Layer-2 market share with largest total value locked, while Optimism benefits from Superchain ecosystem effects as multiple projects deploy OP Stack rollups.

ZK-rollup support shows more fragmentation despite technical advantages. Linea achieves the highest TVL among ZK rollups at $450-700 million backed by ConsenSys, with Fireblocks, Particle Network, Web3Auth, Turnkey, and Privy providing support. zkSync Era garners Web3Auth, Privy, Turnkey, and Particle Network integration despite market share challenges following controversial token launch. Scroll receives support from Web3Auth, Turnkey, Privy, and Particle Network serving developers with 85+ integrated protocols. Polygon zkEVM benefits from Polygon ecosystem association with Fireblocks, Web3Auth, Turnkey, and Privy support. The ZK-rollup fragmentation reflects technical complexity and lower usage compared to Optimistic rollups, though long-term scalability advantages suggest increasing attention.

Non-EVM blockchain support reveals strategic positioning differences. Solana achieves near-universal support through ed25519 curve compatibility and market momentum, with Web3Auth, Dynamic, Privy, Turnkey, Fireblocks, and Particle Network providing full integration. Particle Network's Solana Universal Accounts integration demonstrates chain abstraction extending beyond EVM to high-performance alternatives. Bitcoin support appears in Dynamic, Privy, Turnkey, Fireblocks, and Particle Network offerings, with Particle's BTC Connect representing the first Bitcoin account abstraction implementation enabling programmable Bitcoin wallets without Lightning Network complexity.

Cosmos ecosystem support concentrates in Fireblocks following their May 2024 strategic expansion. Supporting Cosmos Hub, Osmosis, Celestia, dYdX, Axelar, Kava, Injective, and Thorchain with plans for Sei, Noble, and Berachain additions, Fireblocks positions for inter-blockchain communication protocol dominance. Web3Auth provides broader Cosmos compatibility through curve support, while other providers offer selective integration based on client demand rather than ecosystem-wide coverage.

Emerging layer-1 blockchains receive varying attention. Turnkey added Sui and Sei support reflecting ed25519 and Ethereum compatibility respectively. Aptos receives Web3Auth support with Privy planning Q1 2025 integration, positioning for Move language ecosystem growth. Near, Polkadot, Kusama, Flow, and Tezos appear in Web3Auth's blockchain-agnostic catalog through private key export capabilities. TON integration appeared in Fireblocks offerings targeting Telegram ecosystem opportunities. Algorand and Stellar receive Fireblocks support for institutional applications in payment and tokenization use cases.

Cross-chain architecture approaches determine future-proofing. Particle Network's Universal Accounts provide single addresses across 65+ blockchains with automatic cross-chain liquidity routing through their modular L1 coordination layer. Users maintain unified balances and spend assets on any chain without manual bridging, paying gas fees in any token. Magic's Newton network announced November 2024 integrates with Polygon's AggLayer for chain unification focused on wallet-level abstraction. Turnkey's curve-based universal support achieves similar outcomes through cryptographic primitives rather than coordination infrastructure. Web3Auth's blockchain-agnostic authentication with private key export enables developers to integrate any chain through standard libraries.

Chain-specific optimizations appear in provider implementations. Fireblocks supports staking across multiple Proof-of-Stake chains including Ethereum, Cosmos ecosystem chains, Solana, and Algorand with institutional-grade security. Particle Network optimized for gaming workloads with session keys, gasless transactions, and rapid account creation. Web3Auth's plug-and-play modal optimizes for rapid multi-chain wallet generation without customization requirements. Dynamic's wallet adapter supports 500+ external wallets across ecosystems, enabling users to connect existing wallets rather than creating new embedded accounts.

Roadmap announcements indicate continued expansion. Fireblocks committed to supporting Berachain at mainnet launch, Sei integration, and Noble for USDC-native Cosmos operations. Privy announced Aptos and Move ecosystem support for Q1 2025, expanding beyond EVM and Solana focus. Magic's Newton mainnet launch from private testnet brings AggLayer integration to production. Particle Network continues expanding Universal Accounts to additional non-EVM chains with enhanced cross-chain liquidity features. The architectural approaches suggest two paths forward: comprehensive individual integrations for institutional features versus universal curve-based support for developer flexibility and automatic new chain compatibility.

Regulatory landscape: MiCA brings clarity while US frameworks evolve

The regulatory environment for WaaS providers transformed substantially in 2024-2025 through comprehensive frameworks emerging in major jurisdictions. The EU's Markets in Crypto-Assets (MiCA) regulation taking full effect in December 2024 establishes the world's most comprehensive crypto regulatory framework, requiring Crypto Asset Service Provider authorization for any entity offering custody, transfer, or exchange services. MiCA introduces consumer protection requirements including capital reserves, operational resilience standards, cybersecurity frameworks, and conflict of interest disclosures while providing a regulatory passport enabling CASP-authorized providers to operate across all 27 EU member states.

Custody model determination drives regulatory classification and obligations. Custodial wallet providers automatically qualify as VASPs/CASPs/MSBs requiring full financial services licensing, KYC/AML programs, Travel Rule compliance, capital requirements, and regular audits. Fireblocks, Coinbase WaaS, and enterprise-focused providers deliberately accept these obligations to serve institutional clients requiring regulated counterparties. Non-custodial wallet providers like Turnkey and Web3Auth generally avoid VASP classification by demonstrating users control private keys, though must carefully structure offerings to maintain this distinction. Hybrid MPC models face ambiguous treatment depending on whether providers control majority key shares—a critical architectural decision with profound regulatory implications.

KYC/AML compliance requirements vary by jurisdiction but universally apply to custodial providers. FATF Recommendations require VASPs to implement customer due diligence, suspicious activity monitoring, and transaction reporting. Major providers integrate with specialized compliance technology: Chainalysis for transaction screening and wallet analysis, Elliptic for risk scoring and sanctions screening, Sumsub for identity verification with liveness detection and biometrics. TRM Labs, Crystal Intelligence, and Merkle Science provide complementary transaction monitoring and behavior detection. Integration approaches range from native built-in compliance (Fireblocks with integrated Elliptic/Chainalysis) to bring-your-own-key configurations letting customers use existing provider contracts.

Travel Rule compliance presents operational complexity as 65+ jurisdictions mandate VASP-to-VASP information exchange for transactions above threshold amounts (typically $1,000 USD equivalent, though Singapore requires $1,500 and Switzerland $1,000). FATF's June 2024 report found only 26% of implementing jurisdictions have taken enforcement actions, though compliance adoption accelerated as the volume of virtual asset transactions processed through Travel Rule tools increased. Providers implement through protocols including Global Travel Rule Protocol, Travel Rule Protocol, and CODE, with Notabene providing VASP directory services. Sumsub offers multi-protocol support balancing compliance across jurisdictional variations.
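A toy helper shows how the per-jurisdiction thresholds cited above translate into gating logic. The threshold table and function are illustrative assumptions only; real implementations consult maintained regulatory data and then exchange originator/beneficiary information via a protocol such as TRP or the Global Travel Rule Protocol.

```typescript
// Illustrative Travel Rule gating by jurisdiction and USD-equivalent amount.
const TRAVEL_RULE_THRESHOLDS_USD: Record<string, number> = {
  US: 1000, // typical USD-equivalent threshold
  SG: 1500, // Singapore (simplified to USD here)
  CH: 1000, // Switzerland
};

function requiresTravelRuleExchange(jurisdiction: string, amountUsd: number): boolean {
  const threshold = TRAVEL_RULE_THRESHOLDS_USD[jurisdiction];
  // Unknown jurisdiction: fail safe and require the information exchange.
  return threshold === undefined || amountUsd >= threshold;
}

console.log(requiresTravelRuleExchange("SG", 1200)); // false: below SGD-equivalent line
console.log(requiresTravelRuleExchange("US", 1200)); // true: above the $1,000 threshold
```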

The United States regulatory landscape shifted dramatically with the Trump administration's pro-crypto stance beginning January 2025. The administration's crypto task force charter established in March 2025 aims to clarify SEC jurisdiction and potentially repeal SAB 121. The GENIUS Act for stablecoin regulation and FIT21 for digital commodities advance through Congress with bipartisan support. State-level complexity persists with money transmitter licensing required in 48+ states, each with distinct capital requirements, bonding rules, and approval timelines ranging from 6-24 months. FinCEN registration as a Money Services Business provides federal baseline, supplementing rather than replacing state requirements.

Singapore's Monetary Authority maintains leadership in Asia-Pacific through Payment Services Act licensing distinguishing Standard Payment Institution licenses (≤SGD 5 million monthly) from Major Payment Institution licenses (>SGD 5 million), with SGD 250,000 minimum base capital. The August 2023 stablecoin framework specifically addresses payment-focused digital currencies, enabling Grab's crypto top-up integration and institutional partnerships like Dfns with Singapore-based custody providers. Japan's Financial Services Agency enforces strict requirements including 95% cold storage, asset segregation, and Japanese subsidiary establishment for most foreign providers. Hong Kong's Securities and Futures Commission implements ASPIRe framework with platform operator licensing and mandatory insurance requirements.

Privacy regulations create technical challenges for blockchain implementations. GDPR's right to erasure conflicts with blockchain immutability, with EDPB April 2024 guidelines recommending off-chain personal data storage, on-chain hashing for references, and encryption standards. Implementation requires separating personally identifiable information from blockchain transactions, storing sensitive data in encrypted off-chain databases controllable by users. 63% of DeFi platforms fail right to erasure compliance according to 2024 assessments, indicating technical debt many providers carry. CCPA/CPRA requirements in California largely align with GDPR principles, with 53% of US crypto firms now subject to California's framework.

Regional licensing comparison reveals substantial variation in complexity and cost. EU MiCA CASP authorization requires 6-12 months with costs varying by member state but providing 27-country passport, making single application economically efficient for European operations. US licensing combines federal MSB registration (6-month typical timeline) with 48+ state money transmitter licenses requiring 6-24 months with costs exceeding $1 million for comprehensive coverage. Singapore MAS licensing takes 6-12 months with SGD 250,000 capital for SPI, while Japan CAES registration typically requires 12-18 months with Japanese subsidiary establishment preferred. Hong Kong VASP licensing through SFC takes 6-12 months with insurance requirements, while UK FCA registration requires 6-12 months with £50,000+ capital and AML/CFT compliance.

Compliance technology costs and operational requirements create barriers to entry favoring well-funded providers. Licensing fees range from $100,000 to $1+ million across jurisdictions, while annual compliance technology subscriptions cost $50,000-500,000 for KYC, AML, and transaction monitoring tools. Legal and consulting expenses typically reach $200,000-1,000,000+ annually for multi-jurisdictional operations, with dedicated compliance teams costing $500,000-2,000,000+ in personnel expenses. Regular audits and certifications (SOC 2 Type II, ISO 27001) add $50,000-200,000 annually. Total compliance infrastructure commonly exceeds $2-5 million in first-year setup costs for multi-jurisdictional providers, creating moats around established players while limiting new entrant competition.

Innovation frontiers: Account abstraction and AI reshape wallet paradigms

Account abstraction represents the most transformative infrastructure innovation since Ethereum's launch, with ERC-4337 UserOperations surging 1,140% to 103 million in 2024 compared to 8.3 million in 2023. The standard introduces smart contract wallets without requiring protocol changes, enabling gas sponsorship, batched transactions, social recovery, and session keys through a parallel transaction execution system. Bundlers aggregate UserOperations into single transactions submitted to the EntryPoint contract, with Coinbase processing 30+ million operations primarily on Base, Alchemy deploying 58% of new smart accounts, and Pimlico, Biconomy, and Particle providing complementary infrastructure.

Paymaster adoption demonstrates the viability of gas sponsorship as a killer application. 87% of all UserOperations utilized Paymasters to sponsor gas fees, covering $3.4 million in transaction costs during 2024. This gas abstraction solves the bootstrapping problem where users need tokens to pay for acquiring their first tokens, enabling true frictionless onboarding. Verifying Paymasters link off-chain verification to on-chain execution, while Depositing Paymasters maintain on-chain balances covering batched user operations. Multi-round validation enables sophisticated spending policies without users managing gas strategies.
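For reference, the UserOperation structure these paragraphs describe looks roughly as follows (field names follow the v0.6 EntryPoint interface; v0.7 splits several fields). A nonempty paymasterAndData is what asks the named paymaster to sponsor gas—the mechanism behind the 87% figure above. All values are placeholders.

```typescript
// ERC-4337 UserOperation shape (v0.6 field names).
interface UserOperation {
  sender: string;              // smart account address
  nonce: bigint;
  initCode: string;            // factory address + calldata on first deployment, else "0x"
  callData: string;            // the action the account should execute
  callGasLimit: bigint;
  verificationGasLimit: bigint;
  preVerificationGas: bigint;
  maxFeePerGas: bigint;
  maxPriorityFeePerGas: bigint;
  paymasterAndData: string;    // "0x" = user pays gas; paymaster address + data = sponsored
  signature: string;
}

// Bundlers accept signed UserOperations over JSON-RPC (eth_sendUserOperation)
// and batch them into a single call to the EntryPoint contract, which invokes
// the paymaster's validatePaymasterUserOp during validation.
```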

EIP-7702 launched with the Pectra upgrade on May 7, 2025, introducing Type 4 transactions enabling EOAs to delegate code execution to smart contracts. This bridges account abstraction benefits to existing externally-owned accounts without requiring asset migration or new address generation. Users maintain original addresses while gaining smart contract capabilities selectively, with MetaMask, Rainbow, and Uniswap implementing initial support. The authorization list mechanism enables temporary or permanent delegation, backward compatible with ERC-4337 infrastructure while solving adoption friction from account migration requirements.
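The authorization list mechanism mentioned above has a simple shape, sketched here in TypeScript. The EOA signs over (chainId, address, nonce) and the chain installs a delegation designator pointing the account's code at the target contract; field names follow the EIP, values are placeholders.

```typescript
// Shape of one EIP-7702 authorization entry in a Type 4 (0x04) transaction.
interface Eip7702Authorization {
  chainId: bigint;   // 0n = the authorization is valid on any chain
  address: string;   // contract whose code the EOA delegates to
  nonce: bigint;     // the EOA's nonce at signing time
  yParity: 0 | 1;    // EOA's signature over the (chainId, address, nonce) tuple
  r: string;
  s: string;
}

// A Type 4 transaction carries one or more such entries alongside the usual
// EIP-1559 fields, so an existing account gains smart-wallet behavior without
// migrating assets to a new address.
```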

Passkey integration eliminates seed phrases as authentication primitives, with biometric device security replacing memorization and physical backup requirements. Coinbase Smart Wallet pioneered at-scale passkey wallet creation using WebAuthn/FIDO2 standards, though security audits identified concerns around user verification requirements and Windows 11 device-bound passkey cloud sync limitations. Web3Auth, Dynamic, Turnkey, and Portal implement passkey-authorized MPC sessions where biometric authentication controls wallet access and transaction signing without directly exposing private keys. EIP-7212 precompile support for P-256 signature verification reduces gas costs for passkey transactions on Ethereum and compatible chains.

The technical challenge of passkey-blockchain integration stems from curve incompatibilities. WebAuthn uses P-256 (secp256r1) curves while most blockchains expect secp256k1 (Ethereum, Bitcoin) or ed25519 (Solana). Direct passkey signing would require expensive on-chain verification or protocol modifications, so most implementations use passkeys to authorize MPC operations rather than direct transaction signing. This architecture maintains security properties while achieving cryptographic compatibility across blockchain ecosystems.
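Concretely, passkey enrollment uses the standard WebAuthn browser API. The sketch below requests the P-256 algorithm (COSE identifier -7), matching the curve discussion above; the resulting credential then typically authorizes MPC signing sessions rather than signing chain transactions directly. The rp and user values are placeholders, and in practice the challenge comes from the server.

```typescript
// Browser-side passkey creation with WebAuthn.
const credential = await navigator.credentials.create({
  publicKey: {
    challenge: crypto.getRandomValues(new Uint8Array(32)), // server-issued nonce in practice
    rp: { name: "Example Wallet", id: "wallet.example.com" },
    user: {
      id: new TextEncoder().encode("user-123"),
      name: "user@example.com",
      displayName: "Example User",
    },
    pubKeyCredParams: [{ type: "public-key", alg: -7 }], // ES256 on P-256 (secp256r1)
    authenticatorSelection: {
      residentKey: "required",      // discoverable passkey
      userVerification: "required", // biometric / PIN check
    },
  },
});
console.log(credential?.id); // credential ID to register with the wallet backend
```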

AI integration transforms wallets from passive key storage into intelligent financial assistants. The AI in FinTech market projects growth from $14.79 billion in 2024 to $43.04 billion by 2029 at 23.82% CAGR, with crypto wallets representing substantial adoption. Fraud detection leverages machine learning for anomaly detection, behavioral pattern analysis, and real-time phishing identification—MetaMask's Wallet Guard integration exemplifies AI-powered threat prevention. Transaction optimization through predictive gas fee models analyzing network congestion, optimal timing recommendations, and MEV protection delivers measurable cost savings averaging 15-30% versus naive timing.

Portfolio management AI features include asset allocation recommendations, risk tolerance profiling with automatic rebalancing, yield farming opportunity identification across DeFi protocols, and performance analytics with trend prediction. Rasper AI markets itself as the first self-custodial AI wallet with portfolio advisor functionality, real-time threat and volatility alerts, and multi-currency behavioral trend tracking. ASI Wallet from Fetch.ai provides privacy-focused AI-native experiences with portfolio tracking and predictive insights integrated with Cosmos ecosystem agent-based interactions.

Natural language interfaces represent the killer application for mainstream adoption. Conversational AI enables users to execute transactions through voice or text commands without understanding blockchain mechanics—"send 10 USDC to Alice" automatically resolves names, checks balances, estimates gas, and executes across appropriate chains. The Zebu Live panel featuring speakers from Base, Rhinestone, Zerion, and Askgina.ai articulated the vision: future users won't think about gas fees or key management, as AI handles complexity invisibly. Intent-based architectures where users specify desired outcomes rather than transaction mechanics shift cognitive load from users to protocol infrastructure.

Zero-knowledge proof adoption accelerates through Google's ZKP integration announced May 2, 2025 for age verification in Google Wallet, with open-source libraries released July 3, 2025 via github.com/google/longfellow-zk. Users prove attributes like age over 18 without revealing birthdates, with first partner Bumble implementing for dating app verification. EU eIDAS regulation encouraging ZKP in European Digital Identity Wallet planned for 2026 launch drives standardization. The expansion targets 50+ countries for passport validation, health service access, and attribute verification while maintaining privacy.

Layer-2 ZK rollup adoption demonstrates scalability breakthroughs. Polygon zkEVM TVL surpassed $312 million in Q1 2025 representing 240% year-over-year growth, while zkSync Era saw 276% increase in daily transactions. StarkWare's S-two mobile prover enables local proof generation on laptops and phones, democratizing ZK proof creation beyond specialized hardware. ZK-rollups bundle hundreds of transactions into single proofs verified on-chain, delivering 100-1000x scalability improvements while maintaining security properties through cryptographic guarantees rather than optimistic fraud proof assumptions.

Quantum-resistant cryptography research intensifies as threat timelines crystallize. NIST standardized post-quantum algorithms including CRYSTALS-Kyber for key encapsulation and CRYSTALS-Dilithium for digital signatures in November 2024, with SEALSQ's QS7001 Secure Element launching May 21, 2025 as the first Bitcoin hardware wallet implementing NIST-compliant post-quantum cryptography. The hybrid approach combining ECDSA and Dilithium signatures enables backward compatibility during transition periods. BTQ Technologies' Bitcoin Quantum launched October 2025 as the first NIST-compliant quantum-safe Bitcoin implementation capable of 1 million+ post-quantum signatures per second.

Decentralized identity standards mature toward mainstream adoption. W3C DID specifications define globally unique, user-controlled identifiers blockchain-anchored for immutability without central authorities. Verifiable Credentials enable digital, cryptographically-signed credentials issued by trusted entities, stored in user wallets, and verified without contacting issuers. The European Digital Identity Wallet launching 2026 will require EU member states to provide interoperable cross-border digital ID with ZKP-based selective disclosure, potentially impacting 450+ million residents. Digital identity market projections reach $200+ billion by 2034, with 25-35% of digital IDs expected to be decentralized by 2035 as 60% of countries explore decentralized frameworks.

Cross-chain interoperability protocols address fragmentation across 300+ blockchain networks. Chainlink CCIP integrated 60+ blockchains as of 2025, leveraging battle-tested Decentralized Oracle Networks securing $100+ billion TVL for token-agnostic secure transfers. Recent integrations include Stellar through Chainlink Scale and TON for Toncoin cross-chain transfers. Arcana Chain Abstraction SDK launched January 2025 provides unified balances across Ethereum, Polygon, Arbitrum, Base, and Optimism with stablecoin gas payments and automatic liquidity routing. Particle Network's Universal Accounts deliver single addresses across 65+ chains with intent-based transaction execution abstracting chain selection entirely from user decisions.

Price comparisons

| Wallets | thirdweb | Privy | Dynamic | Web3Auth | Magic Link |
|---|---|---|---|---|---|
| 10,000 | $150 Total ($0.015/wallet) | $499 Total ($0.049/wallet) | $500 Total ($0.05/wallet) | $400 Total ($0.04/wallet) | $500 Total ($0.05/wallet) |
| 100,000 | $1,485 Total ($0.01485/wallet) | Enterprise pricing (talk to sales) | $5,000 Total ($0.05/wallet) | $4,000 Total ($0.04/wallet) | $5,000 Total ($0.05/wallet) |
| 1,000,000 | $10,485 Total ($0.0104/wallet) | Enterprise pricing (talk to sales) | $50,000 Total ($0.05/wallet) | $40,000 Total ($0.04/wallet) | $50,000 Total ($0.05/wallet) |
| 10,000,000 | $78,000 Total ($0.0078/wallet) | Enterprise pricing (talk to sales) | Enterprise pricing (talk to sales) | $400,000 Total ($0.04/wallet) | Enterprise pricing (talk to sales) |
| 100,000,000 | $528,000 Total ($0.00528/wallet) | Enterprise pricing (talk to sales) | Enterprise pricing (talk to sales) | $4,000,000 Total ($0.04/wallet) | Enterprise pricing (talk to sales) |

Strategic imperatives for builders and enterprises

WaaS infrastructure selection requires evaluating security models, regulatory positioning, blockchain coverage, and developer experience against specific use case requirements. Institutional applications prioritize Fireblocks or Turnkey for SOC 2 Type II certification, comprehensive audit trails, policy engines enabling multi-approval workflows, and established regulatory relationships. Fireblocks' $8 billion valuation and $10+ trillion in secured transfers provide institutional credibility, while Turnkey's AWS Nitro Enclave architecture and open-source approach appeals to teams requiring infrastructure transparency.

Consumer applications optimize for conversion rates through frictionless onboarding. Privy excels for React-focused teams requiring rapid integration with email and social login, now backed by Stripe's resources and payment infrastructure. Web3Auth provides blockchain-agnostic support for teams targeting multiple chains and frameworks, with 19+ social login options at $69 monthly making it economically accessible for startups. Dynamic's acquisition by Fireblocks creates a unified custody-to-consumer offering combining institutional security with developer-friendly embedded wallets.

Gaming and metaverse applications benefit from specialized features. Web3Auth's Unity and Unreal Engine SDKs remain unique among major providers, critical for game developers working outside web frameworks. Particle Network's session keys enable gasless in-game transactions with user-authorized spending limits, while account abstraction batching allows complex multi-step game actions in single transactions. Consider gas sponsorship requirements carefully—game economies with high transaction frequencies require either Layer-2 deployment or substantial Paymaster budgets.

Multi-chain applications must evaluate architectural approaches. Curve-based universal support from Turnkey and Web3Auth automatically covers new chains at launch without provider integration dependencies, future-proofing against blockchain proliferation. Fireblocks' comprehensive individual integrations provide deeper chain-specific features like staking and DeFi protocol access. Particle Network's Universal Accounts represent the bleeding edge with true chain abstraction through coordination infrastructure, suitable for applications willing to integrate novel architectures for superior UX.

Regulatory compliance requirements vary drastically by business model. Custodial models trigger full VASP/CASP licensing across jurisdictions, requiring $2-5 million first-year compliance infrastructure investment and 12-24 month licensing timelines. Non-custodial approaches using MPC or smart contract wallets avoid most custody regulations but must carefully structure key control to maintain classification. Hybrid models require legal analysis for each jurisdiction, as determination depends on subtle implementation details around key recovery and backup procedures.

Cost considerations extend beyond transparent pricing to total cost of ownership. Transaction-based pricing creates unpredictable scaling costs for high-volume applications, while monthly active wallet pricing penalizes user growth. Evaluate provider lock-in risks through private key export capabilities and standard derivation path support enabling migration without user disruption. Infrastructure providers with vendor lock-in through proprietary key management create switching costs hindering future flexibility.

Developer experience factors compound over application lifetime. Integration time represents one-time cost, but SDK quality, documentation completeness, and support responsiveness impact ongoing development velocity. Web3Auth, Turnkey, and Dynamic receive consistent praise for documentation quality, while some providers require sales contact for basic integration questions. Active developer communities on GitHub, Discord, and Stack Overflow indicate ecosystem health and knowledge base availability.

Security certification requirements depend on customer expectations. SOC 2 Type II certification reassures enterprise buyers about operational controls and security practices, often required for procurement approval. ISO 27001/27017/27018 certifications demonstrate international security standard compliance. Regular third-party security audits from reputable firms like Trail of Bits, OpenZeppelin, or Consensys Diligence validate smart contract and infrastructure security. Insurance coverage for assets in storage and transit differentiates institutional-grade providers, with Fireblocks offering policies covering the digital asset lifecycle.

Future-proofing strategies require quantum readiness planning. While cryptographically-relevant quantum computers remain 10-20 years away, the "harvest now, decrypt later" threat model makes post-quantum planning urgent for long-lived assets. Evaluate providers' quantum resistance roadmaps and crypto-agile architectures enabling algorithm transitions without user disruption. Hardware wallet integrations supporting Dilithium or FALCON signatures future-proof high-value custody, while protocol participation in NIST standardization processes signals commitment to quantum readiness.

Account abstraction adoption timing represents a strategic decision. ERC-4337 and EIP-7702 provide production-ready infrastructure for gas sponsorship, social recovery, and session keys—features dramatically improving conversion rates and reducing support burden from lost access. However, smart account deployment costs and ongoing transaction overhead require careful cost-benefit analysis. Layer-2 deployment mitigates gas concerns while maintaining security properties, with Base, Arbitrum, and Optimism offering robust account abstraction infrastructure.

The WaaS landscape continues rapid evolution with consolidation around platform players building full-stack solutions. Stripe's Privy acquisition and vertical integration with Bridge stablecoins signal that Web2 payment giants recognize the criticality of crypto infrastructure. Fireblocks' Dynamic acquisition creates custody-to-consumer offerings competing with Coinbase's integrated approach. This consolidation favors providers with clear positioning—best-in-class institutional security, superior developer experience, or innovative chain abstraction—over undifferentiated middle-market players.

For builders deploying WaaS infrastructure in 2024-2025, prioritize providers with comprehensive account abstraction support, passwordless authentication roadmaps, multi-chain coverage through curve-based or abstraction architectures, and regulatory compliance frameworks matching your business model. The infrastructure has matured from experimental to production-grade, with proven implementations powering billions in transaction volume across gaming, DeFi, NFTs, and enterprise applications. The winners in Web3's next growth phase will be those leveraging WaaS to deliver Web2 user experiences powered by Web3's programmable money, composable protocols, and user-controlled digital assets.

Google’s Agent Payments Protocol (AP2)

· 34 min read
Dora Noda
Software Engineer

Google’s Agent Payments Protocol (AP2) is a newly announced open standard designed to enable secure, trustworthy transactions initiated by AI agents on behalf of users. Developed in collaboration with over 60 payments and technology organizations (including major payment networks, banks, fintechs, and Web3 companies), AP2 establishes a common language for “agentic” payments – i.e. purchases and financial transactions that an autonomous agent (such as an AI assistant or LLM-based agent) can carry out for a user. AP2’s creation is driven by a fundamental shift: traditionally, online payment systems assumed a human is directly clicking “buy,” but the rise of AI agents acting on user instructions breaks this assumption. AP2 addresses the resulting challenges of authorization, authenticity, and accountability in AI-driven commerce, while remaining compatible with existing payment infrastructure. This report examines AP2’s technical architecture, purpose and use cases, integrations with AI agents and payment providers, security and compliance considerations, comparisons to existing protocols, implications for Web3/decentralized systems, and the industry adoption/roadmap.

Technical Architecture: How AP2 Works

At its core, AP2 introduces a cryptographically secure transaction framework built on verifiable digital credentials (VDCs) – essentially tamper-proof, signed data objects that serve as digital “contracts” of what the user has authorized. In AP2 terminology these contracts are called Mandates, and they form an auditable chain of evidence for each transaction. There are three primary types of mandates in the AP2 architecture:

  • Intent Mandate: Captures the user’s initial instructions or conditions for a purchase, especially for “human-not-present” scenarios (where the agent will act later without the user online). It defines the scope of authority the user gives the agent – for example, “Buy concert tickets if they drop below $200, up to 2 tickets”. This mandate is cryptographically signed upfront by the user and serves as verifiable proof of consent within specific limits.
  • Cart Mandate: Represents the final transaction details that the user has approved, used in “human-present” scenarios or at the moment of checkout. It includes the exact items or services, their price, and other particulars of the purchase. When the agent is ready to complete the transaction (e.g. after filling a shopping cart), the merchant first cryptographically signs the cart contents (guaranteeing the order details and price), and then the user (via their device or agent interface) signs off to create a Cart Mandate. This ensures what-you-see-is-what-you-pay, locking in the final order exactly as presented to the user.
  • Payment Mandate: A separate credential that is sent to the payment network (e.g. card network or bank) to signal that an AI agent is involved in the transaction. The Payment Mandate includes metadata such as whether the user was present or not during authorization and serves as a flag for risk management systems. By providing the acquiring and issuing banks with cryptographically verifiable evidence of user intent, this mandate helps them assess the context (for example, distinguishing an agent-initiated purchase from typical fraud) and manage compliance or liability accordingly.

All mandates are implemented as verifiable credentials signed by the relevant party’s keys (user, merchant, etc.), yielding a non-repudiable audit trail for every agent-led transaction. In practice, AP2 uses a role-based architecture to protect sensitive information – for instance, an agent might handle an Intent Mandate without ever seeing raw payment details, which are only revealed in a controlled way when needed, preserving privacy. The cryptographic chain of user intent → merchant commitment → payment authorization establishes trust among all parties that the transaction reflects the user’s true instructions and that both the agent and merchant adhered to those instructions.
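To make the three mandates concrete, here are hypothetical TypeScript shapes modeled on the descriptions above. AP2 specifies mandates as verifiable digital credentials; the field names below are illustrative assumptions, not the normative schema.

```typescript
// Illustrative (non-normative) AP2 mandate shapes.
interface SignedMandate<T> {
  payload: T;
  signer: string;    // DID or key identifier of the signing party
  signature: string; // signature over the canonicalized payload
}

type IntentMandate = SignedMandate<{
  scope: string;         // e.g. "concert tickets"
  maxPriceUsd: number;   // e.g. 200
  maxQuantity: number;   // e.g. 2
  humanPresent: false;   // agent will act later, without the user online
  expiresAt: string;     // bound on the delegated authority
}>;

type CartMandate = SignedMandate<{
  merchantSignature: string; // merchant signs the cart before the user countersigns
  items: { sku: string; quantity: number; priceUsd: number }[];
  totalUsd: number;
}>;

type PaymentMandate = SignedMandate<{
  humanPresent: boolean; // risk signal for issuing/acquiring banks
  agentInvolved: true;   // flags agent participation to the payment network
  cartHash: string;      // binds the payment to the approved cart
}>;
```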

Transaction Flow: To illustrate how AP2 works end-to-end, consider a simple purchase scenario with a human in the loop:

  1. User Request: The user asks their AI agent to purchase a particular item or service (e.g. “Order this pair of shoes in my size”).
  2. Cart Construction: The agent communicates with the merchant’s systems (using standard APIs or via an agent-to-agent interaction) to assemble a shopping cart for the specified item at a given price.
  3. Merchant Guarantee: Before presenting the cart to the user, the merchant’s side cryptographically signs the cart details (item, quantity, price, etc.). This step creates a merchant-signed offer that guarantees the exact terms (preventing any hidden changes or price manipulation).
  4. User Approval: The agent shows the user the finalized cart. The user confirms the purchase, and this approval triggers two cryptographic signatures from the user’s side: one on the Cart Mandate (to accept the merchant’s cart as-is) and one on the Payment Mandate (to authorize payment through the chosen payment provider). These signed mandates are then shared with the merchant and the payment network respectively.
  5. Execution: Armed with the Cart Mandate and Payment Mandate, the merchant and payment provider proceed to execute the transaction securely. For example, the merchant submits the payment request along with the proof of user approval to the payment network (card network, bank, etc.), which can verify the Payment Mandate. The result is a completed purchase transaction with a cryptographic audit trail linking the user’s intent to the final payment.
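
A minimal sketch of the signature chain behind steps 3–5, again with assumed field names and plain Ed25519 signatures rather than AP2's actual credential format:

```python
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

merchant_key = Ed25519PrivateKey.generate()
user_key = Ed25519PrivateKey.generate()

# Step 3: the merchant signs the exact cart contents, locking the offer.
cart = {"item": "running shoes", "size": "10", "price_usd": 120}
cart_bytes = json.dumps(cart, sort_keys=True).encode()
merchant_sig = merchant_key.sign(cart_bytes)

# Step 4: the user countersigns the merchant-signed cart (Cart Mandate) ...
user_cart_sig = user_key.sign(cart_bytes + merchant_sig)

# ... and separately authorizes the payment (Payment Mandate).
payment_bytes = json.dumps(
    {"amount_usd": 120, "human_present": True}, sort_keys=True
).encode()
user_payment_sig = user_key.sign(payment_bytes)

# Step 5: merchant and payment network verify the chain before executing.
merchant_key.public_key().verify(merchant_sig, cart_bytes)              # cart unchanged
user_key.public_key().verify(user_cart_sig, cart_bytes + merchant_sig)  # user approved this exact cart
user_key.public_key().verify(user_payment_sig, payment_bytes)           # user authorized the payment
print("signature chain verified; transaction may proceed")
```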

This flow demonstrates how AP2 builds trust into each step of an AI-driven purchase. The merchant has cryptographic proof of exactly what the user agreed to buy at what price, and the issuer/bank has proof that the user authorized that payment, even though an AI agent facilitated the process. In case of disputes or errors, the signed mandates act as clear evidence, helping determine accountability (e.g. if the agent deviated from instructions or if a charge was not what the user approved). In essence, AP2’s architecture ensures that verifiable user intent – rather than trust in the agent’s behavior – is the basis of the transaction, greatly reducing ambiguity.

Purpose and Use Cases for AP2

Why AP2 is Needed: The primary purpose of AP2 is to solve emerging trust and security issues that arise when AI agents can spend money on behalf of users. Google and its partners identified several key questions that today’s payment infrastructure cannot adequately answer when an autonomous agent is in the loop:

  • Authorization: How to prove that a user actually gave the agent permission to make a specific purchase? (In other words, ensuring the agent isn’t buying things without the user’s informed consent.)
  • Authenticity: How can a merchant know that an agent’s purchase request is genuine and reflects the user’s true intent, rather than a mistake or AI hallucination?
  • Accountability: If a fraudulent or incorrect transaction occurs via an agent, who is responsible – the user, the merchant, the payment provider, or the creator of the AI agent?

Without a solution, these uncertainties create a “crisis of trust” around agent-led commerce. AP2’s mission is to provide that solution by establishing a uniform protocol for secure agent transactions. By introducing standardized mandates and proofs of intent, AP2 prevents a fragmented ecosystem of each company inventing its own ad-hoc agent payment methods. Instead, any compliant AI agent can interact with any compliant merchant/payment provider under a common set of rules and verifications. This consistency not only avoids user and merchant confusion, but also gives financial institutions a clear way to manage risk for agent-initiated payments, rather than dealing with a patchwork of proprietary approaches. In short, AP2’s purpose is to be a foundational trust layer that lets the “agent economy” grow without breaking the payments ecosystem.

Intended Use Cases: By solving the above issues, AP2 opens the door to new commerce experiences and use cases that go beyond what’s possible with a human manually clicking through purchases. Some examples of agent-enabled commerce that AP2 supports include:

  • Smarter Shopping: A customer can instruct their agent, “I want this winter jacket in green, and I’m willing to pay up to 20% above the current price for it”. Armed with an Intent Mandate encoding these conditions, the agent will continuously monitor retailer websites or databases. The moment the jacket becomes available in green (and within the price threshold), the agent automatically executes a purchase with a secure, signed transaction – capturing a sale that otherwise would have been missed. The entire interaction, from the user’s initial request to the automated checkout, is governed by AP2 mandates ensuring the agent only buys exactly what was authorized.
  • Personalized Offers: A user tells their agent they’re looking for a specific product (say, a new bicycle) from a particular merchant for an upcoming trip. The agent can share this interest (within the bounds of an Intent Mandate) with the merchant’s own AI agent, including relevant context like the trip date. The merchant agent, knowing the user’s intent and context, could respond with a custom bundle or discount – for example, “bicycle + helmet + travel rack at 15% off, available for the next 48 hours.” Using AP2, the user’s agent can accept and complete this tailored offer securely, turning a simple query into a more valuable sale for the merchant.
  • Coordinated Tasks: A user planning a complex task (e.g. a weekend trip) delegates it entirely: “Book me a flight and hotel for these dates with a total budget of $700.” The agent can interact with multiple service providers’ agents – airlines, hotels, travel platforms – to find a combination that fits the budget. Once a suitable flight-hotel package is identified, the agent uses AP2 to execute multiple bookings in one go, each cryptographically signed (for example, issuing separate Cart Mandates for the airline and the hotel, both authorized under the user’s Intent Mandate). AP2 ensures all parts of this coordinated transaction occur as approved, and even allows simultaneous execution so that tickets and reservations are booked together without risk of one part failing mid-way.

These scenarios illustrate just a few of AP2’s intended use cases. More broadly, AP2’s flexible design supports both conventional e-commerce flows and entirely new models of commerce. For instance, AP2 can facilitate subscription-like services (an agent keeps you stocked on essentials by purchasing when conditions are met), event-driven purchases (buying tickets or items the instant a trigger event occurs), group agent negotiations (multiple users’ agents pooling mandates to bargain for a group deal), and many other emerging patterns. In every case, the common thread is that AP2 provides the trust framework – clear user authorization and cryptographic auditability – that allows these agent-driven transactions to happen safely. By handling the trust and verification layer, AP2 lets developers and businesses focus on innovating new AI commerce experiences without re-inventing payment security from scratch.

Integration with Agents, LLMs, and Payment Providers

AP2 is explicitly designed to integrate seamlessly with AI agent frameworks and with existing payment systems, acting as a bridge between the two. Google has positioned AP2 as an extension of its Agent2Agent (A2A) protocol and of the Model Context Protocol (MCP). In other words, if A2A provides a generic language for agents to communicate tasks and MCP standardizes how AI models incorporate context/tools, then AP2 adds a transaction layer on top for commerce. The protocols are complementary: A2A handles agent-to-agent communication (allowing, say, a shopping agent to talk to a merchant’s agent), while AP2 handles agent-to-merchant payment authorization within those interactions. Because AP2 is open and non-proprietary, it’s meant to be framework-agnostic: developers can use it with Google’s own Agent Development Kit (ADK) or any AI agent library, and it can work with various AI models, including LLMs. An LLM-based agent, for example, could use AP2 by generating and exchanging the required mandate payloads (guided by the AP2 spec) instead of just free-form text. By enforcing a structured protocol, AP2 helps transform an AI agent’s high-level intent (which might come from an LLM’s reasoning) into concrete, secure transactions.

On the payments side, AP2 was built in concert with traditional payment providers and standards, rather than as a rip-and-replace system. The protocol is payment-method-agnostic, meaning it can support a variety of payment rails – from credit/debit card networks to bank transfers and digital wallets – as the underlying method for moving funds. In its initial version, AP2 emphasizes compatibility with card payments, since those are most common in online commerce. The AP2 Payment Mandate is designed to plug into the existing card processing flow: it provides additional data to the payment network (e.g. Visa, Mastercard, Amex) and issuing bank that an AI agent is involved and whether the user was present, thereby complementing existing fraud detection and authorization checks. Essentially, AP2 doesn’t process the payment itself; it augments the payment request with cryptographic proof of user intent. This allows payment providers to treat agent-initiated transactions with appropriate caution or speed (for example, an issuer might approve an unusual-looking purchase if it sees a valid AP2 mandate proving the user pre-approved it). Notably, Google and partners plan to evolve AP2 to support “push” payment methods as well – such as real-time bank transfers (like India’s UPI or Brazil’s PIX systems) – and other emerging digital payment types. This indicates AP2’s integration will expand beyond cards, aligning with modern payment trends worldwide.
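
For illustration only, the extra context a Payment Mandate supplies to the payment network might look like the following; every field name here is an assumption rather than the AP2 schema:

```python
# Illustrative metadata a Payment Mandate could carry alongside a normal card
# authorization. All field names are assumptions, not the AP2 schema.
payment_mandate_metadata = {
    "agent_involved": True,    # tells the issuer/acquirer an AI agent acted
    "human_present": False,    # "human-not-present" autonomous purchase
    "intent_mandate_ref": "sha256:<hash-of-signed-intent>",  # hypothetical link
}

def enrich_authorization(auth_request: dict, mandate_meta: dict) -> dict:
    """Attach agent context to an existing authorization payload, so fraud and
    risk systems gain signal without changing the underlying payment rail."""
    return {**auth_request, "ap2": mandate_meta}

auth = enrich_authorization(
    {"pan_token": "tok_123", "amount_usd": 185},  # hypothetical processor fields
    payment_mandate_metadata,
)
```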

For merchants and payment processors, integrating AP2 would mean supporting the additional protocol messages (mandates) and verifying signatures. Many large payment platforms are already involved in shaping AP2, so we can expect them to build support for it. For example, companies like Adyen, Worldpay, and PayPal (and likely Stripe, though it was not explicitly named in the announcement) could incorporate AP2 into their checkout APIs or SDKs, allowing an agent to initiate a payment in a standardized way. Because AP2 is an open specification on GitHub with reference implementations, payment providers and tech platforms can start experimenting with it immediately. Google has also mentioned an AI Agent Marketplace where third-party agents can be listed; these agents are expected to support AP2 for any transactional capabilities. In practice, an enterprise that builds an AI sales assistant or procurement agent could list it on this marketplace, and thanks to AP2, that agent could carry out purchases or orders reliably.

Finally, AP2’s integration story benefits from its broad industry backing. By co-developing the protocol with major financial institutions and tech firms, Google ensured AP2 aligns with existing industry rules and compliance requirements. The collaboration with payment networks (e.g. Mastercard, UnionPay), issuers (e.g. American Express), fintechs (e.g. Revolut, PayPal), e-commerce players (e.g. Etsy), and even identity/security providers (e.g. Okta, Cloudflare) suggests AP2 is being designed to slot into real-world systems with minimal friction. These stakeholders bring expertise in areas like KYC (Know Your Customer) regulations, fraud prevention, and data privacy, helping AP2 address those needs out of the box. In summary, AP2 is built to be agent-friendly and payment-provider-friendly: it extends existing AI agent protocols to handle transactions, and it layers on top of existing payment networks to utilize their infrastructure while adding necessary trust guarantees.

Security, Compliance, and Interoperability Considerations

Security and trust are at the heart of AP2’s design. The protocol’s use of cryptography (digital signatures on mandates) ensures that every critical action in an agentic transaction is verifiable and traceable. This non-repudiation is crucial: neither the user nor merchant can later deny what was authorized and agreed upon, since the mandates serve as secure records. A direct benefit is in fraud prevention and dispute resolution – with AP2, if a malicious or buggy agent attempts an unauthorized purchase, the lack of a valid user-signed mandate would be evident, and the transaction can be declined or reversed. Conversely, if a user claims “I never approved this purchase,” but a Cart Mandate exists with their cryptographic signature, the merchant and issuer have strong evidence to support the charge. This clarity of accountability answers a major compliance concern for the payments industry.

Authorization & Privacy: AP2 enforces an explicit authorization step (or steps) from the user for agent-led transactions, which aligns with regulatory trends like strong customer authentication. The User Control principle baked into AP2 means an agent cannot spend funds unless the user (or someone delegated by the user) has provided a verifiable instruction to do so. Even in fully autonomous scenarios, the user predefines the rules via an Intent Mandate. This approach can be seen as analogous to giving a power-of-attorney to the agent for specific transactions, but in a digitally signed, fine-grained manner. From a privacy perspective, AP2 is mindful about data sharing: the protocol uses a role-based data architecture to ensure that sensitive info (like payment credentials or personal details) is only shared with parties that absolutely need it. For example, an agent might send a Cart Mandate to a merchant containing item and price info, but the user’s actual card number might only be shared through the Payment Mandate with the payment processor, not with the agent or merchant. This minimizes unnecessary exposure of data, aiding compliance with privacy laws and PCI-DSS rules for handling payment data.

Compliance & Standards: Because AP2 was developed with input from established financial entities, it has been designed to meet or complement existing compliance standards in payments. The protocol doesn’t bypass the usual payment authorization flows – instead, it augments them with additional evidence and flags. This means AP2 transactions can still leverage fraud detection systems, 3-D Secure checks, or any other required regulatory checks, with AP2’s mandates acting as extra authentication factors or context cues. For instance, a bank could treat a Payment Mandate akin to a customer’s digital signature on a transaction, potentially streamlining compliance with user-consent requirements. Additionally, AP2’s designers explicitly mention working “in concert with industry rules and standards.” We can infer that as AP2 evolves, it may be brought to formal standards bodies (such as the W3C, EMVCo, or ISO) to ensure it aligns with global financial standards. Google has stated its commitment to an open, collaborative evolution of AP2, possibly through standards organizations. This open process will help iron out regulatory concerns and achieve broad acceptance, much as previous payment standards (EMV chip cards, 3-D Secure, etc.) emerged from industry-wide collaboration.

Interoperability: Avoiding fragmentation is a key goal of AP2. To that end, the protocol is openly published and made available for anyone to implement or integrate. It is not tied to Google Cloud services – in fact, AP2 is open-source (Apache 2.0 licensed), and the specification and reference code are on a public GitHub repository. This encourages interoperability because multiple vendors can adopt AP2 and still have their systems work together. The interoperability principle is explicit: AP2 is an extension of existing open protocols (A2A, MCP) and is non-proprietary, meaning it fosters a competitive ecosystem of implementations rather than a single-vendor solution. In practical terms, an AI agent built by Company A could initiate a transaction with a merchant system from Company B if both follow AP2 – neither side is locked into one platform.

One possible concern is ensuring consistent adoption: if some major players were to choose a different protocol or a closed approach, fragmentation could still occur. However, given the broad coalition behind AP2, it appears poised to become a de facto standard. The inclusion of many identity- and security-focused firms (for example, Okta, Cloudflare, Ping Identity) in the AP2 ecosystem suggests that interoperability and security are being jointly addressed. (Figure: over 60 companies across finance, tech, and crypto are collaborating on AP2; partial list of partners.) These partners can help integrate AP2 into identity verification workflows and fraud prevention tools, ensuring that an AP2 transaction can be trusted across systems.

From a technology standpoint, AP2’s use of widely accepted cryptographic techniques (likely JSON-LD or JWT-based verifiable credentials, public-key signatures, etc.) makes it compatible with existing security infrastructure. Organizations can use their existing PKI (Public Key Infrastructure) to manage keys for signing mandates. AP2 also seems to anticipate integration with decentralized identity systems: Google mentions that AP2 creates opportunities to innovate in areas like decentralized identity for agent authorization. In the future, AP2 could therefore leverage W3C DID (Decentralized Identifier) standards to identify agents and users in a trusted way. Such an approach would further enhance interoperability by not relying on any single identity provider. In summary, AP2 emphasizes security through cryptography and clear accountability, aims to be compliance-ready by design, and promotes interoperability through its open-standard nature and broad industry support.

Comparison with Existing Protocols

AP2 is a novel protocol addressing a gap that existing payment and agent frameworks have not covered: enabling autonomous agents to perform payments in a secure, standardized manner. In terms of agent communication protocols, AP2 builds on prior work like the Agent2Agent (A2A) protocol. A2A (open-sourced earlier in 2025) allows different AI agents to talk to each other regardless of their underlying frameworks. However, A2A by itself doesn’t define how agents should conduct transactions or payments – it’s more about task negotiation and data exchange. AP2 extends this landscape by adding a transaction layer that any agent can use when a conversation leads to a purchase. In essence, AP2 can be seen as complementary to A2A and MCP, rather than overlapping: A2A covers the communication and collaboration aspects, MCP covers using external tools/APIs, and AP2 covers payments and commerce. Together, they form a stack of standards for a future “agent economy.” This modular approach is somewhat analogous to internet protocols: for example, HTTP for data communication and SSL/TLS for security – here A2A might be like the HTTP of agents, and AP2 the secure transactional layer on top for commerce.

When comparing AP2 to traditional payment protocols and standards, there are both parallels and differences. Traditional online payments (credit card checkouts, PayPal transactions, etc.) typically involve protocols like HTTPS for secure transmission, and standards like PCI DSS for handling card data, plus possibly 3-D Secure for additional user authentication. These assume a user-driven flow (user clicks and perhaps enters a one-time code). AP2, by contrast, introduces a way for a third-party (the agent) to participate in the flow without undermining security. One could compare AP2’s mandate concept to an extension of OAuth-style delegated authority, but applied to payments. In OAuth, a user can grant an application limited access to an account via tokens; similarly in AP2, a user grants an agent authority to spend under certain conditions via mandates. The key difference is that AP2’s “tokens” (mandates) are specific, signed instructions for financial transactions, which is more fine-grained than existing payment authorizations.

Another point of comparison is how AP2 relates to existing e-commerce checkout flows. For instance, many e-commerce sites use protocols like the W3C Payment Request API or platform-specific SDKs to streamline payments. Those mainly standardize how browsers or apps collect payment info from a user, whereas AP2 standardizes how an agent would prove user intent to a merchant and payment processor. AP2’s focus on verifiable intent and non-repudiation sets it apart from simpler payment APIs. It’s adding an additional layer of trust on top of the payment networks. One could say AP2 is not replacing the payment networks (Visa, ACH, blockchain, etc.), but rather augmenting them. The protocol explicitly supports all types of payment methods (even crypto), so it is more about standardizing the agent’s interaction with these systems, not creating a new payment rail from scratch.

In the realm of security and authentication protocols, AP2 shares some spirit with things like digital signatures in EMV chip cards or the notarization in digital contracts. For example, EMV chip card transactions generate cryptograms to prove the card was present; AP2 generates cryptographic proof that the user’s agent was authorized. Both aim to prevent fraud, but AP2’s scope is the agent-user relationship and agent-merchant messaging, which no existing payment standard addresses. Another emerging comparison is with account abstraction in crypto (e.g. ERC-4337) where users can authorize pre-programmed wallet actions. Crypto wallets can be set to allow certain automated transactions (like auto-paying a subscription via a smart contract), but those are typically confined to one blockchain environment. AP2, on the other hand, aims to be cross-platform – it can leverage blockchain for some payments (through its extensions) but also works with traditional banks.

There isn’t a direct “competitor” protocol to AP2 in the mainstream payments industry yet – it appears to be the first concerted effort at an open standard for AI-agent payments. Proprietary attempts may arise (or may already be in progress within individual companies), but AP2’s broad support gives it an edge in becoming the standard. It’s worth noting that IBM and others have an Agent Communication Protocol (ACP) and similar initiatives for agent interoperability, but those don’t encompass the payment aspect in the comprehensive way AP2 does. If anything, AP2 might integrate with or leverage those efforts (for example, IBM’s agent frameworks could implement AP2 for any commerce tasks).

In summary, AP2 distinguishes itself by targeting the unique intersection of AI and payments: where older payment protocols assumed a human user, AP2 assumes an AI intermediary and fills the trust gap that results. It extends, rather than conflicts with, existing payment processes, and complements existing agent protocols like A2A. Going forward, one might see AP2 used alongside established standards – for instance, an AP2 Cart Mandate might work in tandem with a traditional payment gateway API call, or an AP2 Payment Mandate might be attached to an ISO 8583 message in banking. The open nature of AP2 also means that if alternative approaches emerge, AP2 could potentially absorb or align with them through community collaboration. At this stage, AP2 is setting a baseline that did not exist before, effectively pioneering a new protocol layer in the AI and payments stack.

Implications for Web3 and Decentralized Systems

From the outset, AP2 has been designed to be inclusive of Web3 and cryptocurrency-based payments. The protocol recognizes that future commerce will span both traditional fiat channels and decentralized blockchain networks. As noted earlier, AP2 supports payment types ranging from credit cards and bank transfers to stablecoins and cryptocurrencies. In fact, alongside AP2’s launch, Google announced a specific extension for crypto payments called A2A x402. This extension, developed in collaboration with crypto-industry players like Coinbase, the Ethereum Foundation, and MetaMask, is a “production-ready solution for agent-based crypto payments”. The name “x402” is an homage to the HTTP 402 “Payment Required” status code, which was never widely used on the Web – AP2’s crypto extension effectively revives the spirit of HTTP 402 for decentralized agents that want to charge or pay each other on-chain. In practical terms, the x402 extension adapts AP2’s mandate concept to blockchain transactions. For example, an agent could hold a signed Intent Mandate from a user and then execute an on-chain payment (say, send a stablecoin) once conditions are met, attaching proof of the mandate to that on-chain transaction. This marries the AP2 off-chain trust framework with the trustless nature of blockchain, giving the best of both worlds: an on-chain payment that off-chain parties (users, merchants) can trust was authorized by the user.

The synergy between AP2 and Web3 is evident in the list of collaborators. Crypto exchanges (Coinbase), blockchain foundations (Ethereum Foundation), crypto wallets (MetaMask), and Web3 startups (e.g. Mysten Labs of Sui, Lightspark for Lightning Network) are involved in AP2’s development. Their participation suggests AP2 is viewed as complementary to decentralized finance rather than competitive. By creating a standard way for AI agents to interact with crypto payments, AP2 could drive more usage of crypto in AI-driven applications. For instance, an AI agent might use AP2 to seamlessly swap between paying with a credit card or paying with a stablecoin, depending on user preference or merchant acceptance. The A2A x402 extension specifically allows agents to monetize or pay for services through on-chain means, which could be crucial in decentralized marketplaces of the future. It hints at agents possibly running as autonomous economic actors on blockchain (a concept some refer to as DACs or DAOs) being able to handle payments required for services (like paying a small fee to another agent for information). AP2 could provide the lingua franca for such transactions, ensuring even on a decentralized network, the agent has a provable mandate for what it’s doing.

In terms of competition, one could ask: do purely decentralized solutions make AP2 unnecessary, or vice-versa? It’s likely that AP2 will coexist with Web3 solutions in a layered approach. Decentralized finance offers trustless execution (smart contracts, etc.), but it doesn’t inherently solve the problem of “Did an AI have permission from a human to do this?”. AP2 addresses that very human-to-AI trust link, which remains important even if the payment itself is on-chain. Rather than competing with blockchain protocols, AP2 can be seen as bridging them with the off-chain world. For example, a smart contract might accept a certain transaction only if it includes a reference to a valid AP2 mandate signature – something that could be implemented to combine off-chain intent proof with on-chain enforcement. Conversely, if there are crypto-native agent frameworks (some blockchain projects explore autonomous agents that operate with crypto funds), they might develop their own methods for authorization. AP2’s broad industry support, however, might steer even those projects to adopt or integrate with AP2 for consistency.

Another angle is decentralized identity and credentials. AP2’s use of verifiable credentials is very much in line with Web3’s approach to identity (e.g. DIDs and VCs as standardized by W3C). This means AP2 could plug into decentralized identity systems – for instance, a user’s DID could be used to sign an AP2 mandate, which a merchant could verify against a blockchain or identity hub. The mention of exploring decentralized identity for agent authorization reinforces that AP2 may leverage Web3 identity innovations for verifying agent and user identities in a decentralized way, rather than relying only on centralized authorities. This is a point of synergy, as both AP2 and Web3 aim to give users more control and cryptographic proof of their actions.

Potential conflicts might arise only if one envisions a fully decentralized commerce ecosystem with no role for large intermediaries – in that scenario, could AP2 (initially pushed by Google and partners) be too centralized or governed by traditional players? It’s important to note AP2 is open source and intended to be standardizable, so it’s not proprietary to Google. This makes it more palatable to the Web3 community, which values open protocols. If AP2 becomes widely adopted, it might reduce the need for separate Web3-specific payment protocols for agents, thereby unifying efforts. On the other hand, some blockchain projects might prefer purely on-chain authorization mechanisms (like multi-signature wallets or on-chain escrow logic) for agent transactions, especially in trustless environments without any centralized authorities. Those could be seen as alternative approaches, but they likely would remain niche unless they can interact with off-chain systems. AP2, by covering both worlds, might actually accelerate Web3 adoption by making crypto just another payment method an AI agent can use seamlessly. Indeed, one partner noted that “stablecoins provide an obvious solution to scaling challenges [for] agentic systems with legacy infrastructure”, highlighting that crypto can complement AP2 in handling scale or cross-border scenarios. Meanwhile, Coinbase’s engineering lead remarked that bringing the x402 crypto extension into AP2 “made sense – it’s a natural playground for agents... exciting to see agents paying each other resonate with the AI community”. This implies a vision where AI agents transacting via crypto networks is not just a theoretical idea but an expected outcome, with AP2 acting as a catalyst.

In summary, AP2 is highly relevant to Web3: it incorporates crypto payments as a first-class citizen and is aligning with decentralized identity and credential standards. Rather than competing head-on with decentralized payment protocols, AP2 likely interoperates with them – providing the authorization layer while the decentralized systems handle the value transfer. As the line between traditional finance and crypto blurs (with stablecoins, CBDCs, etc.), a unified protocol like AP2 could serve as a universal adapter between AI agents and any form of money, centralized or decentralized.

Industry Adoption, Partnerships, and Roadmap

One of AP2’s greatest strengths is the extensive industry backing behind it, even at this early stage. Google Cloud announced that it is “collaborating with a diverse group of more than 60 organizations” on AP2. These include major card networks (e.g. Mastercard, American Express, JCB, UnionPay), leading fintechs and payment processors (e.g. PayPal, Worldpay, Adyen, and Checkout.com, several of them Stripe competitors), e-commerce and online marketplaces (e.g. Etsy, Lazada, Zalora, and potentially Shopify through its payment partners), enterprise tech companies (e.g. Salesforce, ServiceNow, Dell, Red Hat, and possibly Oracle via partners), identity and security firms (e.g. Okta, Ping Identity, Cloudflare), consulting firms (e.g. Deloitte, Accenture), and crypto/Web3 organizations (e.g. Coinbase, Ethereum Foundation, MetaMask, Mysten Labs, Lightspark), among others. Such a wide array of participants is a strong indicator of industry interest and likely adoption. Many of these partners have publicly voiced support. For example, Adyen’s Co-CEO highlighted the need for a “common rulebook” for agentic commerce and sees AP2 as a natural extension of their mission to support merchants with new payment building blocks. American Express’s EVP stated that AP2 is important for “the next generation of digital payments,” where trust and accountability are paramount. Coinbase’s team, as noted, is excited about integrating crypto payments into AP2. This chorus of support shows that many in the industry view AP2 as the likely standard for AI-driven payments, and they are keen to shape it to ensure it meets their requirements.

From an adoption standpoint, AP2 is currently at the specification and early implementation stage (announced in September 2025). The complete technical spec, documentation, and some reference implementations (in languages like Python) are available on the project’s GitHub for developers to experiment with. Google has also indicated that AP2 will be incorporated into its products and services for agents. A notable example is the AI Agent Marketplace mentioned earlier: this is a platform where third-party AI agents can be offered to users (likely part of Google’s generative AI ecosystem). Google says many partners building agents will make them available in the marketplace with “new, transactable experiences enabled by AP2”. This implies that as the marketplace launches or grows, AP2 will be the backbone for any agent that needs to perform a transaction, whether it’s buying software from the Google Cloud Marketplace autonomously or an agent purchasing goods/services for a user. Enterprise use cases like autonomous procurement (one agent buying from another on behalf of a company) and automatic license scaling have been specifically mentioned as areas AP2 could facilitate soon.

In terms of a roadmap, the AP2 documentation and Google’s announcement give some clear indications:

  • Near-term: Continue open development of the protocol with community input. The GitHub repo will be updated with additional reference implementations and improvements as real-world testing happens. We can expect libraries/SDKs to emerge, making it easier to integrate AP2 into agent applications. Also, initial pilot programs or proofs-of-concept might be conducted by the partner companies. Given that many large payment companies are involved, they might trial AP2 in controlled environments (e.g., an AP2-enabled checkout option in a small user beta).
  • Standards and Governance: Google has expressed a commitment to move AP2 into an open governance model, possibly via standards bodies. This could mean submitting AP2 to organizations like the Linux Foundation (as was done with the A2A protocol) or forming a consortium to maintain it. The Linux Foundation, W3C, or even bodies like ISO/TC68 (financial services) might be in the cards for formalizing AP2. An open governance would reassure the industry that AP2 is not under single-company control and will remain neutral and inclusive.
  • Feature Expansion: Technically, the roadmap includes expanding support to more payment types and use cases. As noted in the spec, after cards, the focus will shift to “push” payments like bank wires and local real-time payment schemes, and digital currencies. This means AP2 will outline how an Intent/Cart/Payment Mandate works for, say, a direct bank transfer or a crypto wallet transfer, where the flow is a bit different than card pulls. The A2A x402 extension is one such expansion for crypto; similarly, we might see an extension for open banking APIs or one for B2B invoicing scenarios.
  • Security & Compliance Enhancements: As real transactions start flowing through AP2, there will be scrutiny from regulators and security researchers. The open process will likely iterate on making mandates even more robust (e.g., ensuring mandate formats are standardized, possibly using W3C Verifiable Credentials format, etc.). Integration with identity solutions (perhaps leveraging biometrics for user signing of mandates, or linking mandates to digital identity wallets) could be part of the roadmap to enhance trust.
  • Ecosystem Tools: An emerging ecosystem is likely. Already, startups are noticing gaps – for instance, the Vellum.ai analysis mentions a startup called Autumn building “billing infrastructure for AI,” essentially tooling on top of Stripe to handle complex pricing for AI services. As AP2 gains traction, we can expect more tools like agent-focused payment gateways, mandate management dashboards, agent identity verification services, etc., to appear. Google’s involvement means AP2 could also be integrated into its Cloud products – imagine AP2 support in Dialogflow or Vertex AI Agents tooling, making it one-click to enable an agent to handle transactions (with all the necessary keys and certificates managed in Google Cloud).

Overall, the trajectory of AP2 is reminiscent of other major industry standards: an initial launch with a strong sponsor (Google), broad industry coalition, open-source reference code, followed by iterative improvement and gradual adoption in real products. The fact that AP2 is inviting all players “to build this future with us” underscores that the roadmap is about collaboration. If the momentum continues, AP2 could become as commonplace in a few years as protocols like OAuth or OpenID Connect are today in their domains – an unseen but critical layer enabling functionality across services.

Conclusion

AP2 (the Agent Payments Protocol) represents a significant step toward a future where AI agents can transact as reliably and securely as humans. Technically, it introduces a clever mechanism of verifiable mandates and credentials that instills trust in agent-led transactions, ensuring user intent is explicit and enforceable. Its open, extensible architecture allows it to integrate both with the burgeoning AI agent frameworks and with established financial infrastructure. By addressing the core concerns of authorization, authenticity, and accountability, AP2 lays the groundwork for AI-driven commerce to flourish without sacrificing security or user control.

The introduction of AP2 can be seen as laying a new foundation – much like early internet protocols enabled the web – for what some call the “agent economy.” It paves the way for countless innovations: personal shopper agents, automatic deal-finding bots, autonomous supply chain agents, and more, all operating under a common trust framework. Importantly, AP2’s inclusive design (embracing everything from credit cards to crypto) positions it at the intersection of traditional finance and Web3, potentially bridging these worlds through a common agent-mediated protocol.

Industry response so far has been very positive, with a broad coalition signaling that AP2 is likely to become a widely adopted standard. The success of AP2 will depend on continued collaboration and real-world testing, but its prospects are strong given the clear need it addresses. In a broader sense, AP2 exemplifies how technology evolves: a new capability (AI agents) emerged that broke old assumptions, and the solution was to develop a new open standard to accommodate that capability. By investing in an open, security-first protocol now, Google and its partners are effectively building the trust architecture required for the next era of commerce. As the saying goes, “the best way to predict the future is to build it” – AP2 is a bet on a future where AI agents seamlessly handle transactions for us, and it is actively constructing the trust and rules needed to make that future viable.

Sources:

  • Google Cloud Blog – “Powering AI commerce with the new Agent Payments Protocol (AP2)” (Sept 16, 2025)
  • AP2 GitHub Documentation – “Agent Payments Protocol Specification and Overview”
  • Vellum AI Blog – “Google’s AP2: A new protocol for AI agent payments” (Analysis)
  • Medium Article – “Google Agent Payments Protocol (AP2)” (Summary by Tahir, Sept 2025)
  • Partner Quotes on AP2 (Google Cloud Blog)
  • A2A x402 Extension (AP2 crypto payments extension) – GitHub README

Digital Asset Custody for Low‑Latency, Secure Trade Execution at Scale

· 10 min read
Dora Noda
Software Engineer

How to design a custody and execution stack that moves at market speed without compromising on risk, audit, or compliance.


Executive Summary

Custody and trading can no longer operate in separate worlds. In today's digital asset markets, holding client assets securely is only half the battle. If you can’t execute trades in milliseconds when prices move, you are leaving returns on the table and exposing clients to avoidable risks like Maximal Extractable Value (MEV), counterparty failures, and operational bottlenecks. A modern custody and execution stack must blend cutting-edge security with high-performance engineering. This means integrating technologies like Multi-Party Computation (MPC) and Hardware Security Modules (HSMs) for signing, using policy engines and private transaction routing to mitigate front-running, and leveraging active/active infrastructure with off-exchange settlement to reduce venue risk and boost capital efficiency. Critically, compliance can't be a bolt-on; features like Travel Rule data flows, immutable audit logs, and controls mapped to frameworks like SOC 2 must be built directly into the transaction pipeline.


Why “Custody Speed” Matters Now

Historically, digital asset custodians optimized for one primary goal: don’t lose the keys. While that remains fundamental, the demands have evolved. Today, best execution and market integrity are equally non-negotiable. When your trades travel through public mempools, sophisticated actors can see them, reorder them, or "sandwich" them to extract profit at your expense. This is MEV in action, and it directly impacts execution quality. Keeping sensitive order flow out of public view by using private transaction relays is a powerful way to reduce this exposure.

At the same time, venue risk is a persistent concern. Concentrating large balances on a single exchange creates significant counterparty risk. Off-exchange settlement networks provide a solution, allowing firms to trade with exchange-provided credit while their assets remain in segregated, bankruptcy-remote custody. This model vastly improves both safety and capital efficiency.

Regulators are also closing the gaps. The enforcement of the Financial Action Task Force (FATF) Travel Rule and recommendations from bodies like IOSCO and the Financial Stability Board are pushing digital asset markets toward a "same-risk, same-rules" framework. This means custody platforms must be built from the ground up with compliant data flows and auditable controls.


Design Goals (What “Good” Looks Like)

A high-performance custody stack should be built around a few core design principles:

  • Latency you can budget: Every millisecond from client intent to network broadcast must be measured, managed, and enforced with strict Service Level Objectives (SLOs).
  • MEV-resilient execution: Sensitive orders should be routed through private channels by default. Exposure to the public mempool should be an intentional choice, not an unavoidable default.
  • Key material with real guarantees: Private keys must never leave their protected boundaries, whether they are distributed across MPC shards, stored in HSMs, or isolated in Trusted Execution Environments (TEEs). Key rotation, quorum enforcement, and robust recovery procedures are table stakes.
  • Active/active reliability: The system must be resilient to failure. This requires multi-region and multi-provider redundancy for both RPC nodes and signers, complemented by automated circuit breakers and kill-switches for venue and network incidents.
  • Compliance-by-construction: Compliance cannot be an afterthought. The architecture must have built-in hooks for Travel Rule data, AML/KYT checks, and immutable audit trails, with all controls mapped directly to recognized frameworks like the SOC 2 Trust Services Criteria.

A Reference Architecture

The reference architecture below outlines, at a high level, a custody and execution platform that meets these goals (a minimal sketch of the policy-gate step follows the component list).

  • The Policy & Risk Engine is the central gatekeeper for every instruction. It evaluates everything—Travel Rule payloads, velocity limits, address risk scores, and signer quorum requirements—before any key material is accessed.
  • The Signer Orchestrator intelligently routes signing requests to the most appropriate control plane for the asset and policy. This could be:
    • MPC (Multi-Party Computation) using threshold signature schemes (like t-of-n ECDSA/EdDSA) to distribute trust across multiple parties or devices.
    • HSMs (Hardware Security Modules) for hardware-enforced key custody with deterministic backup and rotation policies.
    • Trusted Execution Environments (e.g., AWS Nitro Enclaves) to isolate signing code and bind keys directly to attested, measured software.
  • The Execution Router sends transactions on the optimal path. It prefers private transaction submission for large or information-sensitive orders to avoid front-running. It falls back to public submission when needed, using multi-provider RPC failover to maintain high availability even during network brownouts.
  • The Observability Layer provides a real-time view of the system's state. It watches the mempool and new blocks via subscriptions, reconciles executed trades against internal records, and commits immutable audit records for every decision, signature, and broadcast.
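
To make the gatekeeper role concrete, here is a minimal sketch of a policy gate as a pure function that must return no violations before the Signer Orchestrator is ever invoked. The check names, allowlist entries, and limits are illustrative assumptions, not a production rule set:

```python
# A minimal sketch of the policy-gate pattern: every instruction must pass all
# checks before any signer is invoked. All values below are illustrative.
from dataclasses import dataclass

@dataclass
class TransferIntent:
    destination: str
    amount_usd: float
    travel_rule_payload: dict | None

ALLOWLIST = {"0xabc...", "0xdef..."}  # hypothetical allowlisted addresses
VELOCITY_LIMIT_USD = 250_000          # illustrative per-instruction limit

def policy_gate(intent: TransferIntent) -> list[str]:
    """Return the list of violations; empty means the intent may go to signing."""
    violations = []
    if intent.destination not in ALLOWLIST:
        violations.append("destination not allowlisted (deny-by-default)")
    if intent.amount_usd > VELOCITY_LIMIT_USD:
        violations.append("velocity limit exceeded")
    if intent.travel_rule_payload is None:
        violations.append("missing Travel Rule originator/beneficiary data")
    return violations

intent = TransferIntent("0xabc...", 50_000, {"originator": "...", "beneficiary": "..."})
if policy_gate(intent):
    raise PermissionError("policy violations; refusing to sign")
# Only now would the Signer Orchestrator be asked to produce a signature.
```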

Security Building Blocks (and Why They Matter)

  • Threshold Signatures (MPC): This technology distributes control over a private key so that no single machine—or person—can unilaterally move funds. Modern MPC protocols can implement fast, maliciously secure signing that is suitable for production latency budgets.
  • HSMs and FIPS Alignment: HSMs enforce key boundaries with tamper-resistant hardware and documented security policies. Aligning with standards like FIPS 140-3 and NIST SP 800-57 provides auditable, widely understood security guarantees.
  • Attested TEEs: Trusted Execution Environments bind keys to specific, measured code running in isolated enclaves. Using a Key Management Service (KMS), you can create policies that only release key material to these attested workloads, ensuring that only approved code can sign.
  • Private Relays for MEV Protection: These services allow you to ship sensitive transactions directly to block builders or validators, bypassing the public mempool. This dramatically reduces the risk of front-running and other forms of MEV.
  • Off-Exchange Settlement: This model allows you to hold collateral in segregated custody while trading on centralized venues. It limits counterparty exposure, accelerates net settlement, and frees up capital.
  • Controls Mapped to SOC 2/ISO: Documenting and testing your operational controls against recognized frameworks allows customers, auditors, and partners to trust—and independently verify—your security and compliance posture.

Latency Playbook: Where the Milliseconds Go

To achieve low-latency execution, you need to optimize every step of the transaction lifecycle:

  • Intent → Policy Decision: Keep policy evaluation logic hot in memory. Cache Know-Your-Transaction (KYT) and allowlist data with short, bounded Time-to-Live (TTL) values, and pre-compute signer quorums where possible.
  • Signing: Use persistent MPC sessions and HSM key handles to avoid the overhead of cold starts. For TEEs, pin the enclaves, warm their attestation paths, and reuse session keys where it is safe to do so.
  • Broadcast: Prefer persistent WebSocket connections to RPC nodes over HTTP. Co-locate your execution services with your primary RPC providers' regions. When latency spikes, retry idempotently and hedge broadcasts across multiple providers.
  • Confirmation: Instead of polling for transaction status, subscribe to receipts and events directly from the network. Stream these state changes into a reconciliation pipeline for immediate user feedback and internal bookkeeping.

Set strict SLOs for each hop (e.g., policy check <20ms, signing <50–100ms, broadcast <50ms under normal load) and enforce them with error budgets and automated failover when p95 or p99 latencies degrade.
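
One way to make these budgets operational is to record per-hop latencies and trip failover when a hop's p95 exceeds its SLO. A minimal sketch, assuming the example thresholds above:

```python
# Per-hop latency budgets with p95 enforcement. Hop names and millisecond
# thresholds are taken from the illustrative SLOs above.
import statistics
import time
from collections import defaultdict

SLO_MS = {"policy": 20, "signing": 100, "broadcast": 50}
samples = defaultdict(list)

def timed_hop(name: str):
    """Decorator recording wall-clock latency (ms) for one pipeline hop."""
    def wrap(fn):
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                samples[name].append((time.perf_counter() - start) * 1000)
        return inner
    return wrap

def p95(values: list[float]) -> float:
    return statistics.quantiles(values, n=20)[-1] if len(values) >= 20 else max(values)

def slo_breached(name: str) -> bool:
    """True when a hop's p95 exceeds its budget (the failover trigger)."""
    return bool(samples[name]) and p95(samples[name]) > SLO_MS[name]

@timed_hop("policy")
def evaluate_policy(intent: dict) -> bool:
    ...  # allowlist, velocity, and Travel Rule checks would run here
    return True
```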


Risk & Compliance by Design

A modern custody stack must treat compliance as an integral part of the system, not an add-on.

  • Travel Rule Orchestration: Generate and validate originator and beneficiary data in-line with every transfer instruction. Automatically block or detour transactions involving unknown Virtual Asset Service Providers (VASPs) and log cryptographic receipts of every data exchange for audit purposes.
  • Address Risk & Allowlists: Integrate on-chain analytics and sanctions screening lists directly into the policy engine. Enforce a deny-by-default posture, where transfers are only permitted to explicitly allowlisted addresses or under specific policy exceptions.
  • Immutable Audit: Hash every request, approval, signature, and broadcast into an append-only ledger (a minimal sketch follows this list). This creates a tamper-evident audit trail that can be streamed to a SIEM for real-time threat detection and provided to auditors for control testing.
  • Control Framework: Map every technical and operational control to the SOC 2 Trust Services Criteria (Security, Availability, Processing Integrity, Confidentiality, and Privacy) and implement a program of continuous testing and validation.
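
The append-only ledger itself can be as simple as a hash chain: each record commits to its predecessor, so any retroactive edit breaks verification. A minimal sketch with illustrative event fields:

```python
# Tamper-evident audit log: each record's hash covers the previous record's
# hash, so editing or removing any entry invalidates the chain.
import hashlib
import json
import time

class AuditLog:
    def __init__(self):
        self.entries = []
        self._head = "0" * 64  # genesis hash

    def append(self, event: dict) -> str:
        record = {"ts": time.time(), "event": event, "prev": self._head}
        digest = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
        self.entries.append((digest, record))
        self._head = digest
        return digest

    def verify(self) -> bool:
        prev = "0" * 64
        for digest, record in self.entries:
            if record["prev"] != prev:
                return False  # chain broken: an entry was altered or removed
            if hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest() != digest:
                return False  # record contents no longer match their hash
            prev = digest
        return True

log = AuditLog()
log.append({"type": "approval", "tx": "transfer-123", "approver": "ops-lead"})
log.append({"type": "signature", "tx": "transfer-123", "signer": "mpc-quorum"})
assert log.verify()
```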

Off-Exchange Settlement: Safer Venue Connectivity

A custody stack built for institutional scale should actively minimize exposure to exchanges. Off-exchange settlement networks are a key enabler of this. They allow a firm to maintain assets in its own segregated custody while an exchange mirrors that collateral to enable instant trading. Final settlement occurs on a fixed cadence with Delivery versus Payment (DvP)-like guarantees.

This design dramatically reduces the "hot wallet" footprint and the associated counterparty risk, all while preserving the speed required for active trading. It also improves capital efficiency, as you no longer need to overfund idle balances across multiple venues, and it simplifies operational risk management by keeping collateral segregated and fully auditable.


Control Checklist (Copy/Paste Into Your Runbook)

  • Key Custody
    • MPC using a t-of-n threshold across independent trust domains (e.g., multi-cloud, on-prem, HSMs).
    • Use FIPS-validated modules where feasible; maintain plans for quarterly key rotation and incident-driven rekeying.
  • Policy & Approvals
    • Implement a dynamic policy engine with velocity limits, behavioral heuristics, and business-hour constraints.
    • Require four-eyes approval for high-risk operations.
    • Enforce address allowlists and Travel Rule checks before any signing operation.
  • Execution Hardening
    • Use private transaction relays by default for large or sensitive orders.
    • Utilize dual RPC providers with health-based hedging and robust replay protection.
  • Monitoring & Response
    • Implement real-time anomaly detection on intent rates, gas price outliers, and failed transaction inclusion.
    • Maintain a one-click kill-switch to freeze all signers on a per-asset or per-venue basis.
  • Compliance & Audit
    • Maintain an immutable event log for all system actions.
    • Perform continuous, SOC 2-aligned control testing.
    • Ensure robust retention of all Travel Rule evidence.

Implementation Notes

  • People & Process First: Technology cannot fix ambiguous authorization policies or unclear on-call ownership. Clearly define who is authorized to change policy, promote signer code, rotate keys, and approve exceptions.
  • Minimize Complexity Where You Can: Every new blockchain, bridge, or venue you integrate adds non-linear operational risk. Add them deliberately, with clear test coverage, monitoring, and roll-back plans.
  • Test Like an Adversary: Regularly conduct chaos engineering drills. Simulate signer loss, enclave attestation failures, stalled mempools, venue API throttling, and malformed Travel Rule data to ensure your system is resilient.
  • Prove It: Track the KPIs that your customers actually care about:
    • Time-to-broadcast and time-to-first-confirmation (p95/p99).
    • The percentage of transactions submitted via MEV-safe routes versus the public mempool.
    • Venue utilization and collateral efficiency gains from using off-exchange settlement.
    • Control effectiveness metrics, such as the percentage of transfers with complete Travel Rule data attached and the rate at which audit findings are closed.

The Bottom Line

A custody platform worthy of institutional flow executes fast, proves its controls, and limits counterparty and information risk—all at the same time. This requires a deeply integrated stack built on MEV-aware routing, hardware-anchored or MPC-based signing, active/active infrastructure, and off-exchange settlement that keeps assets safe while accessing global liquidity. By building these components into a single, measured pipeline, you deliver the one thing institutional clients value most: certainty at speed.

Cross-Chain Messaging and Shared Liquidity: Security Models of LayerZero v2, Hyperlane, and IBC 3.0

· 50 min read
Dora Noda
Software Engineer

Interoperability protocols like LayerZero v2, Hyperlane, and IBC 3.0 are emerging as critical infrastructure for a multi-chain DeFi ecosystem. Each takes a different approach to cross-chain messaging and shared liquidity, with distinct security models:

  • LayerZero v2 – a proof aggregation model using Decentralized Verifier Networks (DVNs)
  • Hyperlane – a modular framework often using a multisig validator committee
  • IBC 3.0 – a light client protocol with trust-minimized relayers in the Cosmos ecosystem

This report analyzes the security mechanisms of each protocol, compares the pros and cons of light clients vs. multisigs vs. proof aggregation, and examines their impact on DeFi composability and liquidity. We also review current implementations, threat models, and adoption levels, concluding with an outlook on how these design choices affect the long-term viability of multi-chain DeFi.

Security Mechanisms of Leading Cross-Chain Protocols

LayerZero v2: Proof Aggregation with Decentralized Verifier Networks (DVNs)

LayerZero v2 is an omnichain messaging protocol that emphasizes a modular, application-configurable security layer. The core idea is to let applications secure messages with one or more independent Decentralized Verifier Networks (DVNs), which collectively attest to cross-chain messages. In LayerZero’s proof aggregation model, each DVN is essentially a set of verifiers that can independently validate a message (e.g. by checking a block proof or signature). An application can require aggregated proofs from multiple DVNs before accepting a message, forming a threshold “security stack.”

By default, LayerZero provides some DVNs out-of-the-box – for example, a LayerZero Labs-operated DVN that uses a 2-of-3 multisig validation, and a DVN run by Google Cloud. But crucially, developers can mix and match DVNs: e.g. one might require a “1 of 3 of 5” configuration, meaning one specific DVN must sign plus any two of the four others (three attestations in total across five DVNs). This flexibility allows combining different verification methods (light clients, zkProofs, oracles, etc.) in one aggregated proof. In effect, LayerZero v2 generalizes the Ultra Light Node model of v1 (which relied on one Relayer + one Oracle) into an X-of-Y-of-N multisig aggregation across DVNs. An application’s LayerZero Endpoint contract on each chain will only deliver a message if the required DVN quorum has written valid attestations for that message.
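
The quorum rule is simple to model off-chain: every required DVN must attest, plus a threshold of the optional set. A sketch with hypothetical DVN names (the authoritative check is performed by the on-chain Endpoint contracts):

```python
# X-of-Y-of-N quorum rule: all required DVNs must attest, plus a threshold of
# the optional set. DVN names are hypothetical.
def quorum_met(attested: set[str], required: set[str],
               optional: set[str], optional_threshold: int) -> bool:
    """True when every required DVN and enough optional DVNs have attested."""
    return required <= attested and len(attested & optional) >= optional_threshold

# "1 of 3 of 5": the app's own DVN is mandatory, plus any 2 of the 4 others.
required = {"app-dvn"}
optional = {"dvn-a", "dvn-b", "dvn-c", "dvn-d"}
assert quorum_met({"app-dvn", "dvn-b", "dvn-d"}, required, optional, 2)
assert not quorum_met({"dvn-a", "dvn-b", "dvn-c"}, required, optional, 2)  # required DVN missing
```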

Security characteristics: LayerZero’s approach is trust-minimized to the extent that at least one DVN in the required set is honest (or one zk-proof is valid, etc.). By letting apps run their own DVN as a required signer, LayerZero even allows an app to veto any message unless approved by the app team’s verifier. This can significantly harden security (at the cost of centralization), ensuring no cross-chain message executes without the app’s signature. On the other hand, developers may choose a more decentralized DVN quorum (e.g. 5 of 15 independent networks) for stronger trust distribution. LayerZero calls this “application-owned security”: each app chooses the trade-off between security, cost, and performance by configuring its DVNs. All DVN attestations are ultimately verified on-chain by immutable LayerZero Endpoint contracts, preserving a permissionless transport layer. The downside is that security is only as strong as the DVNs chosen – if the configured DVNs collude or are compromised, they could approve a fraudulent cross-chain message. Thus, the burden is on each application to select robust DVNs or risk weaker security.

Hyperlane: Multisig Validator Model with Modular ISMs

Hyperlane is an interoperability framework centered on an on-chain Interchain Security Module (ISM) that verifies messages before they’re delivered on the target chain. In the simplest (and default) configuration, Hyperlane’s ISM uses a multisignature validator set: a committee of off-chain validators signs attestations (often a Merkle root of all outgoing messages) from the source chain, and a threshold of signatures is required on the destination. In other words, Hyperlane relies on a permissioned validator quorum to confirm that “message X was indeed emitted on chain A,” analogous to a blockchain’s consensus but at the bridge level. For example, Wormhole uses 19 guardians with a 13-of-19 multisig – Hyperlane’s approach is similar in spirit (though Hyperlane is distinct from Wormhole).

A key feature is that Hyperlane does not have a single enshrined validator set at the protocol level. Instead, anyone can run a validator, and different applications can deploy ISM contracts with different validator lists and thresholds. The Hyperlane protocol provides default ISM deployments (with a set of validators that the team bootstrapped), but developers are free to customize the validator set or even the security model for their app. In fact, Hyperlane supports multiple types of ISMs, including an Aggregation ISM that combines multiple verification methods, and a Routing ISM that picks an ISM based on message parameters. For instance, an app could require a Hyperlane multisig and an external bridge (like Wormhole or Axelar) both to sign off – achieving a higher security bar via redundancy.
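
The ISM abstraction can be pictured as an interface with interchangeable implementations. The TypeScript model below is illustrative only: the names `MultisigIsm` and `AggregationIsm` echo Hyperlane’s concepts, but the real modules are Solidity contracts with different signatures, and real signature checks are stubbed out here.

```typescript
// Illustrative model of Hyperlane-style Interchain Security Modules.
type Signature = { validator: string; valid: boolean }; // stand-in for real sig verification

interface Ism {
  // Returns true if the module accepts `message` given `metadata`
  // (e.g. validator signatures collected off-chain).
  verify(message: Uint8Array, metadata: Signature[]): boolean;
}

// Multisig ISM: a threshold of a fixed validator set must have signed.
class MultisigIsm implements Ism {
  constructor(private validators: Set<string>, private threshold: number) {}
  verify(_message: Uint8Array, metadata: Signature[]): boolean {
    const signers = metadata.filter(
      (s) => s.valid && this.validators.has(s.validator)
    );
    // Deduplicate so a validator cannot be counted twice.
    return new Set(signers.map((s) => s.validator)).size >= this.threshold;
  }
}

// Aggregation ISM: several independent ISMs must all accept the message,
// e.g. a Hyperlane multisig AND an external bridge's attestation.
class AggregationIsm implements Ism {
  constructor(private modules: Ism[]) {}
  verify(message: Uint8Array, metadata: Signature[]): boolean {
    return this.modules.every((m) => m.verify(message, metadata));
  }
}

// A stand-in adapter for an external bridge (e.g. Wormhole/Axelar attestation).
const externalBridgeIsm: Ism = { verify: () => true };
const ism: Ism = new AggregationIsm([
  new MultisigIsm(new Set(["v1", "v2", "v3", "v4", "v5"]), 3),
  externalBridgeIsm,
]);
```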

Security characteristics: The base security of Hyperlane’s multisig model comes from the honesty of a majority of its validators. If the threshold (e.g. 5 of 8) of validators collude, they could sign a fraudulent message, so the trust assumption is roughly N-of-M multisig trust. Hyperlane is addressing this risk by integrating with EigenLayer restaking, creating an Economic Security Module (ESM) that requires validators to put up staked ETH which can be slashed for misbehavior. This “Actively Validated Service (AVS)” means if a Hyperlane validator signs an invalid message (one not actually in the source chain’s history), anyone can present proof on Ethereum to slash that validator’s stake. This significantly strengthens the security model by economically disincentivizing fraud – Hyperlane’s cross-chain messages become secured by Ethereum’s economic weight, not just by social reputation of validators. However, one trade-off is that relying on Ethereum for slashing introduces dependency on Ethereum’s liveness and assumes fraud proofs are feasible to submit in time. In terms of liveness, Hyperlane warns that if not enough validators are online to meet the threshold, message delivery can halt. The protocol mitigates this by allowing a flexible threshold configuration – e.g. using a larger validator set so occasional downtime doesn’t stall the network. Overall, Hyperlane’s modular multisig approach provides flexibility and upgradeability (apps choose their own security or combine multiple sources) at the cost of adding trust in a validator set. This is a weaker trust model than a true light client, but with recent innovations (like restaked collateral and slashing) it can approach similar security guarantees in practice while remaining easier to deploy across many chains.
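
The economic security idea reduces to a simple rule: a validator’s signature over a checkpoint that is not part of the source chain’s canonical history is itself the evidence of fraud. A hypothetical sketch follows (EigenLayer’s actual slashing mechanics are considerably more involved):

```typescript
// Hypothetical fraud-proof check for a restaked validator set.
// A signed checkpoint that does not appear in the source chain's
// canonical history is conclusive evidence of misbehavior.
interface SignedCheckpoint {
  validator: string;
  merkleRoot: string; // root the validator attested to
  index: number;      // message index the root claims to cover
}

function isFraudulent(
  cp: SignedCheckpoint,
  canonicalRootAt: (index: number) => string | undefined
): boolean {
  // If the source chain never produced this root at this index,
  // the signature itself proves fraud and the stake can be slashed.
  return canonicalRootAt(cp.index) !== cp.merkleRoot;
}
```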

IBC 3.0: Light Clients with Trust-Minimized Relayers

The Inter-Blockchain Communication (IBC) protocol, widely used in the Cosmos ecosystem, takes a fundamentally different approach: it uses on-chain light clients to verify cross-chain state, rather than introducing a new validator set. In IBC, each pair of chains establishes a connection where Chain B holds a light client of Chain A (and vice versa). This light client is essentially a simplified replica of the other chain’s consensus (e.g. tracking validator set signatures or block hashes). When Chain A sends a message (an IBC packet) to Chain B, a relayer (an off-chain actor) carries a proof (Merkle proof of the packet and the latest block header) to Chain B. Chain B’s IBC module then uses the on-chain light client to verify that the proof is valid under Chain A’s consensus rules. If the proof checks out (i.e. the packet was committed in a finalized block on A), the message is accepted and delivered to the target module on B. In essence, Chain B trusts Chain A’s consensus directly, not an intermediary – this is why IBC is often called trust-minimized interoperability.
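
The receive-side logic can be sketched as follows. This is a heavily simplified TypeScript model of the flow described above (real IBC clients verify Tendermint validator signatures, commitment paths, timeouts, and more); the `Header`, `Packet`, and `LightClient` shapes are illustrative.

```typescript
// Simplified model of IBC packet delivery on the receiving chain.
interface Header { height: number; appStateRoot: string; signaturesOk: boolean }
interface Packet { data: Uint8Array; commitmentPath: string }

class LightClient {
  private trustedHeight = 0;
  private trustedRoot = "";

  // Accept a new header only if it is signed per the counterparty's
  // consensus rules and extends the currently trusted state.
  updateHeader(h: Header): void {
    if (!h.signaturesOk || h.height <= this.trustedHeight) {
      throw new Error("header rejected");
    }
    this.trustedHeight = h.height;
    this.trustedRoot = h.appStateRoot;
  }

  // Deliver a packet only if a Merkle proof ties its commitment to
  // the trusted state root -- no trust in the relayer is needed.
  verifyPacket(
    p: Packet,
    merkleProofOk: (root: string, path: string) => boolean
  ): boolean {
    return merkleProofOk(this.trustedRoot, p.commitmentPath);
  }
}
```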

IBC 3.0 refers to the latest evolution of this protocol (circa 2025), which introduces performance and feature upgrades: parallel relaying for lower latency, custom channel types for specialized use cases, and Interchain Queries for reading remote state. Notably, none of these change the core light-client security model – they enhance speed and functionality. For example, parallel relaying means multiple relayers can ferry packets simultaneously to avoid bottlenecks, improving liveness without sacrificing security. Interchain Queries (ICQ) let a contract on Chain A ask Chain B for data (with a proof), which is then verified by A’s light client of B. This extends IBC’s capabilities beyond token transfers to more general cross-chain data access, still underpinned by verified light-client proofs.
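
Interchain Queries follow the same pattern in the opposite direction: the querying chain verifies the relayed response against its own light client of the remote chain. A rough sketch, with hypothetical query/response shapes:

```typescript
// Hypothetical Interchain Query flow: chain A asks chain B for a value
// and verifies the relayed response against A's light client of B.
interface TrustedState { stateRoot: string } // A's view of B, kept fresh by header updates

interface QueryResponse {
  key: string;       // e.g. a price or interest-rate storage key on B
  value: string;     // the claimed value at that key
  proofPath: string; // Merkle path tying key/value to B's state root
}

function acceptQueryResponse(
  clientOfB: TrustedState,
  resp: QueryResponse,
  merkleProofOk: (root: string, path: string) => boolean
): string {
  // The relayer cannot forge the value: the proof must verify
  // against B's state root as tracked by A's light client.
  if (!merkleProofOk(clientOfB.stateRoot, resp.proofPath)) {
    throw new Error("invalid ICQ proof");
  }
  return resp.value;
}
```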

Security characteristics: IBC’s security guarantee is as strong as the source chain’s integrity. If Chain A has an honest majority (or meets the configured consensus threshold) and Chain B’s light client of A is up-to-date, then any accepted packet must have come from a valid block on A. There is no need to trust any bridge validators or oracles – the only trust assumptions are the native consensus of the two chains and some parameters like the light client’s trusting period (after which old headers expire). Relayers in IBC do not have to be trusted; they can’t forge valid headers or packets because those would fail verification. At worst, a malicious or offline relayer can censor or delay messages, but anyone can run a relayer, so liveness is eventually achieved if at least one honest relayer exists. This is a very strong security model: effectively decentralized and permissionless by default, mirroring the properties of the chains themselves. The trade-offs come in cost and complexity – running a light client (especially of a high-throughput chain) on another chain can be resource-intensive (storing validator set changes, verifying signatures, etc.). For Cosmos SDK chains using Tendermint/BFT, this cost is manageable and IBC is very efficient; but integrating heterogeneous chains (like Ethereum or Solana) requires complex client implementations or new cryptography. Indeed, bridging non-Cosmos chains via IBC has been slower – projects like Polymer and Composable are working on light clients or zk-proofs to extend IBC to Ethereum and others. IBC 3.0’s improvements (e.g. optimized light clients, support for different verification methods) aim to reduce these costs. In summary, IBC’s light client model offers the strongest trust guarantees (no external validators at all) and solid liveness (given multiple relayers), at the expense of higher implementation complexity and the requirement that all participating chains support the IBC protocol.

Comparing Light Clients, Multisigs, and Proof Aggregation

Each security model – light clients (IBC), validator multisigs (Hyperlane), and aggregated proofs (LayerZero) – comes with distinct pros and cons. Below we compare them across key dimensions:

Security Guarantees

  • Light Clients (IBC): Offer the highest security by anchoring on-chain verification to the source chain’s consensus. There’s no new trust layer; if you trust the source blockchain (e.g. Cosmos Hub or Ethereum) not to double-produce blocks, you trust the messages it sends. This minimizes additional trust assumptions and attack surface. However, if the source chain’s validator set is corrupted (e.g. >⅓ in Tendermint or >½ in a PoS chain go rogue), the light client can be fed a fraudulent header. In practice, IBC channels are usually established between economically secure chains, and light clients can have parameters (like the trusting period and block finality requirements) to mitigate risks. Overall, trust-minimization is the strongest advantage of the light client model – there is cryptographic proof of validity for each message.

  • Multisig Validators (Hyperlane & similar bridges): Security hinges on the honesty of a set of off-chain signers. A typical threshold (e.g. ⅔ of validators) must sign off on each cross-chain message or state checkpoint. The upside is that this can be made reasonably secure with enough reputable or economically staked validators. For example, Wormhole’s 19 guardians or Hyperlane’s default committee collectively have to collude to compromise the system. The downside is this introduces a new trust assumption: users must trust the bridge’s committee in addition to the chains. This has proven to be a point of failure in some hacks (e.g. if private keys are stolen or if insiders collude). Initiatives like Hyperlane’s restaked ETH collateral add economic security to this model – validators who sign invalid data can be automatically slashed on Ethereum. This moves multisig bridges closer to the security of a blockchain (by financially punishing fraud), but it’s still not as trust-minimized as a light client. In short, multisigs are weaker in trust guarantees: one relies on a majority of a small group, though slashing and audits can bolster confidence.

  • Proof Aggregation (LayerZero v2): This is something of a middle ground. If an application configures its Security Stack to include a light client DVN or a zk-proof DVN, then the guarantee can approach IBC-level (math and chain consensus) for those checks. If it uses a committee-based DVN (like LayerZero’s 2-of-3 default or an Axelar adapter), then it inherits that multisig’s trust assumptions. The strength of LayerZero’s model is that you can combine multiple verifiers independently. For example, requiring both “a zk-proof is valid” and “Chainlink oracle says the block header is X” and “our own validator signs off” could dramatically reduce attack possibilities (an attacker would need to break all of them at once). Also, by allowing an app to mandate its own DVN, LayerZero ensures no message will execute without the app’s consent, if so configured. The weakness is that if developers choose a lax security configuration (for cheaper fees or speed), they might undermine security – e.g. using a single DVN run by an unknown party would be similar to trusting a single validator. LayerZero itself is unopinionated and leaves these choices to app developers, which means security is only as good as the chosen DVNs. In summary, proof aggregation can provide very strong security (even higher than a single light client, by requiring multiple independent proofs) but also allows weak setups if misconfigured. It’s flexible: an app can dial up security for high-value transactions (e.g. require multiple big DVNs) and dial it down for low-value ones.

Liveness and Availability

  • Light Clients (IBC): Liveness depends on relayers and the light client staying updated. The positive side is anyone can run a relayer, so the system doesn’t rely on a specific set of nodes – if one relayer stops, another can pick up the job. IBC 3.0’s parallel relaying further improves availability by not serializing all packets through one path. In practice, IBC connections have been very reliable, but there are scenarios where liveness can suffer: e.g., if no relayer posts an update for a long time, a light client could expire (e.g. if trusting period passes without renewal) and then the channel closes for safety. However, such cases are rare and mitigated by active relayer networks. Another liveness consideration: IBC packets are subject to source chain finality – e.g. waiting 1-2 blocks in Tendermint (a few seconds) is standard. Overall, IBC provides high availability as long as there is at least one active relayer, and latency is typically low (seconds) for finalized blocks. There is no concept of a quorum of validators going offline as in multisig; the blockchain’s own consensus finality is the main latency factor.

  • Multisig Validators (Hyperlane): Liveness can be a weakness if the validator set is small. For example, if a bridge has a 5-of-8 multisig and 4 validators are offline or unreachable, cross-chain messaging halts because the threshold can’t be met. Hyperlane documentation notes that validator downtime can halt message delivery, depending on the threshold configured. This is partly why having a larger committee or a lower threshold (with a safety trade-off) might be chosen to improve uptime. Hyperlane’s design allows deploying new validators or switching ISM if needed, but such changes might require coordination/governance. The advantage multisig bridges have is typically fast confirmation once threshold signatures are collected – no need to wait for block finality of a source chain on the destination chain, since the multisig attestation is the finality. In practice, many multisig bridges sign and relay messages within seconds. So latency can be comparable to or even lower than that of light clients for some chains. The bottleneck is if validators are slow or geographically distributed, or if any manual steps are involved. In summary, multisig models can be highly live and low-latency most of the time, but they have a liveness risk concentrated in the validator set – if too many validators crash or a network partition occurs among them, the bridge is effectively down.

  • Proof Aggregation (LayerZero): Liveness here depends on the availability of each DVN and the relayer. A message must gather signatures/proofs from the required DVNs and then be relayed to the target chain. The nice aspect is DVNs operate independently – if one DVN (out of a set) is down and it’s not required (only part of an “M of N”), the message can still proceed as long as the threshold is met. LayerZero’s model explicitly allows configuring quorums to tolerate some DVN failures. For example, a “2 of 5” DVN set can handle 3 DVNs being offline without stopping the protocol. Additionally, because anyone can run the final Executor/Relayer role, there isn’t a single point of failure for message delivery – if the primary relayer fails, a user or another party can call the contract with the proofs (this is analogous to the permissionless relayer concept in IBC). Thus, LayerZero v2 strives for censorship-resistance and liveness by not binding the system to one middleman. However, if required DVNs are part of the security stack (say an app requires its own DVN always sign), then that DVN is a liveness dependency: if it goes offline, messages will pause until it comes back or the security policy is changed. In general, proof aggregation can be configured to be robust (with redundant DVNs and any-party relaying) such that it’s unlikely all verifiers are down at once. The trade-off is that contacting multiple DVNs might introduce a bit more latency (e.g. waiting for several signatures) compared to a single faster multisig. But those DVNs could run in parallel, and many DVNs (like an oracle network or a light client) can respond quickly. Therefore, LayerZero can achieve high liveness and low latency, but the exact performance depends on how the DVNs are set up (some might wait for a few block confirmations on source chain, etc., which could add delay for safety).

Cost and Complexity

  • Light Clients (IBC): This approach tends to be complex to implement but cheap to use once set up on compatible chains. The complexity lies in writing a correct light client implementation for each type of blockchain – essentially, you’re encoding the consensus rules of Chain A into a smart contract on Chain B. For Cosmos SDK chains with similar consensus, this was straightforward, but extending IBC beyond Cosmos has required heavy engineering (e.g. building a light client for Polkadot’s GRANDPA finality, or plans for Ethereum light clients with zk proofs). These implementations are non-trivial and must be highly secure. There’s also on-chain storage overhead: the light client needs to store recent validator set or state root info for the other chain. This can increase the state size and proof verification cost on chain. As a result, running IBC on, say, Ethereum mainnet directly (verifying Cosmos headers) would be expensive gas-wise – one reason projects like Polymer are making an Ethereum rollup to host these light clients off mainnet. Within the Cosmos ecosystem, IBC transactions are very efficient (often just a few cents worth of gas) because the light client verification (ed25519 sigs, Merkle proofs) is well-optimized at the protocol level. Using IBC is relatively low cost for users, and relayers just pay normal tx fees on destination chains (they can be incentivized with fees via ICS-29 middleware). In summary, IBC’s cost is front-loaded in development complexity, but once running, it provides a native, fee-efficient transport. The many Cosmos chains connected (100+ zones) share a common implementation, which helps manage complexity by standardization.

  • Multisig Bridges (Hyperlane/Wormhole/etc.): The implementation complexity here is often lower – the core bridging contracts mostly need to verify a set of signatures against stored public keys. This logic is simpler than a full light client. The off-chain validator software does introduce operational complexity (servers that observe chain events, maintain a Merkle tree of messages, coordinate signature collection, etc.), but this is managed by the bridge operators and kept off-chain. On-chain cost: verifying a few signatures (say 2 or 5 ECDSA signatures) is not too expensive, but it’s certainly more gas than a single threshold signature or a hash check. Some bridges use aggregated signature schemes (e.g. BLS) to reduce on-chain cost to 1 signature verification. In general, multisig verification on Ethereum or similar chains is moderately costly (each ECDSA sig check is ~3000 gas). If a bridge requires 10 signatures, that’s ~30k gas just for verification, plus any storage of a new Merkle root, etc. This is usually acceptable given cross-chain transfers are high-value operations, but it can add up. From a developer/user perspective, interacting with a multisig bridge is straightforward: you deposit or call a send function, and the rest is handled off-chain by the validators/relayers, then a proof is submitted. There’s minimal complexity for app developers as they just integrate the bridge’s API/contract. One complexity consideration is adding new chains – every validator must run a node or indexer for each new chain to observe messages, which can be a coordination headache (this was noted as a bottleneck for expansion in some multisig designs). Hyperlane’s answer is permissionless validators (anyone can join for a chain if the ISM includes them), but the application deploying the ISM still has to set up those keys initially. Overall, multisig models are easier to bootstrap across heterogeneous chains (no need for bespoke light client per chain), making them quicker to market, but they incur operational complexity off-chain and moderate on-chain verification costs.

  • Proof Aggregation (LayerZero): The complexity here is in the coordination of many possible verification methods. LayerZero provides a standardized interface (the Endpoint & MessageLib contracts) and expects DVNs to adhere to a certain verification API. From an application’s perspective, using LayerZero is quite simple (just call lzSend and implement lzReceive callbacks), but under the hood, there’s a lot going on. Each DVN may have its own off-chain infrastructure (some DVNs are essentially mini-bridges themselves, like an Axelar network or a Chainlink oracle service). The protocol itself is complex because it must securely aggregate disparate proof types – e.g. one DVN might supply an EVM block proof, another supplies a SNARK, another a signature, etc., and the contract has to verify each in turn. The advantage is that much of this complexity is abstracted away by LayerZero’s framework. The cost depends on how many and what type of proofs are required: verifying a snark might be expensive (on-chain zk proof verification can be hundreds of thousands of gas), whereas verifying a couple of signatures is cheaper. LayerZero lets the app decide how much they want to pay for security per message. There is also a concept of paying DVNs for their work – the message payload includes a fee for DVN services. For instance, an app can attach fees that incentivize DVNs and Executors to process the message promptly. This adds a cost dimension: a more secure configuration (using many DVNs or expensive proofs) will cost more in fees, whereas a simple 1-of-1 DVN (like a single relayer) could be very cheap but less secure. Upgradability and governance are also part of complexity: because apps can change their security stack, there needs to be a governance process or an admin key to do that – which itself is a point of trust/complexity to manage. In summary, proof aggregation via LayerZero is highly flexible but complex under the hood. The cost per message can be optimized by choosing efficient DVNs (e.g. using an ultra-light client that’s optimized, or leveraging an existing oracle network’s economies of scale). Many developers will find the plug-and-play nature (with defaults provided) appealing – e.g. simply use the default DVN set for ease – but that again can lead to suboptimal trust assumptions if not understood.
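
From the application side, the programming model described in the last bullet reduces to a send call and a receive callback. The TypeScript sketch below mimics that shape for illustration only; LayerZero’s real OApp interface is a set of Solidity contracts with different names and signatures.

```typescript
// Conceptual shape of an omnichain application (OApp)-style interface.
// Real LayerZero apps implement this pattern as Solidity contracts.
interface Endpoint {
  // Hand a message to the transport layer; DVNs attest off-chain and
  // an executor delivers it on the destination chain.
  send(dstChainId: number, payload: Uint8Array, fee: bigint): void;
}

abstract class OmnichainApp {
  constructor(protected endpoint: Endpoint) {}

  // Outbound: the app only chooses destination and payload;
  // the verification policy lives in its configured security stack.
  sendMessage(dstChainId: number, payload: Uint8Array, fee: bigint): void {
    this.endpoint.send(dstChainId, payload, fee);
  }

  // Inbound: invoked by the endpoint only after the app's DVN
  // quorum has verified the message.
  abstract onReceive(srcChainId: number, payload: Uint8Array): void;
}
```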

Upgradability and Governance

  • Light Clients (IBC): IBC connections and clients can be upgraded via on-chain governance proposals on the participant chains (particularly if the light client needs a fix or an update for a hardfork in the source chain). Upgrading the IBC protocol itself (say from IBC 2.0 to 3.0 features) also requires chain governance to adopt new versions of the software. This means IBC has a deliberate upgrade path – changes are slow and require consensus, but that is aligned with its security-first approach. There is no single entity that can flip a switch; governance of each chain must approve changes to clients or parameters. The positive is that this prevents unilateral changes that could introduce vulnerabilities. The negative is less agility – e.g. if a bug is found in a light client, it might take coordinated governance votes across many chains to patch (though there are emergency coordination mechanisms). From a dApp perspective, IBC doesn’t really have an “app-level governance” – it’s infrastructure provided by the chain. Applications just use IBC modules (like token transfer or interchain accounts) and rely on the chain’s security. So the governance and upgrades happen at the blockchain level (Hub and Zone governance). One interesting new IBC feature is custom channels and routing (e.g. hubs like Polymer or Nexus) that can allow switching underlying verification methods without interrupting apps. But by and large, IBC is stable and standardized – upgradability is possible but infrequent, contributing to its reliability.

  • Multisig Bridges (Hyperlane/Wormhole): These systems often have an admin or governance mechanism to upgrade contracts, change validator sets, or modify parameters. For example, adding a new validator to the set or rotating keys might require a multisig of the bridge owner or a DAO vote. Hyperlane being permissionless means any user could deploy their own ISM with a custom validator set, but if using the default, the Hyperlane team or community likely controls updates. Upgradability is a double-edged sword: on one hand it is easy to upgrade and improve; on the other, it can be a centralization risk (if a privileged key can upgrade the bridge contracts, that key could theoretically rug the bridge). A well-governed protocol will limit this (e.g. time-lock upgrades, or use a decentralized governance). Hyperlane’s philosophy is modularity – so an app could even route around a failing component by switching ISMs, etc. This gives developers power to respond to threats (e.g. if one set of validators is suspected to be compromised, an app could switch to a different security model quickly). The governance overhead is that apps need to decide their security model and potentially manage keys for their own validators or pay attention to updates from the Hyperlane core protocol. In summary, multisig-based systems are more upgradeable (the contracts are often upgradable and the committees configurable), which is good for rapid improvement and adding new chains, but it requires trust in the governance process. Many bridge exploits in the past have occurred via compromised upgrade keys or flawed governance, so this area must be treated carefully. On the plus side, adding support for a new chain might be as simple as deploying the contracts and getting validators to run nodes for it – no fundamental protocol change needed.

  • Proof Aggregation (LayerZero): LayerZero touts an immutable transport layer (the endpoint contracts are non-upgradable), but the verification modules (Message Libraries and DVN adapters) are append-only and configurable. In practice, this means the core LayerZero contract on each chain remains fixed (providing a stable interface), while new DVNs or verification options can be added over time without altering the core. Application developers have control over their Security Stack: they can add or remove DVNs, change confirmation block depth, etc. This is a form of upgradability at the app level. For example, if a particular DVN is deprecated or a new, better one emerges (like a faster zk client), the app team can integrate that into their config – future-proofing the dApp. The benefit is evident: apps aren’t stuck with yesterday’s security tech; they can adapt (with appropriate caution) to new developments. However, this raises governance questions: who within the app decides to change the DVN set? Ideally, if the app is decentralized, changes would go through governance or be hardcoded if they want immutability. If a single admin can alter the security stack, that’s a point of trust (they could reduce security requirements in a malicious upgrade). LayerZero’s own guidance encourages setting up robust governance for such changes or even making certain aspects immutable if needed. Another governance aspect is fee management – paying DVNs and relayers could be tuned, and misaligned incentives could impact performance (though by default market forces should adjust the fees). In sum, LayerZero’s model is highly extensible and upgradeable in terms of adding new verification methods (which is great for long-term interoperability), yet the onus is on each application to govern those upgrades responsibly. The base contracts of LayerZero are immutable to ensure the transport layer cannot be rug-pulled or censored, which inspires confidence that the messaging pipeline itself remains intact through upgrades.

To summarize the comparison, the table below highlights key differences:

| Aspect | IBC (Light Clients) | Hyperlane (Multisig) | LayerZero v2 (Aggregation) |
| --- | --- | --- | --- |
| Trust Model | Trust the source chain’s consensus (no extra trust). | Trust a committee of bridge validators (e.g. multisig threshold). Slashing can mitigate risk. | Trust depends on DVNs chosen. Can emulate light client or multisig, or mix (trust at least one of chosen verifiers). |
| Security | Highest – crypto proof of validity via light client. Attacks require compromising source chain or light client. | Strong if committee is honest majority, but weaker than light client. Committee collusion or key compromise is primary threat. | Potentially very high – can require multiple independent proofs (e.g. zk + multisig + oracle). But configurable security means it’s only as strong as the weakest DVNs chosen. |
| Liveness | Very good as long as at least one relayer is active. Parallel relayers and fast finality chains give near real-time delivery. | Good under normal conditions (fast signatures). But dependent on validator uptime. Threshold quorum downtime = halt. Expansion to new chains requires committee support. | Very good; multiple DVNs provide redundancy, and any user can relay transactions. Required DVNs can be single points of failure if misconfigured. Latency can be tuned (e.g. wait for confirmations vs. speed). |
| Cost | Upfront complexity to implement clients. On-chain verification of consensus (signatures, Merkle proofs) but optimized in Cosmos. Low per-message cost in IBC-native environments; potentially expensive on non-native chains without special solutions. | Lower dev complexity for core contracts. On-chain cost scales with number of signatures per message. Off-chain ops cost for validators (nodes on each chain). Possibly higher gas than light client if many sigs, but often manageable. | Moderate-to-high complexity. Per-message cost varies: each DVN proof (sig or SNARK) adds verification gas. Apps pay DVN fees for service. Can optimize costs by choosing fewer or cheaper proofs for low-value messages. |
| Upgradability | Protocol evolves via chain governance (slow, conservative). Light client updates require coordination, but standardization keeps it stable. Adding new chains requires building/approving new client types. | Flexible – validator sets and ISMs can be changed via governance or admin. Easier to integrate new chains quickly. Risk if upgrade keys or governance are compromised. Typically upgradable contracts (needs trust in administrators). | Highly modular – new DVNs/verification methods can be added without altering core. Apps can change security config as needed. Core endpoints immutable (no central upgrades), but app-level governance needed for security changes to avoid misuse. |

Impact on Composability and Shared Liquidity in DeFi

Cross-chain messaging unlocks powerful new patterns for composability – the ability of DeFi contracts on different chains to interact – and enables shared liquidity – pooling assets across chains as if in one market. The security models discussed above influence how confidently and seamlessly protocols can utilize cross-chain features. Below we explore how each approach supports multi-chain DeFi, with real examples:

  • Omnichain DeFi via LayerZero (Stargate, Radiant, Tapioca): LayerZero’s generic messaging and Omnichain Fungible Token (OFT) standard are designed to break liquidity silos. For instance, Stargate Finance uses LayerZero to implement a unified liquidity pool for native assets bridging – rather than fragmented pools on each chain, Stargate contracts on all chains tap into a common pool, and LayerZero messages handle the lock/release logic across chains. This led to over $800 million monthly volume in Stargate’s bridges, demonstrating significant shared liquidity. By relying on LayerZero’s security (with Stargate presumably using a robust DVN set), users can transfer assets with high confidence in message authenticity. Radiant Capital is another example – a cross-chain lending protocol where users can deposit on one chain and borrow on another. It leverages LayerZero messages to coordinate account state across chains, effectively creating one lending market across multiple networks. Similarly, Tapioca (an omnichain money market) uses LayerZero v2 and even runs its own DVN as a required verifier to secure its messages. These examples show that with flexible security, LayerZero can support complex cross-chain operations like credit checks, collateral moves, and liquidations across chains. The composability comes from LayerZero’s “OApp” standard (Omnichain Application), which lets developers deploy the same contract on many chains and have them coordinate via messaging. A user interacts with any chain’s instance and experiences the application as one unified system. The security model allows fine-tuning: e.g. large transfers or liquidations could require more DVN signatures (for safety), whereas small actions go through faster/cheaper paths. This flexibility ensures neither security nor UX has to be one-size-fits-all. In practice, LayerZero’s model has greatly enhanced shared liquidity, evidenced by dozens of projects adopting OFT for tokens (so a token can exist “omnichain” rather than as separate wrapped assets). For example, stablecoins and governance tokens can use OFT to maintain a single total supply over all chains – avoiding liquidity fragmentation and arbitrage issues that plagued earlier wrapped tokens. Overall, by providing a reliable messaging layer and letting apps control the trust model, LayerZero has catalyzed new multi-chain DeFi designs that treat multiple chains as one ecosystem. The trade-off is that users and projects must understand the trust assumption of each omnichain app (since they can differ). But standards like OFT and widely used default DVNs help make this more uniform.

  • Interchain Accounts and Services in IBC (Cosmos DeFi): In the Cosmos world, IBC has enabled a rich tapestry of cross-chain functionality that goes beyond token transfers. A flagship feature is Interchain Accounts (ICA), which allows a blockchain (or a user on chain A) to control an account on chain B as if it were local. This is done via IBC packets carrying transactions. For example, the Cosmos Hub can use an interchain account on Osmosis to stake or swap tokens on behalf of a user – all initiated from the Hub. A concrete DeFi use-case is Stride’s liquid staking protocol: Stride (a chain) receives tokens like ATOM from users and, using ICA, it remotely stakes those ATOM on the Cosmos Hub and then issues stATOM (liquid staked ATOM) back to users. The entire flow is trustless and automated via IBC – Stride’s module controls an account on the Hub that executes delegate and undelegate transactions, with acknowledgments and timeouts ensuring safety. This demonstrates cross-chain composability: two sovereign chains performing a joint workflow (stake here, mint token there) seamlessly. Another example is Osmosis (a DEX chain) which uses IBC to draw in assets from 95+ connected chains. Users from any zone can swap on Osmosis by sending their tokens via IBC. Thanks to the high security of IBC, Osmosis and others confidently treat IBC tokens as genuine (not needing trusted custodians). This has led Osmosis to become one of the largest interchain DEXes, with daily IBC transfer volume reportedly exceeding that of many bridged systems. Moreover, with Interchain Queries (ICQ) in IBC 3.0, a smart contract on one chain can fetch data (like prices, interest rates, or positions) from another chain in a trust-minimized way. This could enable, for instance, an interchain yield aggregator that queries yield rates on multiple zones and reallocates assets accordingly, all via IBC messages. The key impact of IBC’s light-client model on composability is confidence and neutrality: chains remain sovereign but can interact without fear of a third-party bridge risk. Projects like Composable Finance and Polymer are even extending IBC to non-Cosmos ecosystems (Polkadot, Ethereum) to tap into these capabilities. The result might be a future where any chain that adopts an IBC client standard can plug into a “universal internet of blockchains”. Shared liquidity in Cosmos is already significant – e.g., the Cosmos Hub’s native DEX (Gravity DEX) and others rely on IBC to pool liquidity from various zones. However, a limitation so far is that Cosmos DeFi is mostly asynchronous: you initiate on one chain and the result happens on another with a slight delay (seconds). This is fine for things like trades and staking, but more complex synchronous composability (like flash loans across chains) remains out of scope due to fundamental latency. Still, the spectrum of cross-chain DeFi enabled by IBC is broad: multi-chain yield farming (move funds where yield is highest), cross-chain governance (one chain voting to execute actions on another via governance packets), and even Interchain Security where a consumer chain leverages the validator set of a provider chain (through IBC validation packets). In summary, IBC’s secure channels have fostered an interchain economy in Cosmos – one where projects can specialize on separate chains yet fluidly work together through trust-minimized messages. The shared liquidity is apparent in things like the flow of assets to Osmosis and the rise of Cosmos-native stablecoins that move across zones freely.

  • Hybrid and Other Multi-Chain Approaches (Hyperlane and beyond): Hyperlane’s vision of permissionless connectivity has led to concepts like Warp Routes for bridging assets and interchain dapps spanning various ecosystems. For example, a Warp Route might allow an ERC-20 token on Ethereum to be teleported to a Solana program, using Hyperlane’s message layer under the hood. One concrete user-facing implementation is Hyperlane’s Nexus bridge, which provides a UI for transferring assets between many chains via Hyperlane’s infrastructure. By using a modular security model, Hyperlane can tailor security per route: a small transfer might go through a simple fast path (just Hyperlane validators signing), whereas a large transfer could require an aggregated ISM (Hyperlane + Wormhole + Axelar all attest). This ensures that high-value liquidity movement is secured by multiple bridges – increasing confidence for, say, moving $10M of an asset cross-chain (it would take compromising multiple networks to steal it) at the cost of higher complexity/fees. In terms of composability, Hyperlane enables what they call “contract interoperability” – a smart contract on chain A can call a function on chain B as if it were local, once messages are delivered. Developers integrate the Hyperlane SDK to dispatch these cross-chain calls easily. An example could be a cross-chain DEX aggregator that lives partly on Ethereum and partly on BNB Chain, using Hyperlane messages to arbitrage between the two. Because Hyperlane supports EVM and non-EVM chains (even early work on CosmWasm and MoveVM integration), it aspires to connect “any chain, any VM”. This broad reach can increase shared liquidity by bridging ecosystems that aren’t otherwise easily connected. However, the actual adoption of Hyperlane in large-scale DeFi is still growing. It does not yet have the volume of Wormhole or LayerZero in bridging, but its permissionless nature has attracted experimentation. For example, some projects have used Hyperlane to quickly connect app-specific rollups to Ethereum, because they could set up their own validator set and not wait for complex light client solutions. As restaking (EigenLayer) grows, Hyperlane might see more uptake by offering Ethereum-grade security to any rollup with relatively low latency. This could accelerate new multi-chain compositions – e.g. an Optimism rollup and a Polygon zk-rollup exchanging messages through Hyperlane AVS, each message backed by slashed ETH if fraudulent. The impact on composability is that even ecosystems without a shared standard (like Ethereum and an arbitrary L2) can get a bridge contract that both sides trust (because it’s economically secured). Over time, this may yield a web of interconnected DeFi apps where composability is “dialed-in” by the developer (choosing which security modules to use for which calls).

In all these cases, the interplay between security model and composability is evident. Projects will only entrust large pools of liquidity to cross-chain systems if the security is rock-solid – hence the push for trust-minimized or economically secured designs. At the same time, the ease of integration (developer experience) and flexibility influence how creative teams can be in leveraging multiple chains. LayerZero and Hyperlane focus on simplicity for devs (just import an SDK and use familiar send/receive calls), whereas IBC, being lower-level, requires more understanding of modules and might be handled by the chain developers rather than application developers. Nonetheless, all three are driving towards a future where users interact with multi-chain dApps without needing to know what chain they’re on – the app seamlessly taps liquidity and functionality from anywhere. For example, a user of a lending app might deposit on Chain A and not even realize the borrow happened from a pool on Chain B – all covered by cross-chain messages and proper validation.
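
One pattern worth isolating from the examples above is OFT-style supply accounting: tokens are debited on the source chain and credited on the destination, so the sum of balances across chains is invariant. The toy model below illustrates the idea in TypeScript; it is not the actual OFT standard, and the class and method names are hypothetical.

```typescript
// Toy model of omnichain fungible token (OFT)-style accounting:
// debit on the source chain, credit on the destination, so the
// global supply summed over all chains never changes.
class OftLedger {
  private balances = new Map<string, number>(); // chainId -> local supply

  constructor(entries: [string, number][]) {
    for (const [chain, amount] of entries) this.balances.set(chain, amount);
  }

  // Executed as two halves of one cross-chain message:
  // debit on send, credit on verified delivery.
  transfer(srcChain: string, dstChain: string, amount: number): void {
    const src = this.balances.get(srcChain) ?? 0;
    if (src < amount) throw new Error("insufficient local supply");
    this.balances.set(srcChain, src - amount);
    this.balances.set(dstChain, (this.balances.get(dstChain) ?? 0) + amount);
  }

  // Invariant: total supply is constant regardless of where tokens sit.
  totalSupply(): number {
    return [...this.balances.values()].reduce((a, b) => a + b, 0);
  }
}

const ledger = new OftLedger([["ethereum", 700], ["arbitrum", 300]]);
ledger.transfer("ethereum", "arbitrum", 100);
console.log(ledger.totalSupply()); // 1000 -- unchanged by the transfer
```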

Implementations, Threat Models, and Adoption in Practice

It’s important to assess how these protocols are faring in real-world conditions – their current implementations, known threat vectors, and levels of adoption:

  • LayerZero v2 in Production: LayerZero v1 (with the 2-entity Oracle+Relayer model) gained significant adoption, securing over $50 billion in transfer volume and more than 134 million cross-chain messages as of mid-2024. It’s integrated with 60+ blockchains, primarily EVM chains but also non-EVM like Aptos, and experimental support for Solana is on the horizon. LayerZero v2 was launched in early 2024, introducing DVNs and modular security. Already, major platforms like Radiant Capital, SushiXSwap, Stargate, PancakeSwap, and others have begun migrating or building on v2 to leverage its flexibility. One notable integration is the Flare Network (a Layer1 focused on data), which adopted LayerZero v2 to connect with 75 chains at once. Flare was attracted by the ability to customize security: e.g. using a single fast DVN for low-value messages and requiring multiple DVNs for high-value ones. This shows that in production, applications are indeed using the “mix and match” security approach as a selling point. Security and audits: LayerZero’s contracts are immutable and have been audited (v1 had multiple audits, v2 as well). The main threat in v1 was the Oracle-Relayer collusion – if the two off-chain parties colluded, they could forge a message. In v2, that threat is generalized to DVN collusion. If all DVNs that an app relies on are compromised by one entity, a fake message could slip through. LayerZero’s answer is to encourage app-specific DVNs (so an attacker would have to compromise the app team too) and diversity of verifiers (making collusion harder). Another potential issue is misconfiguration or upgrade misuse – if an app owner maliciously switches to a trivial Security Stack (like 1-of-1 DVN controlled by themselves), they could bypass security to exploit their own users. This is more a governance risk than a protocol bug, and communities need to stay vigilant about how an omnichain app’s security is set (preferably requiring multi-sig or community approval for changes). In terms of adoption, LayerZero has arguably the most usage among messaging protocols in DeFi currently: it powers bridging for Stargate, Circle’s CCTP integration (for USDC transfers), Sushi’s cross-chain swap, many NFT bridges, and countless OFT tokens (projects choosing LayerZero to make their token available on multiple chains). The network effects are strong – as more chains integrate LayerZero endpoints, it becomes easier for new chains to join the “omnichain” network. LayerZero Labs itself runs one DVN and the community (including providers like Google Cloud, Polyhedra for zk proofs, etc.) has launched 15+ DVNs by 2024. No major exploit of LayerZero’s core protocol has occurred to date, which is a positive sign (though some application-level hacks or user errors have happened, as with any tech). The protocol’s design of keeping the transport layer simple (essentially just storing messages and requiring proofs) minimizes on-chain vulnerabilities, shifting most complexity off-chain to DVNs.

  • Hyperlane in Production: Hyperlane (formerly Abacus) is live on numerous chains including Ethereum, multiple L2s (Optimism, Arbitrum, zkSync, etc.), Cosmos chains like Osmosis via a Cosmos-SDK module, and even MoveVM chains (it’s quite broad in support). However, its adoption lags behind incumbents like LayerZero and Wormhole in terms of volume. Hyperlane is often mentioned in the context of being a “sovereign bridge” solution – i.e. a project can deploy Hyperlane to have their own bridge with custom security. For example, some appchain teams have used Hyperlane to connect their chain to Ethereum without relying on a shared bridge. A notable development is the Hyperlane Active Validation Service (AVS) launched in mid-2024, which is one of the first applications of Ethereum restaking. It has validators (many being top EigenLayer operators) restake ETH to secure Hyperlane messages, focusing initially on fast cross-rollup messaging. This is currently securing interoperability between Ethereum L2 rollups with good results – essentially providing near-instant message passing (faster than waiting for optimistic rollup 7-day exits) with economic security tied to Ethereum. In terms of threat model, Hyperlane’s original multisig approach could be attacked if enough validators’ keys are compromised (as with any multisig bridge). Hyperlane has had a past security incident: in August 2022, during an early testnet or launch, there was an exploit where an attacker was able to hijack the deployer key of a Hyperlane token bridge on one chain and mint tokens (around $700k loss). This was not a failure of the multisig itself, but rather operational security around deployment – it highlighted the risks of upgradability and key management. The team reimbursed losses and improved processes. This underscores that governance keys are part of the threat model – securing the admin controls is as important as the validators. With AVS, the threat model shifts to an EigenLayer context: if someone could cause a false slashing or avoid being slashed despite misbehavior, that would be an issue; but EigenLayer’s protocol handles slashing logic on Ethereum, which is robust assuming correct fraud proof submission. Hyperlane’s current adoption is growing in the rollup space and among some app-specific chains. It might not yet handle the multi-billion dollar flows of some competitors, but it is carving a niche where developers want full control and easy extensibility. The modular ISM design means we might see creative security setups: e.g., a DAO could require not just Hyperlane signatures but also a time-lock or a second bridge signature for any admin message, etc. Hyperlane’s permissionless ethos (anyone can run a validator or deploy to a new chain) could prove powerful long-term, but it also means the ecosystem needs to mature (e.g., more third-party validators joining to decentralize the default set; as of 2025 it’s unclear how decentralized the active validator set is in practice). Overall, Hyperlane’s trajectory is one of improving security (with restaking) and ease of use, but it will need to demonstrate resilience and attract major liquidity to gain the same level of community trust as IBC or even LayerZero.

  • IBC 3.0 and Cosmos Interop in Production: IBC has been live since 2021 and is extremely battle-tested within Cosmos. By 2025, it connects 115+ zones (including Cosmos Hub, Osmosis, Juno, Cronos, Axelar, Kujira, etc.) with millions of transactions per month and multi-billion dollar token flows. It has impressively had no major security failures at the protocol level. There has been one notable IBC-related incident: in October 2022, a critical vulnerability in the IBC code (affecting all v2.0 implementations) was discovered that could have allowed an attacker to drain value from many IBC-connected chains. However, it was fixed covertly via coordinated upgrades before it was publicly disclosed, and no exploit occurred. This was a wake-up call that even formally verified protocols can have bugs. Since then, IBC has seen further auditing and hardening. The threat model for IBC mainly concerns chain security: if one connected chain is hostile or gets 51% attacked, it could try to feed invalid data to a counterparty’s light client. Mitigations include using governance to halt or close connections to chains that are insecure (Cosmos Hub governance, for example, can vote to turn off client updates for a particular chain if it’s detected broken). Also, IBC clients often have unbonding period or trusting period alignment – e.g., a Tendermint light client won’t accept a validator set update older than the unbonding period (to prevent long-range attacks). Another possible issue is relayer censorship – if no relayer delivers packets, funds could be stuck in timeouts; but because relaying is permissionless and often incentivized, this is typically transient. With IBC 3.0’s Interchain Queries and new features rolling out, we see adoption in things like cross-chain DEX aggregators (e.g., Skip Protocol using ICQ to gather price data across chains) and cross-chain governance (e.g., Cosmos Hub using interchain accounts to manage Neutron, a consumer chain). The adoption beyond Cosmos is also a story: projects like Polymer and Astria (an interop hub for rollups) are effectively bringing IBC to Ethereum rollups via a hub/spoke model, and Polkadot’s parachains have successfully used IBC to connect with Cosmos chains (e.g., Centauri bridge between Cosmos and Polkadot, built by Composable Finance, uses IBC under the hood with a GRANDPA light client on the Cosmos side). There’s even an IBC-Solidity implementation in progress by Polymer and DataChain that would allow Ethereum smart contracts to verify IBC packets (using a light client or validity proofs). If these efforts succeed, it could dramatically broaden IBC’s usage beyond Cosmos, bringing its trust-minimized model into direct competition with the more centralized bridges on those chains. In terms of shared liquidity, Cosmos’s biggest limitation was the absence of a native stablecoin or deep liquidity DEX on par with Ethereum’s – that is changing with the rise of Cosmos-native stablecoins (like IST, CMST) and the connection of assets like USDC (Axelar and Gravity bridge brought USDC, and now Circle is launching native USDC on Cosmos via Noble). As liquidity deepens, the combination of high security and seamless IBC transfers could make Cosmos a nexus for multi-chain DeFi trading – indeed, the Blockchain Capital report noted that IBC was already handling more volume than LayerZero or Wormhole by the start of 2024, albeit that’s mostly on the strength of Cosmos-to-Cosmos traffic (which suggests a very active interchain economy). Going forward, IBC’s main challenge and opportunity is expanding to heterogeneous chains without sacrificing its security ethos.

In summary, each protocol is advancing: LayerZero is rapidly integrating with many chains and applications, prioritizing flexibility and developer adoption, and mitigating risks by enabling apps to be part of their own security. Hyperlane is innovating with restaking and modularity, aiming to be the easiest way to connect new chains with configurable security, though it’s still building trust and usage. IBC is the gold standard in trustlessness within its domain, now evolving to be faster (IBC 3.0) and hoping to extend its domain beyond Cosmos, backed by a strong track record. Users and projects are wise to consider the maturity and security incidents of each: IBC has years of stable operation (and huge volume) but limited to certain ecosystems; LayerZero has quickly amassed usage but requires understanding custom security settings; Hyperlane is newer in execution but promising in vision, with careful steps toward economic security.

Conclusion and Outlook: Interoperability Architecture for the Multi-Chain Future

The long-term viability and interoperability of the multi-chain DeFi landscape will likely be shaped by all three security models co-existing and even complementing each other. Each approach has clear strengths, and rather than a one-size-fits-all solution, we may see a stack where the light client model (IBC) provides the highest assurance base for key routes (especially among major chains), while proof-aggregated systems (LayerZero) provide universal connectivity with customizable trust, and multisig models (Hyperlane and others) serve niche needs or bootstrap new ecosystems quickly.

Security vs. Connectivity Trade-off: Light clients like IBC offer the closest thing to a “blockchain internet” – a neutral, standardized transport layer akin to TCP/IP. They ensure that interoperability doesn’t introduce new weaknesses, which is critical for long-term sustainability. However, they require broad agreement on standards and significant engineering per chain, which slows down how fast new connections can form. LayerZero and Hyperlane, on the other hand, prioritize immediate connectivity and flexibility, acknowledging that not every chain will implement the same protocol. They aim to connect “any to any,” even if that means accepting a bit more trust in the interim. Over time, we can expect the gap to narrow: LayerZero can incorporate more trust-minimized DVNs (even IBC itself could be wrapped in a DVN), and Hyperlane can use economic mechanisms to approach the security of native verification. Indeed, the Polymer project envisions that IBC and LayerZero need not be competitors but can be layered – for example, LayerZero could use an IBC light client as one of its DVNs when available. Such cross-pollination is likely as the space matures.

Composability and Unified Liquidity: From a DeFi user’s perspective, the ultimate goal is that liquidity becomes chain-agnostic. We’re already seeing steps: with omnichain tokens (OFTs) you don’t worry which chain your token version is on, and with cross-chain money markets you can borrow on any chain against collateral on another. The architectural choices directly affect user trust in these systems. If a bridge hack occurs (as happened with some multisig bridges historically), it fractures confidence and thus liquidity – users retreat to safer venues or demand risk premiums. Thus, protocols that consistently demonstrate security will underpin the largest pools of liquidity. Cosmos’s interchain security and IBC have shown one path: multiple order-books and AMMs across zones essentially compose into one large market because transfers are trustless and quick. LayerZero’s Stargate showed another: a unified liquidity pool can service many chains’ transfers, but it required users to trust LayerZero’s security assumption (Oracle+Relayer or DVNs). As LayerZero v2 lets each pool set even higher security (e.g. use multiple big-name validator networks to verify every transfer), it’s reducing the trust gap. The long-term viability of multi-chain DeFi likely hinges on interoperability protocols being invisible yet reliable – much like internet users don’t think about TCP/IP, crypto users shouldn’t have to worry about which bridge or messaging system a dApp uses. That will happen when security models are robust enough that failures are exceedingly rare and when there’s some convergence or composability between these interoperability networks.

Interoperability of Interoperability: It’s conceivable that in a few years, we won’t talk about LayerZero vs Hyperlane vs IBC as separate realms, but rather a layered system. For example, an Ethereum rollup could have an IBC connection to a Cosmos hub via Polymer, and that Cosmos hub might have a LayerZero endpoint as well, allowing messages to transit from the rollup into LayerZero’s network through a secure IBC channel. Hyperlane could even function as a fallback or aggregation: an app could require both an IBC proof and a Hyperlane AVS signature for ultimate assurance. This kind of aggregation of security across protocols could address even the most advanced threat models (it’s much harder to simultaneously subvert an IBC light client and an independent restaked multisig, etc.). Such combinations will of course add complexity and cost, so they’d be reserved for high-value contexts.

Governance and Decentralization: Each model puts differing power in the hands of different actors – IBC in the hands of chain governance, LayerZero in the hands of app developers (and indirectly, the DVN operators they choose), and Hyperlane in the hands of the bridge validators and possibly restakers. The long-term interoperable landscape will need to ensure no single party or cartel can dominate cross-chain transactions. This is a risk, for instance, if one protocol becomes ubiquitous but is controlled by a small set of actors; it could become a chokepoint (analogous to centralized internet service providers). The way to mitigate that is by decentralizing the messaging networks themselves (more relayers, more DVNs, more validators – all permissionless to join) and by having alternative paths. On this front, IBC has the advantage of being an open standard with many independent teams, and LayerZero and Hyperlane are both moving to increase third-party participation (e.g. anyone can run a LayerZero DVN or Hyperlane validator). It’s likely that competition and open participation will keep these services honest, much like miners/validators in L1s keep the base layer decentralized. The market will also vote with its feet: if one solution proves insecure or too centralized, developers can migrate to another (especially as bridging standards become more interoperable themselves).

In conclusion, the security architectures of LayerZero v2, Hyperlane, and IBC 3.0 each contribute to making the multi-chain DeFi vision a reality, but with different philosophies. Light clients prioritize trustlessness and neutrality, multisigs prioritize pragmatism and ease of integration, and aggregated approaches prioritize customization and adaptability. The multi-chain DeFi landscape of the future will likely use a combination of these: critical infrastructure and high-value transfers secured by trust-minimized or economically-secured methods, and flexible middleware to connect to the long tail of new chains and apps. With these in place, users will enjoy unified liquidity and cross-chain composability with the same confidence and ease as using a single chain. The path forward is one of convergence – not necessarily of the protocols themselves, but of the outcomes: a world where interoperability is secure, seamless, and standard. Achieving that will require continued rigorous engineering (to avoid exploits), collaborative governance (to set standards like IBC or universal contract interfaces), and perhaps most importantly, an iterative approach to security that blends the best of all worlds: math, economic incentives, and intelligent design. The end-state might truly fulfill the analogy often cited: blockchains interconnected like networks on the internet, with protocols like LayerZero, Hyperlane, and IBC forming the omnichain highway that DeFi will ride on for the foreseeable future.

Sources:

  • LayerZero v2 architecture and DVN security – LayerZero V2 Deep Dive; Flare x LayerZero V2 announcement
  • Hyperlane multisig and modular ISM – Hyperlane Docs: Validators; Tiger Research on Hyperlane; Hyperlane restaking (AVS) announcement
  • IBC 3.0 light clients and features – IBC Protocol Overview; 3Commas Cosmos 2025 (IBC 3.0)
  • Comparison of trust assumptions – Nosleepjohn (Hyperlane) on bridge tradeoffs; IBC vs bridges (Polymer blog)
  • DeFi examples (Stargate, ICA, etc.) – Flare blog on LayerZero (Stargate volume); IBC use cases (Stride liquid staking); LayerZero Medium (OFT and OApp standards); Hyperlane use cases
  • Adoption and stats – Flare x LayerZero (cross-chain messages, volume); Range.org on IBC volume; Blockchain Capital on IBC vs bridges; LayerZero blog (15+ DVNs); IBC testimonials (Osmosis, etc.).

Formal Verification of Smart Contracts and AI-Assisted Auditing

· 39 min read
Dora Noda
Software Engineer

Formal Verification in Smart Contract Auditing

Formal verification refers to the use of mathematical and logic-based techniques to prove the correctness and security of smart contracts. In practice, this encompasses a spectrum of methodologies: from property-based fuzz testing and symbolic execution, to rigorous theorem proving and model checking. The goal is to ensure a contract meets its specifications and contains no exploitable bugs across all possible inputs and states. Given the high stakes (billions of dollars are locked in DeFi protocols), formal methods have become increasingly important for Ethereum and other blockchain platforms.

Traditional Approaches: Early formal methods for Ethereum included symbolic execution tools like Oyente and Mythril, and static analyzers like Slither and Securify. Symbolic execution explores program paths with symbolic inputs to detect issues (e.g. reentrancy, integer overflow), while static analysis uses rule-based pattern matching. These tools have had real success but also clear limitations: Oyente, for example, raised many false alarms even on simple contracts, and Slither's pattern-based detectors can still produce false positives. Moreover, a 2023 study found that over 80% of exploitable contract bugs (especially complex "business logic" bugs) were missed by current tools, underscoring the need for more robust verification techniques.

The Promise and Challenge of Full Verification: In theory, formal verification can prove the absence of bugs by exhaustively checking invariants for all states. Tools like the Certora Prover or the Ethereum Foundation’s KEVM framework aim to mathematically verify smart contracts against a formal specification. For example, Certora’s “automated mathematical auditor” uses a specification language (CVL) to prove or refute user-defined rules. In practice, however, fully proving properties on real-world contracts is often unattainable or very labor-intensive. Code may need to be rewritten into simplified forms for verification, custom specs must be written, loops and complex arithmetic might require manual bounds or abstractions, and SMT solvers frequently time out on complex logic. As Trail of Bits engineers noted, “proving the absence of bugs is typically unattainable” on non-trivial codebases, and achieving it often requires heavy user intervention and expertise. Because of this, formal verification tools have traditionally been used sparingly for critical pieces of code (e.g. verifying a token’s invariant or a consensus algorithm), rather than entire contracts end-to-end.

Foundry’s Fuzz Testing and Invariant Testing

In recent years, property-based testing has emerged as a practical alternative to full formal proofs. Foundry, a popular Ethereum development framework, has built-in support for fuzz testing and invariant testing – techniques that greatly enhance test coverage and can be seen as lightweight formal verification. Foundry’s fuzz testing automatically generates large numbers of inputs to try to violate specified properties, and invariant testing extends this to sequences of state-changing operations:

  • Fuzz Testing: Instead of writing unit tests for specific inputs, the developer specifies properties or invariants that should hold for any input. Foundry then generates hundreds or thousands of random inputs to test the function and checks that the property always holds. This helps catch edge cases that a developer might not manually think to test. For example, a fuzz test might assert that a function’s return value is always non-negative or that a certain post-condition is true regardless of inputs. Foundry’s engine uses intelligent heuristics – it analyzes function signatures and introduces edge-case values (0, max uint, etc.) – to hit corner cases likely to break the property. If an assertion fails, Foundry reports a counterexample input that violates the property.

  • Invariant Testing: Foundry’s invariant testing (also called stateful fuzzing) goes further by exercising multiple function calls and state transitions in sequence. The developer writes invariant functions that should hold true throughout the contract’s lifetime (e.g. total assets = sum of user balances). Foundry then randomly generates sequences of calls (with random inputs) to simulate many possible usage scenarios, periodically checking that the invariant conditions remain true. This can uncover complex bugs that manifest only after a particular sequence of operations. Essentially, invariant testing explores the contract’s state space more thoroughly, ensuring that no sequence of valid transactions can violate the stated properties.
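
A minimal sketch of the invariant pattern just described, assuming a hypothetical toy ledger (the contract names are illustrative; targetContract, bound, and the invariant_ naming convention are standard Foundry/forge-std): a handler performs random deposits and withdrawals while keeping an independent "ghost" tally, and the invariant asserts the two stay in sync:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

import "forge-std/Test.sol";

// Hypothetical toy ledger used only to illustrate the pattern.
contract Ledger {
    mapping(address => uint256) public balanceOf;
    uint256 public totalDeposits;

    function deposit(address user, uint256 amount) external {
        balanceOf[user] += amount;
        totalDeposits += amount;
    }

    function withdraw(address user, uint256 amount) external {
        balanceOf[user] -= amount; // reverts on underflow in ^0.8
        totalDeposits -= amount;
    }
}

// The handler wraps the ledger and keeps "ghost" accounting for the invariant.
contract Handler is Test {
    Ledger public ledger;
    uint256 public ghostSum; // independent tally of everything deposited

    constructor(Ledger _ledger) {
        ledger = _ledger;
    }

    function deposit(address user, uint256 amount) external {
        amount = bound(amount, 0, 1e30); // keep values in a sane range
        ledger.deposit(user, amount);
        ghostSum += amount;
    }

    function withdraw(address user, uint256 amount) external {
        amount = bound(amount, 0, ledger.balanceOf(user));
        ledger.withdraw(user, amount);
        ghostSum -= amount;
    }
}

contract LedgerInvariantTest is Test {
    Ledger ledger;
    Handler handler;

    function setUp() public {
        ledger = new Ledger();
        handler = new Handler(ledger);
        targetContract(address(handler)); // fuzz only the handler's entry points
    }

    // Foundry checks this after every randomly generated call sequence.
    function invariant_totalMatchesSum() public view {
        assertEq(ledger.totalDeposits(), handler.ghostSum());
    }
}
```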

Why Foundry Matters: Foundry has made these advanced testing techniques accessible. The fuzzing and invariant features are native to the developer workflow – no special harness or external tool is needed, and tests are written in Solidity alongside unit tests. Thanks to a Rust-based engine, Foundry can execute thousands of tests quickly (parallelizing them) and provide detailed failure traces for any invariant violation. Developers report that Foundry’s fuzzer is easy to use and highly performant, requiring only minor configuration (e.g. setting the number of iterations or adding assumptions to constrain inputs). A simple example from Foundry’s documentation is a fuzz test for a divide(a,b) function, which uses vm.assume(b != 0) to avoid trivial invalid inputs and then asserts mathematical post-conditions like result * b <= a. By running such a test with thousands of random (a,b) pairs, Foundry might quickly discover edge cases (like overflow boundaries) that would be hard to find by manual reasoning.
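
A sketch of roughly that test, assuming a trivial Divider contract (the contract itself is hypothetical; vm.assume and the assertion helpers come from forge-std):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

import "forge-std/Test.sol";

// Hypothetical contract under test.
contract Divider {
    function divide(uint256 a, uint256 b) public pure returns (uint256) {
        return a / b;
    }
}

contract DividerTest is Test {
    Divider d;

    function setUp() public {
        d = new Divider();
    }

    // Foundry calls this with many random (a, b) pairs.
    function testFuzz_Divide(uint256 a, uint256 b) public view {
        vm.assume(b != 0); // discard the trivially invalid divisor
        uint256 result = d.divide(a, b);
        // Integer-division post-conditions: the quotient times the divisor
        // never exceeds the dividend, and the remainder is below the divisor.
        assertLe(result * b, a);
        assertLt(a - result * b, b);
    }
}
```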

Comparisons: Foundry’s approach builds on prior work in the community. Trail of Bits’ Echidna fuzzer was an earlier property-based testing tool for Ethereum. Echidna similarly generates random transactions to find violations of invariants expressed as Solidity functions returning a boolean. It’s known for “intelligent” input generation (incorporating coverage-guided fuzzing) and has been used to find many bugs. In fact, security researchers note that Echidna’s engine is extremely effective – “the Trail of Bits Echidna is the best fuzzer out there due to its intelligent random number selection” – though Foundry’s integrated workflow makes writing tests simpler for developers. In practice, Foundry’s fuzz testing is often regarded as the new “bare minimum” for secure Solidity development, complementing traditional unit tests. It cannot prove the absence of bugs (since it’s randomized and not exhaustive), but it greatly increases confidence by covering a vast range of inputs and state combinations.
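
For comparison, an Echidna property is simply a zero-argument Solidity function prefixed echidna_ that returns a bool; the toy token below is hypothetical and exists only to show the format:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical toy token.
contract Token {
    mapping(address => uint256) public balanceOf;
    uint256 public constant TOTAL = 1_000_000;

    constructor() {
        balanceOf[msg.sender] = TOTAL;
    }

    function transfer(address to, uint256 amount) external {
        balanceOf[msg.sender] -= amount;
        balanceOf[to] += amount;
    }
}

// Echidna fuzzes random transaction sequences against the contract and
// reports any sequence that makes a property function return false.
contract TokenEchidnaTest is Token {
    function echidna_no_balance_exceeds_total() public view returns (bool) {
        return balanceOf[msg.sender] <= TOTAL;
    }
}
```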

Beyond Fuzzing: Formal Proofs and Advanced Tools

While fuzzing and invariants catch many issues, there are cases where stronger formal methods are used. Model checking and theorem proving involve specifying desired properties in a formal logic and using automated provers to check them against the contract code. Certora Prover (recently open-sourced) is a prominent tool in this category. It allows developers to write rules in a domain-specific language (CVL) and then automatically checks the contract for violations of those rules. Certora has been used to verify critical invariants in protocols like MakerDAO and Compound; for instance, it identified the “DAI debt invariant” bug in MakerDAO (a subtle accounting inconsistency) that went unnoticed for four years. Notably, Certora’s engine now supports multiple platforms (EVM, Solana’s VM, and eWASM), and by open-sourcing it in 2025, it made industrial-grade formal verification freely available on Ethereum, Solana, and Stellar. This move recognizes that formal proofs should not be a luxury for only well-funded teams.

Other formal tools include runtime verification approaches (e.g. the KEVM semantics of Ethereum, or the Move Prover for Move-based chains). The Move Prover in particular is built into the Move language (used by Aptos and Sui blockchains). It allows writing formal specifications alongside the code and can automatically prove certain properties with a user experience similar to a linter or type-checker. This tight integration lowers the barrier for developers on those platforms to use formal verification as part of development.

Summary: Today’s smart contract auditing blends these methodologies. Fuzzing and invariant testing (exemplified by Foundry and Echidna) are widely adopted for their ease of use and effectiveness in catching common bugs. Symbolic execution and static analyzers still serve to quickly scan for known vulnerability patterns (with tools like Slither often integrated into CI pipelines). Meanwhile, formal verification tools are maturing and expanding across chains, but they are typically reserved for specific critical properties or used by specialized auditing firms due to their complexity. In practice, many audits combine these approaches: e.g. using fuzzers to find runtime errors, static tools to flag obvious mistakes, and formal spec checks for key invariants like “no token balance exceeds total supply”.

AI-Assisted Auditing with Large Language Models

The advent of large language models (LLMs) like OpenAI’s GPT-3/4 (ChatGPT) and Codex has introduced a new paradigm for smart contract analysis. These models, trained on vast amounts of code and natural language, can reason about program behavior, explain code, and even detect certain vulnerabilities by pattern recognition and “common sense” knowledge. The question is: can AI augment or even automate smart contract auditing?

LLM-Based Vulnerability Analysis Tools

Several research efforts and prototype tools have emerged that apply LLMs to smart contract auditing:

  • OpenAI Codex / ChatGPT (general models): Early experiments simply involved prompting GPT-3 or Codex with contract code and asking for potential bugs. Developers found that ChatGPT could identify some well-known vulnerability patterns and even suggest fixes. However, responses were hit-or-miss and not reliably comprehensive. A recent academic evaluation noted that naive prompting of ChatGPT for vulnerability detection “did not yield significantly better outcomes compared to random prediction” on a benchmark dataset – essentially, if the model is not guided properly, it may hallucinate issues that aren’t there or miss real vulnerabilities. This highlighted the need for carefully engineered prompts or fine-tuning to get useful results.

  • AuditGPT (2024): This is an academic tool that leveraged ChatGPT (GPT-3.5/4) specifically to check ERC standard compliance in Ethereum contracts. The researchers observed that many ERC20/ERC721 token contracts violate subtle rules of the standard (which can lead to security or compatibility issues). AuditGPT works by breaking down the audit into small tasks and designing specialized prompts for each rule type. The result was impressive: in tests on real contracts, AuditGPT “successfully pinpoints 418 ERC rule violations and only reports 18 false positives”, demonstrating high accuracy. In fact, the paper claims AuditGPT outperformed a professional auditing service in finding ERC compliance bugs, at lower cost. This suggests that for well-defined, narrow domains (like enforcing a list of standard rules), an LLM with good prompts can be remarkably effective and precise.

  • LLM-SmartAudit (2024): Another research project, LLM-SmartAudit, takes a broader approach by using a multi-agent conversation system to audit Solidity code. It sets up multiple specialized GPT-3.5/GPT-4 agents (e.g. one "Auditor" focusing on correctness, one "Attacker" thinking about how to exploit the code) that talk to each other to analyze a contract. In their evaluation, LLM-SmartAudit detected a wide range of vulnerabilities. Notably, in a comparative test against conventional tools, the GPT-3.5-based system achieved the highest overall recall (74%), outperforming all the traditional static and symbolic analyzers they tested (the next best was Mythril at 54% recall). It was also able to find all 10 vulnerability types in their test suite (whereas each traditional tool struggled with some categories). Moreover, by switching to GPT-4 and focusing the analysis (what they call Targeted Analysis mode), the system could detect complex logical flaws that tools like Slither and Mythril completely missed. For instance, on a set of real-world DeFi contracts, the LLM-based approach found hundreds of logic bugs whereas the static tools "demonstrated no effectiveness in detecting" those issues. These results showcase the potential of LLMs to catch subtle bugs that are beyond the pattern-matching scope of traditional analyzers.

  • Prometheus (PromFuzz) (2023): A hybrid approach is to use LLMs to guide other techniques. One example is PromFuzz, which uses a GPT-based “auditor agent” to identify suspect areas in the code, then automatically generates invariant checkers and feeds them into a fuzzing engine. Essentially, the LLM does a first-pass analysis (both from a benign and attacker perspective) to focus the fuzz testing on likely vulnerabilities. In evaluations, this approach achieved very high bug-finding rates – e.g. over 86% recall with zero false positives in certain benchmark sets – significantly outperforming standalone fuzzers or previous tools. This is a promising direction: using AI to orchestrate and enhance classical techniques like fuzzing, combining the strengths of both.

  • Other AI Tools: The community has seen various other AI-assisted auditing concepts. For example, Trail of Bits’ “Toucan” project integrated OpenAI Codex into their audit workflow (more on its findings below). Some security startups are also advertising AI auditors (e.g. “ChainGPT Audit” services), typically wrapping GPT-4 with custom prompts to review contracts. Open-source enthusiasts have created ChatGPT-based audit bots on forums that will take a Solidity snippet and output potential issues. While many of these are experimental, the general trend is clear: AI is being explored at every level of the security review process, from automated code explanation and documentation generation to vulnerability detection and even suggesting fixes.

Capabilities and Limitations of LLM Auditors

LLMs have demonstrated notable capabilities in smart contract auditing:

  • Broad Knowledge: An LLM like GPT-4 has been trained on a vast corpus of code and vulnerability write-ups. It "knows" about common security pitfalls (reentrancy, integer overflow, etc.) and even some historical exploits. This allows it to recognize patterns that might indicate a bug, and to recall best practices from documentation. For example, it might remember that ERC-20 transferFrom should check allowances, and flag the absence of such a check as a violation (see the short snippet after this list). Unlike a static tool that only flags known patterns, an LLM can apply reasoning to novel code and infer problems that weren't explicitly coded into it by a tool developer.

  • Natural Explanations: AI auditors can provide human-readable explanations of potential issues. This is extremely valuable for developer experience. Traditional tools often output cryptic warnings that require expertise to interpret, whereas a GPT-based tool can generate a paragraph explaining the bug in plain English and even suggest a remediation. AuditGPT, for instance, not only flagged ERC rule violations but described why the code violated the rule and what the implications were. This helps in onboarding less-experienced developers to security concepts.

  • Flexibility: With prompt engineering, LLMs can be directed to focus on different aspects or follow custom security policies. They are not limited by a fixed set of rules – if you can describe a property or pattern in words, the LLM can attempt to check it. This makes them attractive for auditing new protocols that might have unique invariants or logic (where writing a custom static analysis from scratch would be infeasible).
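
As an illustration of the allowance example above, the following deliberately flawed (and entirely hypothetical) ERC-20 fragment shows the kind of omission an LLM can flag from general knowledge of the standard, without anyone having written a dedicated detector for it:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical, deliberately broken token for illustration.
contract BrokenToken {
    mapping(address => uint256) public balanceOf;
    mapping(address => mapping(address => uint256)) public allowance;

    event Transfer(address indexed from, address indexed to, uint256 value);

    function transferFrom(address from, address to, uint256 amount) external returns (bool) {
        // BUG: the caller's allowance is never checked or decremented,
        // so any address can move any other address's tokens.
        balanceOf[from] -= amount;
        balanceOf[to] += amount;
        emit Transfer(from, to, amount);
        return true;
    }
}
```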

However, significant challenges and reliability issues have been observed:

  • Reasoning Limitations: Current LLMs sometimes struggle with the complex logical reasoning required for security analysis. Trail of Bits reported that “the models are not able to reason well about certain higher-level concepts, such as ownership of contracts, re-entrancy, and inter-function relationships”. For example, GPT-3.5/4 might understand what reentrancy is in theory (and even explain it), but it may fail to recognize a reentrancy vulnerability if it requires understanding a cross-function sequence of calls and state changes. The model might also miss vulnerabilities that involve multi-contract interactions or time-dependent logic, because those go beyond the scope of a single code snippet analysis.

  • False Positives and Hallucinations: A major concern is that LLMs can produce confident-sounding yet incorrect conclusions. In auditing, a “hallucination” might be flagging a vulnerability that isn’t actually there, or misinterpreting the code. Trail of Bits’ experiment with Codex (GPT) found that as they scaled to larger real-world contracts, “the false positive and hallucination rates [became] too high,” to the point that it would slow down auditors with too many spurious reports. This unpredictability is problematic – a tool that cries wolf too often will not be trusted by developers. AuditGPT’s success in keeping false positives low (only 18 out of hundreds of findings) is encouraging, but that was in a constrained domain. In general-purpose use, careful prompt design and maybe human review loops are needed to filter AI findings.

  • Context Limitations: LLMs have a context window limitation, meaning they can only “see” a certain amount of code at once. Complex contracts often span multiple files and thousands of lines. If an AI can’t ingest the whole codebase, it might miss important connections. Techniques like code slicing (feeding the contract in chunks) are used, but they risk losing the broader picture. The LLM-SmartAudit team noted that with GPT-3.5’s 4k token limit, they could not analyze some large real-world contracts until they switched to GPT-4 with a larger context. Even then, dividing analysis into parts can cause it to overlook bugs that manifest only when considering the system as a whole. This makes AI analysis of entire protocols (with multiple interacting contracts) a real challenge as of now.

  • Integration and Tooling: There is a lack of robust developer tooling around AI auditors. Running an LLM analysis is not as straightforward as running a linter. It involves API calls to a model, handling prompt engineering, rate limits, and parsing the AI’s responses. “The software ecosystem around integrating LLMs with traditional software is too crude and everything is cumbersome”, as one auditing team put it. There are virtually no off-the-shelf frameworks for continuously deploying an AI auditor in CI pipelines while managing its uncertainties. This is slowly improving (some projects are exploring CI bots that use GPT-4 for code review), but it’s early. Moreover, debugging why an AI gave a certain result is difficult – unlike deterministic tools, you can’t easily trace the logic that led the model to flag or miss something.

  • Cost and Performance: Using large models like GPT-4 is expensive and can be slow. If you want to incorporate an AI audit into a CI/CD pipeline, it might add several minutes per contract and incur significant API costs for large code. Fine-tuned models or open-source LLMs could mitigate cost, but they tend to be less capable than GPT-4. There’s ongoing research into smaller, specialized models for code security, but at present the top results have come from the big proprietary models.

In summary, LLM-assisted auditing is promising but not a silver bullet. We are seeing hybrid approaches where AI helps generate tests or analyze specific aspects, rather than doing the entire audit end-to-end. For instance, an AI might suggest potential invariants or risky areas, which a human or another tool then investigates. As one security engineer remarked, the technology is not yet ready to replace human auditors for critical tasks, given the reasoning gaps and integration hurdles. However, it can already be a useful assistant – “something imperfect may be much better than nothing” in cases where traditional tools fall short.

Accuracy and Reliability of Different Toolchains

It is instructive to compare the accuracy, coverage, and reliability of the various auditing approaches discussed. Below is a summary of findings from research and industry evaluations:

  • Static Analysis Tools: Static analyzers like Slither are valued for quick feedback and ease of use. They typically have high precision but moderate recall – meaning that most of the issues they report are true problems (few false positives by design), but they only detect certain classes of vulnerabilities. A 2024 benchmarking study (LLM-SmartAudit’s RQ1) found Slither caught about 46% of the known vulnerabilities in a test suite. Mythril (symbolic execution) did slightly better at 54% recall. Each tool had strengths in particular bug types (e.g. Slither is very good at spotting arithmetic issues or usage of tx.origin, while symbolic tools excel at finding simple reentrancy scenarios), but none had comprehensive coverage. False positives for mature tools like Slither are relatively low – its developers advertise “minimal false alarms and speedy execution (under 1s per contract)”, making it suitable for CI use. Nonetheless, static tools can misfire if code uses complex patterns; they might flag edge cases that are actually handled or miss deep logic bugs that don’t match any known anti-pattern.

  • Fuzzing and Property Testing: Fuzzers like Foundry’s fuzz/invariant tests or Trail of Bits’ Echidna have proven very effective at finding runtime errors and invariant violations. These tools tend to have near zero false positives in the sense that if a bug is reported (an assertion failed), it’s a real counterexample execution. The trade-off is in false negatives – if a bug doesn’t manifest within the tested input space or number of runs, it can slip by. Coverage depends on how well the fuzzer explores the state space. With enough time and good heuristics, fuzzers have found many high-severity bugs that static analysis missed. For example, Echidna was able to quickly reproduce the MakerDAO and Compound bugs that took formal verification efforts to find. However, fuzzing is not guaranteed to find every logic mistake. As contracts get more complex, fuzzers might require writing additional invariants or using smarter search strategies to hit deeper states. In terms of metrics, fuzzing doesn’t have a single “recall” number, but anecdotal evidence shows that important invariants can usually be broken by a fuzzer if they are breakable. The reliability is high for what it finds (no manual triage needed for false reports), but one must remember a passed fuzz test is not a proof of correctness – just an increase in confidence.

  • Formal Verification Tools: When applicable, formal verification provides the highest assurance – a successful proof means 100% of states satisfy the property. In terms of accuracy, it’s effectively perfect (sound and complete) for the properties it can prove. The biggest issue here is not the accuracy of results but the difficulty of use and narrow scope. Formal tools can also have false negatives in practice: they might simply be unable to prove a true property due to complexity (yielding no result or a timeout, which isn’t a false positive, but it means we fail to verify something that is actually safe). They can also have specification errors, where the tool “proves” something that wasn’t the property you intended (user error). In real audits, formal methods have caught some critical bugs (Certora’s successes include catching a subtle SushiSwap bug and a PRBMath library issue before deployment). But their track record is limited by how rarely they are comprehensively applied – as Trail of Bits noted, it was “difficult to find public issues discovered solely through formal verification, in contrast with the many bugs found by fuzzing”. So, while formal verification is extremely reliable when it succeeds, its impact on overall toolchain coverage is constrained by the effort and expertise required.

  • LLM-Based Analysis: The accuracy of AI auditors is currently a moving target, as new techniques (and newer models) are rapidly pushing the envelope. We can glean a few data points:

    • The AuditGPT system, focused on ERC rules, achieved very high precision (≈96%, judging by its false-positive count) and found hundreds of issues that static tools or humans overlooked. But this was in a narrow domain with structured prompts. We should not generalize that ChatGPT will be 96% accurate at arbitrary vulnerability hunting – outside of a controlled setup, its performance is lower.
    • In broader vulnerability detection, LLM-SmartAudit (GPT-3.5) showed ~74% recall on a benchmark with moderate false positive rate, which is better than any single traditional tool. When upgraded to GPT-4 with specialized prompting (TA mode), it significantly improved – for example, on a set of 1,400 real-world vulnerabilities, the GPT-4 agent found ~48% of the specific issues and ~47% of the complex logic issues, whereas Slither/Mythril/Conkas each found ~0% (none) of those particular complex issues. This demonstrates that LLMs can dramatically expand coverage to types of bugs that static analysis completely misses. On the flip side, the LLM did not find over half of the issues (so it also has substantial false negatives), and it’s not clear how many false positives were among the ones it reported – the study focused on recall more than precision.
    • Trail of Bits’ Codex/GPT-4 “Toucan” experiment is illustrative of reliability problems. Initially, on small examples, Codex could identify known vulnerabilities (ownership issues, reentrancy) with careful prompting. But as soon as they tried scaling up, they encountered inconsistent and incorrect results. They reported “the number of failures was high even in average-sized code” and difficult to predict. Ultimately, they concluded that GPT-4 (as of early 2023) was only an incremental improvement and still “missing key features” like reasoning about cross-function flows. The outcome was that the AI did not materially reduce false positives from their static tools, nor did it reliably speed up their audit workflow. In other words, the current reliability of a general LLM as an auditor was deemed insufficient by professional auditors in that trial.

To sum up, each toolchain has different strengths:

  • Static tools: Reliable for quick detection of known issues; low noise, but limited bug types (medium recall ~40–50% on benchmarks).
  • Fuzz/invariant testing: Very high precision (almost no false alerts) and strong at finding functional and state-dependent bugs; coverage can be broad but not guaranteed (no simple metric, depends on time and invariant quality). Often finds the same bugs formal proofs would if given enough guidance.
  • Formal verification: Gold standard for absolute correctness on specific properties; essentially 100% recall/precision for those properties if successfully applied. But practically limited to narrow problems and requires significant effort (not a one-button solution for full audits yet).
  • AI (LLM) analysis: Potentially high coverage – can find bugs across categories including those missed by other tools – but variable precision. With specialized setups, it can be both precise and broad (as AuditGPT showed for ERC checks). Without careful constraints, it may cast a wide net and require human vetting of results. The “average” accuracy of ChatGPT on vulnerability detection is not spectacular (close to guessing, in one study), but the best-case engineered systems using LLMs are pushing performance beyond traditional tools. There’s active research to make AI outputs more trustworthy (e.g. by having multiple agents cross-verify, or combining LLM with static reasoning to double-check AI conclusions).

It’s worth noting that combining approaches yields the best results. For example, one might run Slither (to catch low-hanging fruit with no noise), then use Foundry/Echidna to fuzz deeper behaviors, and perhaps use an LLM-based tool to scan for any patterns or invariants not considered. Each will cover different blind spots of the others.

Real-World Adoption Challenges and Limitations

In practice, adopting formal verification or AI tools in a development workflow comes with pragmatic challenges. Some key issues include:

  • Developer Experience and Expertise: Traditional formal verification has a steep learning curve. Writing formal specs (in CVL, Coq, Move’s specification language, etc.) is more like writing mathematical proofs than writing code. Many developers lack this background, and formal methods experts are in short supply. By contrast, fuzzing with Foundry or writing invariants in Solidity is much more accessible – it feels like writing additional tests. This is a big reason fuzz testing has seen faster uptake than formal proofs in the Ethereum community. The Trail of Bits team explicitly noted that fuzzing “produces similar results but requires significantly less skill and time” than formal methods in many cases. Thus, even though formal verification can catch different bugs, many teams opt for the easier approach that gets good enough results with lower effort.

  • Integration into Development Workflow: For a tool to be widely adopted, it needs to fit into CI/CD pipelines and everyday coding. Tools like Slither shine here – it “easily integrates into CI/CD setups, streamlining automation and aiding developers.” A developer can add Slither or Mythril to a GitHub Action and have it fail the build if issues are found. Foundry’s fuzz tests can be run as part of forge test every time. Invariant tests can even be run continuously in the cloud with tools like CloudExec, and any failure can automatically be converted into a unit test using fuzz-utils. These integrations mean developers get quick feedback and can iterate. In contrast, something like the Certora Prover historically was run as a separate process (or by an external auditing team) and might take hours to produce a result – not something you’d run on every commit. AI-based tools face integration hurdles too: calling an API and handling its output deterministically in CI is non-trivial. There’s also the matter of security and privacy – many organizations are uneasy about sending proprietary contract code to a third-party AI service for analysis. Self-hosted LLM solutions are not yet as powerful as the big cloud APIs, so this is a sticking point for CI adoption of AI auditors.

  • False Positives and Noise: A tool that reports many false positives or low-severity findings can erode developer trust. Static analyzers have struggled with this in the past – e.g., early versions of some tools would flood users with warnings, many of which were irrelevant. The balance between signal and noise is crucial. Slither is praised for minimal false alarms, whereas a tool like Securify (in its research form) often produced many warnings that required manual filtering. LLMs, as discussed, can generate noise if not properly targeted. This is why AI suggestions are currently treated as advisory, not authoritative. Teams are more likely to adopt a noisy tool if it's run by a separate security team or in an audit context, but for day-to-day use, developers prefer tools that give clear, actionable results with low noise. The ideal is to "fail the build" only on definite bugs, not on hypotheticals. Achieving that reliability is an ongoing challenge, especially for AI tools.

  • Scalability and Performance: Formal verification can be computationally intensive. As noted, solvers might time out on complex contracts. This makes it hard to scale to large systems. If verifying one property takes hours, you won’t be checking dozens of properties on each code change. Fuzz testing also faces scalability limits – exploring a huge state space or a contract with many methods combinatorially can be slow (though in practice fuzzers can run in the background or overnight to deepen their search). AI models have fixed context sizes and increasing a model’s capacity is expensive. While GPT-4’s 128k-token context is a boon, feeding it hundreds of kilobytes of contract code is costly and still not enough for very large projects (imagine a complex protocol with many contracts – you might exceed that). There’s also multi-chain scaling: if your project involves interactions between contracts on different chains (Ethereum ↔ another chain), verifying or analyzing that cross-chain logic is even more complex and likely beyond current tooling.

  • Human Oversight and Verification: At the end of the day, most teams still rely on expert human auditors for final sign-off. Automated tools are aids, not replacements. One limitation of all these tools is that they operate within the bounds of known properties or patterns. They might miss a totally novel type of bug or a subtle economic flaw in a DeFi protocol’s design. Human auditors use intuition and experience to consider how an attacker might approach the system, sometimes in ways no tool is explicitly programmed to do. There have been cases where contracts passed all automated checks but a human later found a vulnerability in the business logic or a creative attack vector. Thus, a challenge is avoiding a false sense of security – developers must interpret the output of formal tools and AI correctly and not assume “no issues found” means the code is 100% safe.

  • Maintaining Specifications and Tests: For formal verification in particular, one practical issue is spec drift. The spec (invariants, rules, etc.) might become outdated as the code evolves. Ensuring the code and its formal spec remain in sync is a non-trivial management task. If developers are not vigilant, the proofs might “pass” simply because they’re proving something no longer relevant to the code’s actual requirements. Similarly, invariant tests must be updated as the system’s expected behavior changes. This requires a culture of invariant-driven development that not all teams have (though there is a push to encourage it).

In summary, the main limitations are people and process, rather than the raw capability of the technology. Formal and AI-assisted methods can greatly improve security, but they must be deployed in a way that fits developers’ workflows and skill sets. Encouragingly, trends like invariant-driven development (writing down critical invariants from day one) are gaining traction, and tooling is slowly improving to make advanced analyses more plug-and-play. The open-sourcing of major tools (e.g. Certora Prover) and the integration of fuzzing into popular frameworks (Foundry) are lowering barriers. Over time, we expect the gap between what “an average developer” can do and what “a PhD researcher” can do will narrow, in terms of using these powerful verification techniques.

Open-Source vs Commercial Auditing Tools

The landscape of smart contract security tools includes both open-source projects and commercial services. Each has its role, and often they complement each other:

  • Open-Source Tools: The majority of widely-used auditing tools in the Ethereum ecosystem are open-source. This includes Slither (static analyzer), Mythril (symbolic execution), Echidna (fuzzer), Manticore (symbolic/concolic execution), Securify (analyzer from ETH Zurich), Solhint/Ethlint (linters), and of course Foundry itself. Open-source tools are favored for a few reasons: (1) Transparency – developers can see how the tool works and trust that nothing malicious or hidden is occurring (important in an open ecosystem). (2) Community Contribution – rules and features get added by a broad community (Slither, for example, has many community-contributed detectors). (3) Cost – they are free to use, which is important for the many small projects/startups in Web3. (4) Integration – open tools can usually be run locally or in CI without legal hurdles, and often they can be customized or scripted for project-specific needs.

    Open-source tools have rapidly evolved. For instance, Slither’s support for new Solidity versions and patterns is continuously updated by Trail of Bits. Mythril, developed by ConsenSys, can analyze not just Ethereum but any EVM-compatible chain by working on bytecode – showing how open tools can be repurposed across chains easily. The downside of open tools is that they often come with “use at your own risk” – no official support or guarantees. They might not scale or have the polish of a commercial product’s UI. But in blockchain, even many companies use open-source as their core tools internally (sometimes with slight custom modifications).

  • Commercial Services and Tools: A few companies have offered security analysis as a product. Examples include MythX (a cloud-based scanning API by ConsenSys Diligence), Certora (which offered its prover as a service before open-sourcing it), CertiK and SlowMist (firms that have internal scanners and AI that they use in audits or offer via dashboards), and some newer entrants like Securify from ChainSecurity (the company was acquired and its tool used internally) or SolidityScan (a cloud scanner that outputs an audit report). Commercial tools often aim to provide a more user-friendly experience or integrated service. For example, MythX integrated with IDE extensions and CI plugins so that developers could send their contracts to MythX and get back a detailed report, including a vulnerability score and details to fix issues. These services typically run a combination of analysis techniques under the hood (pattern matching, symbolic execution, etc.) tuned to minimize false positives.

    The value proposition of commercial tools is usually convenience and support. They may maintain a continuously updated knowledge base of vulnerabilities and provide customer support in interpreting results. They might also handle heavy computation in the cloud (so you don’t need to run a solver locally). However, cost is a factor – many projects opt not to pay for these services, given the availability of free alternatives. Additionally, in the spirit of decentralization, some developers are hesitant to rely on closed-source services for security (the “verify, don’t trust” ethos). The open-sourcing of the Certora Prover in 2025 is a notable event: it turned what was a high-end commercial tool into a community resource. This move is expected to accelerate adoption, as now anyone can attempt to formally verify their contracts without a paid license, and the code openness will allow researchers to improve the tool or adapt it to other chains.

  • Human Audit Services: It’s worth mentioning that beyond software tools, many audits are done by professional firms (Trail of Bits, OpenZeppelin, Certik, PeckShield, etc.). These firms use a mix of the above tools (mostly open-source) and proprietary scripts. The outputs of these efforts are often kept private or only summarized in audit reports. There isn’t exactly an “open vs commercial” dichotomy here, since even commercial audit firms rely heavily on open-source tools. The differentiation is more in expertise and manual effort. That said, some firms are developing proprietary AI-assisted auditing platforms to give themselves an edge (for example, there were reports of “ChainGPT” being used for internal audits, or CertiK developing an AI called Skynet for on-chain monitoring). Those are not public tools per se, so their accuracy and methods are not widely documented.

In practice, a common pattern is open-source first, commercial optional. Teams will use open tools during development and testing (because they can integrate them easily and run as often as needed). Then, they might use a commercial service or two as an additional check before deployment – for instance, running a MythX scan to get a “second opinion” or hiring a firm that uses advanced tools like Certora to formally verify a critical component. With the lines blurring (Certora open source, MythX results often overlapping with open tools), the distinction is less about capability and more about support and convenience.

One notable cross-over is Certora’s multi-chain support – by supporting Solana and Stellar formally, they address platforms where fewer open alternatives exist (e.g. Ethereum has many open tools, Solana had almost none until recently). As security tools expand to new ecosystems, we may see more commercial offerings fill gaps, at least until open-source catches up.

Finally, it’s worth noting that open vs commercial is not adversarial here; they often learn from each other. For example, techniques pioneered in academic/commercial tools (like abstract interpretation used in Securify) eventually find their way into open tools, and conversely, data from open tool usage can guide commercial tool improvements. The end result sought by both sides is better security for the entire ecosystem.

Cross-Chain Applicability

While Ethereum has been the focal point for most of these tools (given its dominance in smart contract activity), the concepts of formal verification and AI-assisted auditing are applicable to other blockchain platforms as well. Here’s how they translate:

  • EVM-Compatible Chains: Blockchains like BSC, Polygon, Avalanche C-Chain, etc., which use the Ethereum Virtual Machine, can directly use all the same tools. A fuzz test or static analysis doesn’t care if your contract will deploy on Ethereum mainnet or on Polygon – the bytecode and source language (Solidity/Vyper) are the same. Indeed, Mythril and Slither can analyze contracts from any EVM chain by pulling the bytecode from an address (Mythril just needs an RPC endpoint). Many DeFi projects on these chains run Slither and Echidna just as an Ethereum project would. The audits of protocols on BSC or Avalanche typically use the identical toolkit as Ethereum audits. So, cross-chain in the EVM context mostly means reusing Ethereum’s toolchain.

  • Solana: Solana’s smart contracts are written in Rust (usually via the Anchor framework) and run on the BPF virtual machine. This is a very different environment from Ethereum, so Ethereum-specific tools don’t work out of the box. However, the same principles apply. For instance, one can do fuzz testing on Solana programs using Rust’s native fuzzing libraries or tools like cargo-fuzz. Formal verification on Solana had been nearly non-existent until recently. The collaboration between Certora and Solana engineers yielded an in-house verification tool for Solana programs that can prove Rust/Anchor contracts against specifications. As noted, Certora extended their prover to Solana’s VM, meaning developers can write rules about Solana program behavior and check them just like they would for Solidity. This is significant because Solana’s move-fast approach meant many contracts were launched without the kind of rigorous testing seen in Ethereum; formal tools could improve that. AI auditing for Solana is also plausible – an LLM that understands Rust could be prompted to check a Solana program for vulnerabilities (like missing ownership checks or arithmetic errors). It might require fine-tuning on Solana-specific patterns, but given Rust’s popularity, GPT-4 is fairly proficient at reading Rust code. We may soon see “ChatGPT for Anchor” style tools emerging as well.

  • Polkadot and Substrate-based Chains: Polkadot’s smart contracts can be written in Rust (compiled to WebAssembly) using the ink! framework, or you can run an EVM pallet (like Moonbeam does) which again allows Solidity. In the ink!/Wasm case, the verification tools are still nascent. One could attempt to formally verify a Wasm contract’s properties using general Wasm verification frameworks (for example, Microsoft’s Project Verona or others have looked at Wasm safety). Certora’s open-source prover also mentions support for Stellar’s WASM smart contracts, which are similar in concept to any Wasm-based chain. So it’s likely applicable to Polkadot’s Wasm contracts too. Fuzz testing ink! contracts can be done by writing Rust tests (property tests in Rust can serve a similar role). AI auditing of ink! contracts would entail analyzing Rust code as well, which again an LLM can do with the right context (though it might not know about the specific ink! macros or traits without some hints).

    Additionally, Polkadot is exploring the Move language for parallel smart contract development (as hinted in some community forums). If Move becomes used on Polkadot parachains, the Move Prover comes with it, bringing formal verification capabilities by design. Move’s emphasis on safety (resource oriented programming) and built-in prover shows a cross-chain propagation of formal methods from the start.

  • Other Blockchains: Platforms like Tezos (Michelson smart contracts) and Algorand (TEAL programs) each have had formal verification efforts. Tezos, for example, has a tool called Mi-Cho-Coq that provides a formal semantics of Michelson and allows proving properties. These are more on the academic side but show that any blockchain with a well-defined contract semantics can be subjected to formal verification. AI auditing could, in principle, be applied to any programming language – it’s a matter of training or prompting the LLM appropriately. For less common languages, an LLM might need some fine-tuning to be effective, as it may not have been pretrained on enough examples.

  • Cross-Chain Interactions: A newer challenge is verifying interactions across chains (like bridges or inter-chain messaging). Formal verification here might involve modeling multiple chains’ state and the communication protocol. This is very complex and currently beyond most tools, though specific bridge protocols have been manually analyzed for security. AI might help in reviewing cross-chain code (for instance, reviewing a Solidity contract that interacts with an IBC protocol on Cosmos), but there’s no out-of-the-box solution yet.

In essence, Ethereum’s tooling has paved the way for other chains. Many of the open-source tools are being adapted: e.g., there are efforts to create a Slither-like static analyzer for Solana (Rust), and the concepts of invariant testing are language-agnostic (property-based testing exists in many languages). The open-sourcing of powerful engines (like Certora’s for multiple VMs) is particularly promising for cross-chain security – the same platform could be used to verify a Solidity contract, a Solana program, and a Move contract, provided each has a proper specification written. This encourages a more uniform security posture across the industry.

It’s also worth noting that AI-assisted auditing will benefit all chains, since the model can be taught about vulnerabilities in any context. As long as the AI is provided with the right information (language specs, known bug patterns in that ecosystem), it can apply similar reasoning. For example, ChatGPT could be asked to audit a Solidity contract or a Move module with the appropriate prompt, and it will do both – it might even catch something like “this Move module might violate resource conservation” if it has that concept. The limitation is whether the AI was exposed to enough training data for that chain. Ethereum, being most popular, has likely biased the models to be best at Solidity. As other chains grow, future LLMs or fine-tuned derivatives could cover those as well.

Conclusion

Formal verification of smart contracts and AI-assisted auditing is a rapidly evolving field. We now have a rich toolkit: from deterministic static analyzers and fuzzing frameworks that improve code reliability, to cutting-edge AI that can reason about code in human-like ways. Formal verification, once a niche academic pursuit, is becoming more practical through better tools and integration. AI, despite its current limitations, has shown glimpses of game-changing capabilities in automating security analysis.

There is no one-size-fits-all solution yet – real-world auditing combines multiple techniques to achieve defense in depth. Foundry’s fuzz and invariant testing are already raising the bar for what is expected before deployment (catching many errors that would slip past basic tests). AI-assisted auditing, when used carefully, can act as a force multiplier for auditors, highlighting issues and verifying compliance at a scale and speed that manual review alone cannot match. However, human expertise remains crucial to drive these tools, interpret their findings, and define the right properties to check.

Moving forward, we can expect greater convergence of these approaches. For example, AI might help write formal specifications or suggest invariants (“AI, what security properties should hold for this DeFi contract?”). Fuzzing tools might incorporate AI to guide input generation more intelligently (as PromFuzz does). Formal verification engines might use machine learning to decide which lemmas or heuristics to apply. All of this will contribute to more secure smart contracts across not just Ethereum, but all blockchain platforms. The ultimate vision is a future where critical smart contracts can be deployed with high confidence in their correctness – a goal that will likely be achieved by the synergistic use of formal methods and AI assistance, rather than either alone.

Sources:

  • Foundry fuzzing and invariant testing overview
  • Trail of Bits on fuzzing vs formal verification
  • Trail of Bits on formal verification limitations
  • Patrick Collins on fuzz/invariant vs formal methods
  • Trail of Bits on invariants in audits
  • Medium (BuildBear) on Slither and Mythril usage
  • AuditGPT results (ERC compliance)
  • Trail of Bits on LLM (Codex/GPT-4) limitations
  • LLM-SmartAudit performance vs traditional tools
  • “Detection Made Easy” study on ChatGPT accuracy
  • PromFuzz (LLM+fuzz) performance
  • Certora open-source announcement (multi-chain)
  • Move Prover description (Aptos)
  • Static analysis background (Smart contract security literature)

The Copy-Paste Crime: How a Simple Habit is Draining Millions from Crypto Wallets

· 5 min read
Dora Noda
Software Engineer

When you send crypto, what’s your routine? For most of us, it involves copying the recipient's address from our transaction history. After all, nobody can memorize a 40-character string like 0x1A2b...8f9E. It's a convenient shortcut we all use.

But what if that convenience is a carefully laid trap?

A devastatingly effective scam called Blockchain Address Poisoning is exploiting this exact habit. Recent research from Carnegie Mellon University has uncovered the shocking scale of this threat. In just two years, on the Ethereum and Binance Smart Chain (BSC) networks alone, scammers have made over 270 million attack attempts, targeting 17 million victims and successfully stealing at least $83.8 million.

This isn't a niche threat; it's one of the largest and most successful crypto phishing schemes operating today. Here’s how it works and what you can do to protect yourself.


How the Deception Works 🤔

Address poisoning is a game of visual trickery. The attacker’s strategy is simple but brilliant:

  1. Generate a Lookalike Address: The attacker identifies a frequent address you send funds to. They then use powerful computers to generate a new crypto address that has the exact same starting and ending characters. Since most wallets and block explorers shorten addresses for display (e.g., 0x1A2b...8f9E), their fraudulent address looks identical to the real one at a glance.

  2. "Poison" Your Transaction History: Next, the attacker needs to get their lookalike address into your wallet's history. They do this by sending a "poison" transaction. This can be:

    • A Tiny Transfer: They send you a minuscule amount of crypto (like $0.001) from their lookalike address. It now appears in your list of recent transactions.
    • A Zero-Value Transfer: In a more cunning move, they exploit a feature in many token contracts to create a fake, zero-dollar transfer that looks like it came from you to their lookalike address. This makes the fake address seem even more legitimate, as it appears you've sent funds there before (a short code sketch after this list shows why token contracts allow this).
    • A Counterfeit Token Transfer: They create a worthless, fake token (e.g., "USDTT" instead of USDT) and fake a transaction to their lookalike address, often mimicking the amount of a previous real transaction you made.
  3. Wait for the Mistake: The trap is now set. The next time you go to pay a legitimate contact, you scan your transaction history, see what you believe is the correct address, copy it, and hit send. By the time you realize your mistake, the funds are gone. And thanks to the irreversible nature of blockchain, there's no bank to call and no way to get them back.
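
The zero-value trick in step 2 works because of how standard ERC-20 code handles an amount of zero. Here is a simplified sketch – not any specific token's code, but representative of many deployed implementations:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Simplified, representative ERC-20 transferFrom logic.
contract Token {
    mapping(address => uint256) public balanceOf;
    mapping(address => mapping(address => uint256)) public allowance;

    event Transfer(address indexed from, address indexed to, uint256 value);

    function transferFrom(address from, address to, uint256 amount) external returns (bool) {
        // With amount == 0 this check passes for ANY `from`, since every
        // allowance is >= 0. The attacker calls
        // transferFrom(victim, lookalike, 0), and the contract emits
        // Transfer(victim, lookalike, 0), which explorers and wallets then
        // display as an outgoing transfer in the victim's history.
        require(allowance[from][msg.sender] >= amount, "insufficient allowance");
        allowance[from][msg.sender] -= amount;
        balanceOf[from] -= amount;
        balanceOf[to] += amount;
        emit Transfer(from, to, amount);
        return true;
    }
}
```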


A Glimpse into a Criminal Enterprise 🕵️‍♂️

This isn't the work of lone hackers. The research reveals that these attacks are carried out by large, organized, and highly profitable criminal groups.

Who They Target

Attackers don't waste their time on small accounts. They systematically target users who are:

  • Wealthy: Holding significant balances in stablecoins.
  • Active: Conducting frequent transactions.
  • High-Value Transactors: Moving large sums of money.

A Hardware Arms Race

Generating a lookalike address is a brute-force computational task. The more characters you want to match, the exponentially harder it gets. Researchers found that while most attackers use standard CPUs to create moderately convincing fakes, the most sophisticated criminal group has taken it to another level.

This top-tier group has managed to generate addresses that match up to 20 characters of a target's address. This feat is nearly impossible with standard computers, leading researchers to conclude they are using massive GPU farms—the same kind of powerful hardware used for high-end gaming or AI research. This shows a significant financial investment, which they easily recoup from their victims. These organized groups are running a business, and business is unfortunately booming.
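
A back-of-the-envelope estimate shows why this is such a leap (assuming each hex character must match exactly and candidate addresses are generated at random): matching k fixed hex characters takes about 16^k attempts on average. The eight or so characters a typical wallet displays cost roughly 16^8 ≈ 4.3 billion tries, which is feasible on ordinary CPUs, while 20 characters cost about 16^20 = 2^80 ≈ 1.2 × 10^24 tries – a workload only plausible with industrial-scale GPU farms.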


How to Protect Your Funds 🛡️

While the threat is sophisticated, the defenses are straightforward. It all comes down to breaking bad habits and adopting a more vigilant mindset.

  1. For Every User (This is the most important part):

    • VERIFY THE FULL ADDRESS. Before you click "Confirm," take five extra seconds to manually check the entire address, character by character. Do not just glance at the first and last few digits.
    • USE AN ADDRESS BOOK. Save trusted, verified addresses to your wallet's address book or contact list. When sending funds, always select the recipient from this saved list, not from your dynamic transaction history (a minimal sketch of this habit follows this list).
    • SEND A TEST TRANSACTION. For large or important payments, send a tiny amount first. Confirm with the recipient that they have received it before sending the full sum.
  2. A Call for Better Wallets:

    • Wallet developers can help by improving user interfaces. This includes displaying more of the address by default or adding strong, explicit warnings when a user is about to send funds to an address they've only interacted with via a tiny or zero-value transfer.
  3. The Long-Term Fix:

    • Systems like the Ethereum Name Service (ENS), which allow you to map a human-readable name like yourname.eth to your address, can eliminate this problem entirely. Broader adoption is key.
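
To make the address book habit concrete, here is a minimal sketch in Python. The labels and entries are hypothetical; the point is that the full address is compared character for character, never just the ends:

ADDRESS_BOOK = {
    # Hypothetical entry, verified out-of-band and saved once.
    "alice": "0x1A2b000000000000000000000000000000008f9E",
}

def safe_recipient(label, pasted_address):
    # Release an address for sending only if it matches the saved,
    # verified entry in full; a lookalike from poisoned history shares
    # only the ends, so a full comparison catches it immediately.
    saved = ADDRESS_BOOK.get(label)
    if saved is None:
        raise ValueError(f"No verified address saved for {label!r}")
    if saved.lower() != pasted_address.lower():
        raise ValueError("Address mismatch: possible poisoning attempt")
    return saved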

In the decentralized world, you are your own bank, which also means you are your own head of security. Address poisoning is a silent but powerful threat that preys on convenience and inattention. By being deliberate and double-checking your work, you can ensure your hard-earned assets don't end up in a scammer's trap.

Ethereum's Anonymity Myth: How Researchers Unmasked 15% of Validators

· 6 min read
Dora Noda
Software Engineer

One of the core promises of blockchain technology like Ethereum is a degree of anonymity. Participants, known as validators, are supposed to operate behind a veil of cryptographic pseudonyms, protecting their real-world identity and, by extension, their security.

However, a recent research paper titled "Deanonymizing Ethereum Validators: The P2P Network Has a Privacy Issue" from researchers at ETH Zurich and other institutions reveals a critical flaw in this assumption. They demonstrate a simple, low-cost method to link a validator's public identifier directly to the IP address of the machine it's running on.

In short, Ethereum validators are not nearly as anonymous as many believe. The findings were significant enough to earn the researchers a bug bounty from the Ethereum Foundation, acknowledging the severity of the privacy leak.

How the Vulnerability Works: A Flaw in the Gossip

To understand the vulnerability, we first need a basic picture of how Ethereum validators communicate. The network consists of over a million validators who constantly "vote" on the state of the chain. These votes are called attestations, and they are broadcast across a peer-to-peer (P2P) network to all other nodes.

With so many validators, having everyone broadcast every vote to everyone else would instantly overwhelm the network. To solve this, Ethereum’s designers implemented a clever scaling solution: the network is divided into 64 distinct communication channels, known as subnets.

  • By default, each node (the computer running the validator software) subscribes to only two of these 64 subnets. Its primary job is to diligently relay all messages it sees on those two channels.
  • When a validator needs to cast a vote, its attestation is randomly assigned to one of the 64 subnets for broadcast.

This is where the vulnerability lies. Imagine a node whose job is to manage traffic for channels 12 and 13. All day, it faithfully forwards messages from just those two channels. But then, it suddenly sends you a message that belongs to channel 45.

This is a powerful clue. Why would a node handle a message from a channel it's not responsible for? The most logical conclusion is that the node itself generated that message. This implies that the validator who created the attestation for channel 45 is running on that very machine.

The researchers exploited this exact principle. By setting up their own listening nodes, they monitored the subnets from which their peers sent attestations. When a peer sent a message from a subnet it wasn't officially subscribed to, they could infer with high confidence that the peer hosted the originating validator.
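
In pseudocode-level Python, the heuristic is almost embarrassingly simple. The data structures here are hypothetical simplifications of what a listening node would record:

def infer_hosted_validators(peer_subscribed_subnets, observed_attestations):
    # peer_subscribed_subnets: set of subnet IDs (0-63) the peer advertises.
    # observed_attestations: (validator_index, subnet_id) pairs seen
    # arriving from this peer.
    likely_hosted = set()
    for validator_index, subnet_id in observed_attestations:
        if subnet_id not in peer_subscribed_subnets:
            # The peer relayed a message outside its duty subnets, which
            # is strong evidence it originated the attestation itself.
            likely_hosted.add(validator_index)
    return likely_hosted

# A peer subscribed to subnets {12, 13} that sends an attestation on 45:
print(infer_hosted_validators({12, 13}, [(901234, 12), (555001, 45)]))
# -> {555001}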

The method proved shockingly effective. Using just four nodes over three days, the team successfully located the IP addresses of over 161,000 validators, representing more than 15% of the entire Ethereum network.

Why This Matters: The Risks of Deanonymization

Exposing a validator's IP address is not a trivial matter. It opens the door for targeted attacks that threaten individual operators and the health of the Ethereum network as a whole.

1. Targeted Attacks and Reward Theft

Ethereum announces which validator is scheduled to propose the next block a few minutes in advance. An attacker who knows this validator's IP address can launch a denial-of-service (DoS) attack, flooding it with traffic and knocking it offline. If the validator misses its four-second window to propose the block, the opportunity passes to the next validator in line. If the attacker is that next validator, they can then claim the block rewards and valuable transaction fees (MEV) that should have gone to the victim.

2. Threats to Network Liveness and Safety

A well-resourced attacker could perform these "sniping" attacks repeatedly, causing the entire blockchain to slow down or halt (a liveness attack). In a more severe scenario, an attacker could use this information to launch sophisticated network-partitioning attacks, potentially causing different parts of the network to disagree on the chain's history, thus compromising its integrity (a safety attack).

3. Revealing a Centralized Reality

The research also shed light on some uncomfortable truths about the network's decentralization:

  • Extreme Concentration: The team found peers hosting a staggering number of validators, including one IP address running over 19,000 of them. The failure of a single machine could have an outsized impact on the network.
  • Dependence on Cloud Services: Roughly 90% of located validators run on cloud providers like AWS and Hetzner, not on the computers of solo home stakers. This represents a significant point of centralization.
  • Hidden Dependencies: Many large staking pools claim their operators are independent. However, the research found instances where validators from different, competing pools were running on the same physical machine, creating hidden systemic risks.

Mitigations: How Can Validators Protect Themselves?

Fortunately, there are ways to defend against this deanonymization technique. The researchers proposed several mitigations:

  • Create More Noise: A validator can choose to subscribe to more than two subnets—or even all 64. This makes it much harder for an observer to distinguish between relayed messages and self-generated ones.
  • Use Multiple Nodes: An operator can separate validator duties across different machines with different IPs. For example, one node could handle attestations while a separate, private node is used only for proposing high-value blocks.
  • Private Peering: Validators can establish trusted, private connections with other nodes to relay their messages, obscuring their true origin within a small, trusted group.
  • Anonymous Broadcasting Protocols: More advanced solutions like Dandelion, which obfuscates a message's origin by passing it along a random "stem" before broadcasting it widely, could be implemented (a toy sketch of the idea follows this list).
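
For intuition, here is a toy simulation of Dandelion's core idea, not a production implementation. Each "stem" hop quietly hands the message to a single random peer; with some per-hop probability, a relay flips to the wide-broadcast "fluff" phase:

import random

def stem_hops(fluff_probability=0.1, rng=random):
    # Count how many quiet, single-peer stem hops a message takes
    # before some relay switches to wide (fluff) broadcast.
    hops = 0
    while rng.random() >= fluff_probability:
        hops += 1  # passed along to one more random peer
    return hops

# The expected stem length is (1 - p) / p; with p = 0.1 that is ~9 hops,
# so the node that finally broadcasts is usually far from the origin.
samples = [stem_hops() for _ in range(10_000)]
print(sum(samples) / len(samples))  # ~9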

Conclusion

This research powerfully illustrates the inherent trade-off between performance and privacy in distributed systems. In its effort to scale, Ethereum's P2P network adopted a design that compromised the anonymity of its most critical participants.

By bringing this vulnerability to light, the researchers have given the Ethereum community the knowledge and tools needed to address it. Their work is a crucial step toward building a more robust, secure, and truly decentralized network for the future.

Secure Deployment with Docker Compose + Ubuntu

· 6 min read

In Silicon Valley startups, Docker Compose is one of the preferred tools for quickly deploying and managing containerized applications. However, convenience often comes with security risks. As a Site Reliability Engineer (SRE), I know well that security vulnerabilities can lead to catastrophic consequences. This article shares the security best practices I have distilled from production work combining Docker Compose with Ubuntu, helping you enjoy the convenience of Docker Compose while keeping your systems secure.


I. Hardening Ubuntu System Security

Before deploying containers, it is crucial to ensure the security of the Ubuntu host itself. Here are some key steps:

1. Regularly Update Ubuntu and Docker

Ensure that both the system and Docker are kept up-to-date to fix known vulnerabilities:

sudo apt update && sudo apt upgrade -y
sudo apt install docker-ce docker-compose-plugin

2. Restrict Docker Management Permissions

Strictly control Docker management permissions to prevent privilege escalation attacks:

sudo usermod -aG docker deployuser
# Add only a dedicated, trusted deployment user to the docker group;
# membership is effectively root-equivalent on the host

3. Configure Ubuntu Firewall (UFW)

Reasonably restrict network access to prevent unauthorized access:

sudo ufw allow OpenSSH
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw enable
sudo ufw status verbose

4. Properly Configure Docker and UFW Interaction

By default, Docker writes its own iptables rules and bypasses UFW entirely, so taking manual control is recommended:

Modify the Docker configuration file:

sudo nano /etc/docker/daemon.json

Add the following content:

{
  "iptables": false,
  "ip-forward": true,
  "userland-proxy": false
}

Restart the Docker service:

sudo systemctl restart docker

Explicitly bind addresses in Docker Compose:

services:
  webapp:
    ports:
      - "127.0.0.1:8080:8080"

II. Docker Compose Security Best Practices

The following configurations apply to Compose file format v2.4 and above. Note the differences between single-engine (non-Swarm) and Swarm modes.

1. Restrict Container Permissions

Containers running as root by default pose high risks; change to non-root users:

services:
  app:
    image: your-app:v1.2.3
    user: "1000:1000"   # Non-root user
    read_only: true     # Read-only filesystem
    volumes:
      - /tmp/app:/tmp   # Mount specific directories if write access is needed
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE

Explanation:

  • A read-only filesystem prevents tampering within the container.
  • Ensure mounted volumes are limited to necessary directories.

2. Network Isolation and Port Management

Precisely divide internal and external networks to avoid exposing sensitive services to the public:

networks:
  frontend:
    internal: false
  backend:
    internal: true

services:
  nginx:
    networks: [frontend, backend]
  database:
    networks:
      - backend
  • Frontend network: Can be open to the public.
  • Backend network: Strictly restricted, internal communication only.

3. Secure Secrets Management

Sensitive data should never be placed directly in Compose files:

In single-machine mode:

services:
  webapp:
    environment:
      - DB_PASSWORD_FILE=/run/secrets/db_password
    volumes:
      - ./secrets/db_password.txt:/run/secrets/db_password:ro

In Swarm mode:

services:
  webapp:
    secrets:
      - db_password
    environment:
      DB_PASSWORD_FILE: /run/secrets/db_password

secrets:
  db_password:
    external: true  # Managed by Swarm's built-in secrets store

Note:

  • Docker's native Swarm Secrets cannot directly use external tools like Vault or AWS Secrets Manager.
  • If external secret storage is needed, integrate the reading process yourself (a minimal sketch follows).
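
As one minimal sketch of that integration, application code can honor the *_FILE convention used above, preferring a mounted secret file over a plain environment variable. The helper below is illustrative, not a specific library's API:

import os

def read_secret(name, default=None):
    # Prefer a mounted secret file (e.g., DB_PASSWORD_FILE pointing at
    # /run/secrets/db_password) over a plain environment variable.
    file_path = os.environ.get(name + "_FILE")
    if file_path and os.path.exists(file_path):
        with open(file_path) as f:
            return f.read().strip()
    return os.environ.get(name, default)

db_password = read_secret("DB_PASSWORD")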

4. Resource Limiting (Adapt to Docker Compose Version)

Container resource limits prevent a single container from exhausting host resources.

Docker Compose Single-Machine Mode (v2.4 recommended):

version: '2.4'

services:
  api:
    image: your-image:1.4.0
    mem_limit: 512m
    cpus: 0.5

Docker Compose Swarm Mode (v3 and above):

services:
  api:
    deploy:
      resources:
        limits:
          cpus: "0.5"
          memory: 512M
        reservations:
          cpus: "0.25"
          memory: 256M

Note: In non-Swarm environments, the resource limits under the deploy section do not take effect, so be sure to match the configuration style to your Compose file version and deployment mode.

5. Container Health Checks

Set up health checks to proactively detect issues and reduce service downtime:

services:
  webapp:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 20s
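
For completeness, the endpoint this check hits can be as small as the sketch below. It assumes a Python service using Flask; any framework works as long as the route matches the healthcheck URL:

from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/health")
def health():
    # Keep this handler cheap, and extend it with real dependency
    # checks (e.g., database reachability) so 200 means "actually healthy".
    return jsonify(status="ok"), 200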

6. Avoid Using the Latest Tag

Avoid the uncertainty that the latest tag brings to production environments; enforce specific image versions:

services:
  api:
    image: your-image:1.4.0

7. Proper Log Management

Prevent container logs from exhausting disk space:

services:
  web:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "5"

8. Ubuntu AppArmor Configuration

By default, Ubuntu enables AppArmor, and it is recommended to check the Docker profile status:

sudo systemctl enable --now apparmor
sudo aa-status

Docker on Ubuntu applies its default AppArmor profile without additional configuration. It is generally not recommended to also enable SELinux on Ubuntu, as the two can conflict.

9. Continuous Updates and Security Scans

  • Image Vulnerability Scanning: It is recommended to integrate tools like Trivy, Clair, or Snyk in the CI/CD process:
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
  aquasec/trivy image your-image:v1.2.3
  • Automated Security Update Process: Rebuild images at least weekly to fix known vulnerabilities.

III. Case Study: Lessons from Docker Compose Configuration Mistakes

In July 2019, Capital One suffered a major data breach affecting the personal information of over 100 million customers. Although the main cause of this attack was AWS configuration errors, it also involved container security issues similar to those discussed in this article:

  1. Container Permission Issues: The attacker exploited a vulnerability in a Web Application Firewall (WAF) running in a container but with excessive permissions.
  2. Insufficient Network Isolation: The attacker could access other AWS resources from the compromised container, indicating insufficient network isolation measures.
  3. Sensitive Data Exposure: Due to configuration errors, the attacker could access and steal a large amount of sensitive customer data.
  4. Security Configuration Mistakes: The root cause of the entire incident was the accumulation of multiple security configuration errors, including container and cloud service configuration issues.

This incident resulted in significant financial losses and reputational damage for Capital One. The company reportedly faced incident-related costs and penalties on the order of $150 million, along with a long-term trust crisis. This case highlights the importance of security configuration in container and cloud environments, especially in permission management, network isolation, and sensitive data protection. It reminds us that even seemingly minor configuration errors can be exploited by attackers, leading to disastrous consequences.

IV. Conclusion and Recommendations

Docker Compose combined with Ubuntu is a convenient way to quickly deploy container applications, but security must be integrated throughout the entire process:

  • Strictly control container permissions and network isolation.
  • Avoid sensitive data leaks.
  • Regular security scanning and updates.
  • As your organization scales, consider migrating to a more advanced orchestration system such as Kubernetes for stronger security guarantees.

Security is a continuous practice with no endpoint. I hope this article helps you better protect your Docker Compose + Ubuntu deployment environment.