96% of Blockchain RPC Calls Are Reads—Are We Building Expensive Databases?

I’ve been digging through infrastructure data lately, and I came across something from Syndica that made me question a lot of assumptions about blockchain architecture. According to their analysis, spanning over two years of production data across various dApps, 96.1% of all calls made to a Solana node are reads, not writes.

Let me repeat that: ninety-six percent reads.

The Database Question

As a data engineer, this immediately made me think: if blockchain usage is overwhelmingly read-heavy, are we essentially building really expensive, really slow distributed databases with consensus overhead?

Traditional databases have been optimized for decades around read/write patterns. We have read replicas, caching layers, materialized views—the whole works. But blockchain infrastructure has historically focused almost entirely on transaction throughput (writes) and consensus mechanisms. We measure TPS, finality time, and block production rates. But what about reads-per-second (RPS)?

The Infrastructure Mismatch

Think about what most blockchain infrastructure optimizes for:

  • Validators: Optimized for transaction processing and consensus
  • Benchmarks: TPS, finality time, throughput
  • Scaling solutions: Layer 2s, sharding, parallel execution—all focused on write capacity

Meanwhile, the actual usage pattern is:

  • Price feed queries for DeFi protocols
  • Account balance checks for wallets
  • Transaction history lookups
  • State queries for dApps
  • Analytics and monitoring

96% of the time, we’re just reading data.
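
To make that concrete, here’s roughly what a routine wallet refresh looks like at the RPC level. This is a minimal sketch using @solana/web3.js; the endpoint and wallet address are placeholders:

```typescript
import { Connection, PublicKey } from "@solana/web3.js";

// Placeholder endpoint and address, for illustration only.
const connection = new Connection("https://api.mainnet-beta.solana.com");
const wallet = new PublicKey("11111111111111111111111111111111");

async function refreshWallet() {
  // Every one of these RPC calls is a read; nothing is ever written on-chain.
  const lamports = await connection.getBalance(wallet); // balance check
  const account = await connection.getAccountInfo(wallet); // state query
  const history = await connection.getSignaturesForAddress(wallet, { limit: 10 }); // tx history
  return { lamports, account, history };
}
```

Multiply that by every open wallet tab and dApp dashboard, and the 96% figure stops being surprising.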

Syndica’s Sig: Rethinking the Architecture

This is why Syndica’s approach with Sig caught my attention. They’re building a Solana validator client from scratch in Zig, specifically optimized for reads-per-second instead of transactions-per-second.

Early benchmarks show 50-70% performance improvements compared to existing solutions. They’re focusing on optimizing getProgramAccounts and other read-heavy queries that hammer RPC nodes constantly.
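
For a sense of why getProgramAccounts is the target, here’s a sketch of the canonical expensive read on Solana: scanning the SPL Token program for one owner’s token accounts. The endpoint is a placeholder; the dataSize and offset values follow the standard SPL token account layout:

```typescript
import { Connection, PublicKey } from "@solana/web3.js";

const connection = new Connection("https://api.mainnet-beta.solana.com"); // placeholder
const TOKEN_PROGRAM = new PublicKey("TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA");

// Asks the node to scan every account owned by the token program and
// filter server-side. On an unoptimized node this is one of the most
// expensive reads available, and bots and indexers issue it constantly.
async function tokenAccountsFor(owner: PublicKey) {
  return connection.getProgramAccounts(TOKEN_PROGRAM, {
    filters: [
      { dataSize: 165 }, // SPL token account size in bytes
      { memcmp: { offset: 32, bytes: owner.toBase58() } }, // owner field
    ],
  });
}
```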

This feels like the kind of paradigm shift that’s obvious in retrospect: optimize for what people actually do, not what the whitepaper said they’d do.

So, Are We Just Building Databases?

Here’s where it gets philosophical. If 96% of blockchain operations are reads, and writes are the minority use case, what makes blockchains valuable?

I’d argue it’s the quality of those writes, not the quantity. That 4% of writes creates an immutable, verifiable history that makes the 96% of reads trustworthy. You’re not querying a database administrator—you’re querying cryptographic proof.

But from an infrastructure perspective, we’ve been overinvesting in write capacity and underinvesting in read optimization.

The Data Engineering Parallel

In traditional data engineering, we separate OLTP (transactional) from OLAP (analytical) workloads. We write to one system and read from another. We use data warehouses, read replicas, and caching layers.

Maybe blockchain infrastructure needs a similar split:

  • Consensus layer: Optimized for secure, fast writes (the 4%)
  • Data availability layer: Optimized for fast, scalable reads (the 96%)

Platforms like GetBlock and other RPC providers are already doing this—they’re essentially building read-optimized infrastructure on top of write-optimized blockchains.
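
As a toy illustration of that split, the read side can sit behind a short-TTL read-through cache so repeated queries never touch the chain. Everything here (names, TTL policy) is hypothetical:

```typescript
// Minimal read-through cache: writes still go to the chain; reads are
// served locally and only fall through to an RPC call on a miss or expiry.
type CacheEntry<T> = { value: T; expiresAt: number };

class ReadCache<T> {
  private entries = new Map<string, CacheEntry<T>>();

  constructor(
    private ttlMs: number,
    private fetchFromChain: (key: string) => Promise<T>,
  ) {}

  async get(key: string): Promise<T> {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > Date.now()) return hit.value; // hit: zero RPC calls
    const value = await this.fetchFromChain(key); // miss: exactly one RPC read
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
    return value;
  }
}
```

The tradeoff is staleness: a two-second TTL is invisible for a balance display and unacceptable for a trading bot, which is why one size doesn’t fit all.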

Questions for the Community

  1. Should we rethink how we architect dApps around this 96/4 read/write split?
  2. Are read-optimized clients like Sig the future, or are we just patching over fundamental design issues?
  3. What does this mean for decentralization? If reads are centralized through RPC providers, does it matter that writes are decentralized?
  4. How should we price RPC services when read operations dominate usage?

I’m genuinely curious what others think. Are we building blockchains or just really expensive databases with really good audit logs?


Mike, this is a great data-driven observation, but I think the framing misses the fundamental point of what makes blockchains valuable.

The Database Comparison Is a Red Herring

Yes, 96% of operations are reads. But comparing blockchains to databases based solely on read/write ratios is like comparing a car to a bicycle by wheel count. The reads-per-second metric doesn’t capture what makes blockchain architecture fundamentally different.

What you’re reading from a blockchain:

  • Cryptographically verified, immutable history
  • Trustless state that doesn’t require trusting a database administrator
  • Consensus-backed data that can’t be unilaterally changed
  • Publicly auditable transactions that anyone can verify

What you’re reading from a traditional database:

  • Whatever the database owner wants you to see
  • Data that can be modified, deleted, or rolled back
  • State controlled by a centralized authority
  • No cryptographic proof of authenticity

The 96% of reads only have value because of the 4% of writes and the consensus mechanism that validates them. You’re not paying for read performance—you’re paying for trustlessness.

Read Optimization Is Still Smart Engineering

That said, I completely agree that Syndica’s approach with Sig is the right direction. Optimizing for actual usage patterns is just good engineering, whether you’re building databases or blockchains.

The Solana validator client was originally designed to maximize transaction throughput and consensus participation. That’s the right optimization for validators participating in consensus. But RPC nodes serving dApp queries have different requirements.

Building a read-optimized validator client doesn’t mean we’re “admitting blockchains are databases.” It means we’re acknowledging that different infrastructure components have different performance profiles:

  • Consensus validators: Write-optimized, focus on transaction throughput
  • RPC nodes: Read-optimized, focus on query performance
  • Archival nodes: Storage-optimized, focus on historical data access

This specialization is healthy. It’s similar to how Ethereum separated execution clients from consensus clients—different responsibilities, different optimization goals.

The Decentralization Question Is the Real Issue

Where I do share your concern is point #3: what does this mean for decentralization?

If 96% of user interactions with blockchains go through centralized RPC providers (Alchemy, Infura, QuickNode, BlockEden), does it matter that the 4% of writes are decentralized?

This is where I think the industry needs to be honest with itself. We’ve achieved decentralized consensus but we haven’t achieved decentralized access. Most users interact with Ethereum or Solana through a handful of RPC providers, creating:

  • Single points of failure (Infura outages break dApps)
  • Censorship risks (RPC providers can filter transactions)
  • Privacy concerns (providers see all user queries)

Light clients and local nodes solve this, but the UX is terrible and most users won’t run them. This is a harder problem than optimizing read performance.

My Take

Blockchains aren’t expensive databases—they’re consensus machines that happen to expose a read-heavy API. Optimizing for reads is smart. But the real challenge isn’t infrastructure performance—it’s decentralizing user access to that infrastructure.

Sig’s read optimization is valuable. But the bigger problem is that 96% of those reads go through centralized chokepoints.

Mike and Brian both make excellent points, but I want to throw the business angle into this discussion because it has massive implications for how we build and price Web3 infrastructure.

The 96/4 Split Creates a Pricing Problem

Here’s what keeps me up at night as someone building a Web3 startup: if 96% of my infrastructure costs come from reads, but most blockchain pricing models are optimized around writes (gas fees, transaction costs), who’s subsidizing the reads?

RPC providers are. And they’re charging us for it.

We’re paying BlockEden, Alchemy, or QuickNode based on:

  • Request volume (reads)
  • Compute units consumed (mostly reads)
  • Archive data access (historical reads)

Meanwhile, on-chain costs are almost entirely write-based. This creates a weird economic mismatch where the most expensive part of running a dApp (RPC access) is completely off-chain and centralized.

The Real Cost of “Free” Reads

When Syndica says 96% of operations are reads, that translates directly to our infrastructure budget. Last month our RPC costs were 3x our on-chain gas fees. We’re spending more on reading blockchain data than writing to it.

This raises strategic questions:

  1. Should we run our own nodes? (We calculated it—not worth it until we hit 10x current scale)
  2. Should we cache aggressively? (Yes, but that introduces UX lag and complexity)
  3. Should we accept centralization risk for reads? (Currently yes, because there’s no viable alternative)

The startup playbook for Web2 is clear: use cloud providers, optimize costs as you scale, own your infrastructure at massive scale. But in Web3, running your own blockchain node doesn’t make economic sense until you’re huge, so everyone’s stuck on third-party RPC providers.

User Experience Implications

Brian’s point about decentralization is spot-on, but there’s also a UX angle. Users don’t care about reads vs writes—they care about speed.

When someone swaps on our DEX:

  • They need instant price quotes (read)
  • They need real-time balance updates (read)
  • They need transaction confirmation (write + read)
  • They need updated balances (read)

That’s 3-4 reads for every 1 write. If RPC latency is 500ms and we need 4 round-trips, that’s 2 seconds of lag before a transaction even gets submitted. Syndica’s 50-70% read performance improvement isn’t just a technical win—it’s a direct UX improvement.
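
One partial mitigation, sketched here with ethers and a UniswapV2-style pair (the endpoint and address are placeholders): the pre-swap reads are independent, so issuing them concurrently costs one round-trip of latency instead of three.

```typescript
import { ethers } from "ethers";

// Placeholder endpoint and pair address; the pair is assumed to expose a
// UniswapV2-style getReserves() view function.
const provider = new ethers.JsonRpcProvider("https://rpc.example.com");
const pair = new ethers.Contract(
  "0x0000000000000000000000000000000000000001",
  ["function getReserves() view returns (uint112, uint112, uint32)"],
  provider,
);

async function preSwapReads(user: string) {
  // The three reads don't depend on each other, so fire them together:
  // total latency is roughly the slowest call, not the sum of all three.
  const [reserves, balance, feeData] = await Promise.all([
    pair.getReserves(), // price quote input
    provider.getBalance(user), // balance check
    provider.getFeeData(), // gas estimate
  ]);
  return { reserves, balance, feeData };
}
```

Batching hides latency; it doesn’t remove it. Faster reads at the node itself, which is what Sig targets, shrink the floor.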

The Business Opportunity

Here’s what I think the 96% read stat reveals: there’s a huge market opportunity for read-optimized blockchain infrastructure.

Imagine:

  • Read-optimized L2s: Ultra-fast state queries, slower writes
  • Specialized data layers: Optimized for analytics and dashboards
  • Cached state networks: Near-instant reads with eventual consistency
  • Read pricing models: Pay per read, not just per write

Right now we’re forcing everything through a one-size-fits-all model where consensus validators also serve RPC queries. That’s like asking your production database to also run analytics—nobody does that in Web2 for a reason.

My Take

As a founder, the 96/4 split tells me:

  1. RPC providers have pricing power because we have no choice
  2. Read optimization is underinvested relative to its importance
  3. There’s a business opportunity in specialized read infrastructure
  4. Current decentralization is theater if reads are centralized

Sig is great, but I want to see more innovation in the business model layer too. Can we get decentralized RPC networks with economic incentives? Can we separate read pricing from write pricing? Can we build infrastructure that acknowledges this 96/4 reality instead of pretending every operation is a transaction?

The technical architecture should follow the economics, not the other way around.

This discussion is fascinating from an architecture perspective, but I want to talk about what this means for how we teach developers to build smart contracts and dApps.

We’re Teaching the Wrong Patterns

When I teach Solidity workshops, we focus heavily on:

  • Gas optimization for writes
  • Storage vs memory usage
  • Minimizing transaction costs
  • State mutation patterns

But if 96% of operations are reads, we should be teaching:

  • State design optimized for queries
  • View function architecture
  • Event emission strategies for off-chain indexing
  • Caching and read-path optimization

We’re optimizing for the 4% and ignoring the 96%.

Smart Contract Design for Read-Heavy Workloads

Think about a typical ERC-20 token contract. We obsess over making transfer() gas-efficient, but how often do we think about optimizing balanceOf() queries?

Or consider a DEX contract:

  • Writes: addLiquidity, swap, removeLiquidity (expensive, rare)
  • Reads: getReserves, getAmountOut, getPair (cheap, called constantly)

Yet most contract tutorials focus entirely on the write paths. We barely mention that most users will be calling view functions 100x more often.

The getProgramAccounts Problem

Mike mentioned Syndica optimizing for getProgramAccounts on Solana. On Ethereum, we have a similar pattern with event logs and filters.

Bad pattern (but common):
Smart contracts that require iterating through all accounts or events to build state. This kills RPC performance and forces developers to run their own indexers.

Better pattern:
Design contracts with read-optimized data structures:

  • Use mappings with predictable keys
  • Emit events with indexed parameters for efficient filtering
  • Consider using sparse Merkle trees for large datasets
  • Keep frequently-queried data in easily accessible storage slots
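
As a sketch of the indexed-event pattern above (ethers, a standard ERC-20 Transfer ABI, placeholder endpoint and address): because from and to are indexed, the node can answer this filter from its log index instead of replaying every event the contract ever emitted.

```typescript
import { ethers } from "ethers";

const provider = new ethers.JsonRpcProvider("https://rpc.example.com"); // placeholder
const erc20 = new ethers.Contract(
  "0x0000000000000000000000000000000000000002", // placeholder token address
  ["event Transfer(address indexed from, address indexed to, uint256 value)"],
  provider,
);

// Indexed parameters become log topics, which nodes index. Filtering by
// recipient is therefore a cheap lookup rather than a full scan.
async function transfersTo(recipient: string) {
  const latest = await provider.getBlockNumber();
  const filter = erc20.filters.Transfer(null, recipient);
  return erc20.queryFilter(filter, latest - 10_000, latest);
}
```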

But here’s the problem: writing read-optimized contracts often increases write costs. More indexed events = higher gas. More organized storage = more SSTORE operations.

So we have a tradeoff:

  • Optimize for writes = cheap transactions, slow reads
  • Optimize for reads = expensive transactions, fast queries

Given that writes are 4% and reads are 96%, maybe we’ve been optimizing the wrong side?

The Developer Experience Gap

Steve mentioned that RPC costs exceed gas costs 3x. From a developer perspective, that’s wild because:

  • We have tons of tools for gas profiling (Hardhat, Foundry gas reporters)
  • We have no tools for profiling RPC read efficiency
  • Gas optimization is a badge of honor in the dev community
  • Read optimization is barely discussed

I’ve never seen a PR review comment that says “this contract design will hammer RPC nodes with expensive queries.” But I’ve seen hundreds about shaving 2000 gas off a transaction.

What Should We Do Differently?

If I were redesigning how we teach smart contract development:

  1. Teach read-path design first: How will users query this data? Then design the write path around that.

  2. Profile RPC impact, not just gas: Create tools that estimate RPC query costs for different contract designs (a toy version is sketched after this list).

  3. Use indexers as first-class infrastructure: Stop treating The Graph or other indexers as nice-to-haves. Design contracts to emit events optimized for indexing.

  4. Separate read and write concerns: Maybe we need read-optimized state channels or data layers that complement write-optimized contracts.
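
To make item 2 concrete, here’s a toy profiler: it wraps an EIP-1193-style provider and counts calls per JSON-RPC method, the read-side analogue of a gas report. The wrapper and its names are hypothetical:

```typescript
// Any EIP-1193-style provider exposes request({ method, params }).
type Rpc = { request(args: { method: string; params?: unknown[] }): Promise<unknown> };

// Wraps a provider so every JSON-RPC call is tallied by method name.
function profiled(inner: Rpc) {
  const counts = new Map<string, number>();
  return {
    async request(args: { method: string; params?: unknown[] }) {
      counts.set(args.method, (counts.get(args.method) ?? 0) + 1);
      return inner.request(args);
    },
    // e.g. { eth_call: 412, eth_getBalance: 96, eth_getLogs: 9 }
    report: () => Object.fromEntries(counts),
  };
}
```

Run a test suite or a staging session through it, and the read hot spots of a contract design become as visible as its gas costs.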

The Sig Implications for Developers

If read-optimized validator clients like Sig become standard, it changes the calculation. Contract designs that were “too expensive to query” might become viable.

For example:

  • Large array iterations (currently a no-go)
  • Complex view functions with nested calls
  • Historical state queries without events

But we shouldn’t rely on infrastructure improvements to fix bad contract design. We should design contracts that work well with current infrastructure and get better with Sig, not contracts that only work if Sig exists.

My Take

As developers, we’ve been cargo-culting gas optimization without thinking about the full picture. The 96/4 read/write split should fundamentally change how we architect smart contracts.

Security first, read-optimization second, write-optimization third.

Most of us have that priority backwards.

Coming at this from the DeFi trenches, the 96% read stat isn’t just a technical curiosity—it’s our daily operational reality. Let me share what this looks like when you’re actually running protocols and bots.

DeFi Is Basically a Read-Heavy Database with Expensive Writes

Here’s what our yield optimization bot does in a typical cycle:

Reads (every 2-5 seconds):

  • Query current pool reserves (20+ pools)
  • Check token prices across DEXs
  • Calculate optimal routes
  • Monitor gas prices
  • Check account balances
  • Read pending transactions in mempool
  • Verify slippage conditions

Writes (maybe once per minute if we find an opportunity):

  • Execute swap
  • Claim rewards
  • Rebalance position

We’re doing 100+ reads for every 1 write. And every millisecond of read latency costs us money because arbitrage opportunities close fast.
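
In code, one cycle looks roughly like this (ethers, with a placeholder endpoint and hypothetical pool addresses): every call is a read, fired in parallel, and a write only follows if the reads reveal an edge.

```typescript
import { ethers } from "ethers";

const provider = new ethers.JsonRpcProvider("https://rpc.example.com"); // placeholder
const PAIR_ABI = ["function getReserves() view returns (uint112, uint112, uint32)"];

// Hypothetical stand-ins; a real bot watches 20+ pools across DEXs.
const poolAddresses = [
  "0x0000000000000000000000000000000000000003",
  "0x0000000000000000000000000000000000000004",
];
const pools = poolAddresses.map((a) => new ethers.Contract(a, PAIR_ABI, provider));

async function scanCycle() {
  // One polling cycle, every 2-5 seconds: reads only, issued concurrently.
  const [reserves, feeData, block] = await Promise.all([
    Promise.all(pools.map((p) => p.getReserves())), // pool state
    provider.getFeeData(), // gas price monitoring
    provider.getBlockNumber(), // data freshness check
  ]);
  // A write (swap, claim, rebalance) happens only if these reads find an edge.
  return { reserves, feeData, block };
}
```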

RPC Performance Is Our Biggest Bottleneck

Steve mentioned RPC costs exceeding gas fees 3x. For us it’s worse—our RPC bills are 5x our gas costs. And it’s not just about money, it’s about speed.

Real numbers from last week:

  • RPC latency: 200-500ms per call
  • Need: 15+ calls to evaluate one arbitrage opportunity
  • Window: Opportunities close in 1-3 seconds
  • Result: We miss 60-70% of opportunities due to read latency

If Syndica’s Sig client can deliver 50-70% faster reads, that’s not incremental—that’s potentially 2-3x more profitable trades. Read performance directly translates to revenue.

The Read Patterns DeFi Creates

Sarah’s point about contract design is spot-on, but DeFi protocols create specific read patterns that hammer RPC nodes:

Liquidity Pools:

  • Constant getReserves calls from every trading bot
  • Real-time price feed queries
  • Liquidity depth monitoring

Lending Protocols:

  • Continuous health factor checks
  • Interest rate updates
  • Collateral price monitoring
  • Liquidation opportunity scanning

Yield Aggregators:

  • APY calculations across dozens of protocols
  • Gas cost estimation
  • Optimal strategy selection

All of this is reads. Constant, high-frequency reads. And if your RPC provider rate-limits you or goes down, your bot stops making money.

The Centralization Problem Is Real

Brian’s concern about centralized RPC access hits hard in DeFi. Here’s what actually happens:

Infura outage in 2022: Half the DeFi ecosystem stopped working. Prices went stale, liquidations failed, arbitrage bots died. If you weren’t running your own nodes, you were offline.

RPC provider rate limits: We’ve hit rate limits during high volatility (exactly when we need speed most). Premium plans help, but we’re still at the mercy of centralized providers.

MEV and RPC providers: When your RPC provider sees all your transactions before they hit the mempool, do they front-run you? Maybe not directly, but the information asymmetry is real.

Why We Can’t Just Run Our Own Nodes

Everyone says “run your own node” but here’s the reality:

Costs:

  • Archive nodes: 4-8TB storage, expensive hardware
  • Multiple chains: Ethereum + Polygon + Arbitrum + Base = 4x infrastructure
  • DevOps time: Monitoring, upgrades, sync issues
  • Redundancy: Can’t rely on single node, need backups

We calculated it: running reliable nodes across chains costs $50K+ per year. RPC providers are cheaper until you’re at massive scale.

But then we’re stuck with centralized reads for decentralized protocols.

What Would Actually Help

From a DeFi operator perspective, here’s what would move the needle:

  1. Read-optimized RPC networks: Sig is great for Solana. We need equivalent for Ethereum, Arbitrum, Base, etc.

  2. Decentralized RPC with economic incentives: Projects like Pocket Network exist but adoption is low. We need this to become standard.

  3. Better caching infrastructure: Reads don’t need to hit validators every time. Smart caching layers could massively reduce latency and costs.

  4. Read-path L2s: Maybe we need separate networks optimized for state queries, with eventual consistency to mainnet.

  5. Standardized read APIs: Different RPC providers have different quirks. We need consistency.

The Uncomfortable Truth

Mike asked if we’re building databases instead of blockchains. From where I sit, DeFi already treats blockchains like databases—we’re just stuck with really slow, really expensive database queries.

The 4% of writes create the trustless state. The 96% of reads are how we actually use that state. And right now, the read infrastructure is:

  • Centralized (handful of RPC providers)
  • Expensive (costs more than gas)
  • Slow (hundreds of milliseconds)
  • Fragile (single points of failure)

Sig and similar read optimizations aren’t solving a theoretical problem—they’re solving the actual bottleneck that every DeFi protocol and bot faces every single day.

The blockchain revolution gave us trustless writes. Now we need trustless, fast, decentralized reads to match.