As Web3 moves from retail experimentation to institutional adoption, the indexing infrastructure requirements change dramatically. The hobbyist-friendly subgraph that worked for your hackathon project will not survive an institutional audit, and the latency and downtime that retail users tolerate will get you fired from an enterprise contract.
I have spent the past year building data infrastructure for a protocol that serves both retail and institutional users. Here is what I have learned about what “institutional-grade” actually means for indexing, and how the current landscape of indexing platforms measures up.
What Institutional Clients Actually Require
When an institution evaluates your dApp’s data infrastructure, they are looking at dimensions that most crypto-native builders do not think about:
1. Service Level Agreements (SLAs)
Institutional clients want contractual guarantees around:
- Uptime: 99.95% or higher (that is less than 4.4 hours of downtime per year)
- Query latency: p95 under 100ms, p99 under 500ms
- Data freshness: Maximum lag of N seconds from on-chain confirmation to queryable data
- Error rates: Less than 0.1% of queries returning errors
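The latency and error-rate targets above are straightforward to monitor. Here is a minimal sketch of an SLA check over recorded query durations; `percentile`, `checkSla`, and the `SlaTargets` shape are illustrative names, not part of any provider's API:

```typescript
// Nearest-rank percentile over recorded query durations (ms).
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const idx = Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1);
  return sorted[Math.max(0, idx)];
}

// Thresholds mirroring the SLA terms listed above.
interface SlaTargets {
  p95Ms: number;      // e.g. 100
  p99Ms: number;      // e.g. 500
  maxErrorRate: number; // e.g. 0.001 (0.1%)
}

function checkSla(
  latencies: number[],
  errorCount: number,
  totalQueries: number,
  targets: SlaTargets,
): { p95Ok: boolean; p99Ok: boolean; errorRateOk: boolean } {
  return {
    p95Ok: percentile(latencies, 95) <= targets.p95Ms,
    p99Ok: percentile(latencies, 99) <= targets.p99Ms,
    errorRateOk: errorCount / totalQueries <= targets.maxErrorRate,
  };
}
```

Running a check like this against your own query logs, rather than trusting provider dashboards, is what makes an SLA enforceable in practice.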
Try getting these guarantees from The Graph’s decentralized network. The network is designed for censorship resistance, not SLA compliance. Individual indexers have no contractual relationship with the developers querying their data.
Ormi and Goldsky can offer SLAs because they are centralized services with dedicated infrastructure. This is one of the clearest arguments for centralized indexing in institutional contexts.
2. Data Correctness and Audit Trails
Institutional applications need to prove that the data they serve is correct. This means:
- Deterministic indexing: The same blockchain state must always produce the same indexed result
- Audit trails: Every piece of indexed data should be traceable back to its source block and transaction
- Reconciliation reports: Regular automated checks comparing indexed data against direct chain state
- Version control: Changes to indexing logic must be tracked, reviewed, and reversible
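The first two requirements can be combined: if every indexed record carries its provenance and the indexing logic is deterministic, then two independently synced indexers should produce byte-identical records you can compare fingerprint-by-fingerprint. A minimal sketch, with an assumed `AuditedRecord` shape (not any platform's actual schema):

```typescript
import { createHash } from "node:crypto";

// Every indexed record is traceable to its source block, transaction,
// and the version of the indexing logic that produced it.
interface AuditedRecord {
  entity: string;         // e.g. "TokenBalance"
  value: string;          // serialized indexed value
  blockNumber: number;
  txHash: string;
  indexerVersion: string; // tracked, reviewed, reversible (version control)
}

// Deterministic fingerprint: identical inputs must always yield the same
// digest, so reconciliation reduces to comparing hex strings.
function fingerprint(r: AuditedRecord): string {
  return createHash("sha256")
    .update(`${r.entity}|${r.value}|${r.blockNumber}|${r.txHash}|${r.indexerVersion}`)
    .digest("hex");
}
```

A reconciliation report then becomes a diff of fingerprints between your primary indexer and an independent one, flagging any record where they disagree.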
The Graph’s subgraph model provides some of this through its deterministic execution environment. But the decentralized network adds uncertainty because different indexers might be at different sync states, producing slightly different results for the same query at the same time.
For our institutional deployment, we ended up running our own dedicated Graph node (not using the decentralized network) alongside an Ormi backup. This gives us deterministic indexing with full control over the execution environment.
3. Disaster Recovery and Business Continuity
What happens when your indexing provider goes down? For retail users, a “please try again later” message is annoying. For an institutional trading desk, it could mean missed trades, compliance violations, or regulatory penalties.
Our disaster recovery strategy involves:
- Primary: Ormi for sub-30ms production queries
- Secondary: Self-hosted Graph node with identical subgraphs, synced in real-time
- Tertiary: Direct RPC fallback for critical-path queries (slower but always available)
- Data store: Goldsky Mirror feeding a PostgreSQL replica that can serve cached data even if all indexers are down
This multi-provider approach is expensive and complex, but it gives us the 99.95%-plus reliability that institutional clients demand.
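The primary/secondary/tertiary tiers above boil down to an ordered fallback loop in the query path. A minimal sketch; `queryWithFallback` and the backend names are illustrative, not our actual client code:

```typescript
type QueryFn = (query: string) => Promise<string>;

interface Backend {
  name: string;  // e.g. "ormi", "self-hosted-graph", "direct-rpc"
  query: QueryFn;
}

// Try each tier in order; only surface an error if every tier fails.
async function queryWithFallback(
  query: string,
  backends: Backend[],
): Promise<{ backend: string; result: string }> {
  let lastErr: unknown;
  for (const b of backends) {
    try {
      return { backend: b.name, result: await b.query(query) };
    } catch (err) {
      lastErr = err; // record and fall through to the next tier
    }
  }
  throw new Error(`all backends failed: ${String(lastErr)}`);
}
```

In production you would also want per-tier timeouts and a circuit breaker so a slow primary does not stall every request before it falls through, but the ordering logic is this simple.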
4. Compliance and Data Governance
Institutions need to answer questions like:
- Where is the indexed data physically stored? (data residency requirements)
- Who has access to query logs? (privacy regulations)
- Can the indexing provider be compelled to stop serving data? (regulatory risk)
- How is personally identifiable information handled? (GDPR, CCPA)
The Graph’s decentralized network actually scores well on some of these: no single entity controls the data, and query logs are distributed. But it scores poorly on data residency (you cannot guarantee which indexer in which jurisdiction serves your query).
Centralized providers like Ormi can offer data residency guarantees, SOC 2 compliance, and detailed access controls. These matter for institutional adoption.
5. Performance Under Load
Institutional applications often have bursty traffic patterns. A market event can spike query volume 10-100x within seconds. Your indexing infrastructure needs to handle these spikes without degradation.
Our load testing revealed significant differences:
- Ormi: Handled our 10x spike test (40,000 RPS) with minimal latency increase. Their infrastructure appears to auto-scale.
- The Graph decentralized: Performance degraded significantly under spike load. Indexer selection became unreliable, and many queries timed out.
- Self-hosted Graph node: Predictable performance limited by our hardware. We provision for 3x normal load, which means 10x spikes cause degradation.
- Goldsky Mirror: Since reads are against our own database, spike handling depends on our database infrastructure. PostgreSQL with read replicas handles spikes well.
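For the tiers you control yourself (the self-hosted node and the PostgreSQL replica), degrading gracefully under a 10x spike means shedding excess load rather than letting queues grow unboundedly. A minimal token-bucket sketch, assuming an injected clock for testability; this is one common approach, not a description of any provider's internals:

```typescript
// Admits a sustained `ratePerSec` with bursts up to `burst` requests.
class TokenBucket {
  private tokens: number;
  private last: number;

  constructor(private ratePerSec: number, private burst: number, startTime = 0) {
    this.tokens = burst;
    this.last = startTime;
  }

  // `now` is a timestamp in seconds; requests beyond the budget are rejected
  // immediately (and can be routed to a cached or degraded response).
  allow(now: number): boolean {
    this.tokens = Math.min(this.burst, this.tokens + (now - this.last) * this.ratePerSec);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Rejected requests can fall back to the cached PostgreSQL reads, which is exactly the kind of graceful degradation a fixed-capacity tier needs during a market event.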
The Architecture We Settled On
After months of evaluation, here is our production architecture:
On-Chain Events
      |
      +--> Goldsky Mirror         --> PostgreSQL (analytics, reporting, cached reads)
      |
      +--> Ormi Subgraphs         --> Primary query endpoint (real-time, SLA-backed)
      |
      +--> Self-hosted Graph Node --> Fallback endpoint (disaster recovery)
      |
      +--> Direct RPC             --> Emergency fallback (always available, slower)
Each layer serves a different purpose:
- Goldsky Mirror for continuous data replication and analytics
- Ormi for production query serving with performance SLAs
- Self-hosted Graph Node for independent verification and disaster recovery
- Direct RPC as the last resort that can never go down (as long as the chain is alive)
Cost Breakdown
For transparency, here is what this costs us monthly:
- Ormi: ~$1,200/month (production query volume)
- Goldsky Mirror: ~$800/month (continuous streaming)
- Self-hosted Graph Node: ~$400/month (VPS + storage)
- Direct RPC: ~$300/month (archive node access through BlockEden)
- Total: ~$2,700/month
This sounds expensive until you compare it to traditional financial data infrastructure. Bloomberg Terminal subscriptions start at $2,000/month per seat. Institutional-grade market data feeds cost tens of thousands monthly. In context, $2,700 for a multi-layered, resilient blockchain data infrastructure is remarkably affordable.
Lessons for Builders Targeting Institutional Users
- Design for failure from day one. Every component will fail. Your architecture must degrade gracefully.
- Separate your read path from your write path. Reads should never depend on a single provider.
- Invest in reconciliation tooling. Automated data validation between your indexed data and chain state is not optional for institutional use.
- Document everything. Institutional clients want architecture diagrams, runbooks, and incident response procedures.
- Plan for multi-chain expansion. Today it is Ethereum and two L2s. In six months, your institutional client will want Solana, Cosmos, and whatever chain their newest fund is targeting.
The indexing wars are creating better infrastructure for everyone. But institutional-grade applications need to think beyond “which platform is cheapest” and design for the resilience, compliance, and performance standards that serious capital demands.