A Data-Driven Look at Every Major Solana Outage
I have compiled a comprehensive analysis of every major Solana network disruption since 2022, because understanding the failure modes of the past is the best way to evaluate whether Firedancer actually solves the right problems. The data tells a story that is more nuanced than either the critics or the defenders want to admit.
The Outage Timeline
2022: The Year of Growing Pains
January 2022 – Consensus Stall (Congestion-Induced)
The network experienced severe congestion between January 6-12, leading to degraded performance and partial outages. Under heavy load, vote transactions (critical for consensus) were being crowded out by regular transactions. Without confirmed votes, consensus stalled and block production halted. This was fundamentally a resource prioritization bug – the client did not differentiate between consensus-critical and user-submitted transactions under extreme load.
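The fix that eventually shipped was to treat votes as a separate, higher-priority class so that user traffic can never starve consensus. A minimal sketch of that idea, with invented names and budget figures (this is not Agave's actual scheduler API):

```python
from collections import deque

# Reserve a fixed share of block capacity for consensus votes so user
# transactions cannot crowd them out. The 25% figure is illustrative.
VOTE_BUDGET = 0.25

def schedule(transactions, capacity):
    """Fill a block, guaranteeing votes a reserved slice of capacity."""
    votes = deque(tx for tx in transactions if tx["is_vote"])
    users = deque(tx for tx in transactions if not tx["is_vote"])
    block = []
    # Votes go first, up to their reserved share of capacity...
    while votes and len(block) < capacity * VOTE_BUDGET:
        block.append(votes.popleft())
    # ...then user transactions, then leftover votes fill remaining space.
    for queue in (users, votes):
        while queue and len(block) < capacity:
            block.append(queue.popleft())
    return block
```

Under this scheme, even a flood of user transactions leaves votes their reserved slice, which is exactly the differentiation the January 2022 client lacked.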
April 2022 – NFT Bot Surge (6M RPS)
This was the most dramatic failure. Some nodes reported receiving six million requests per second, generating over 100 Gbps of traffic per node. The trigger was bots competing for NFT mints through Metaplex Candy Machine. Validators ran out of memory and crashed sequentially, stalling consensus. This was simultaneously a networking failure (inability to handle traffic spikes), a resource management failure (no memory limits), and an economic design failure (no cost to spam).
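The admission control that was missing can be illustrated with a simple per-peer token bucket; the real remediation was more involved (QUIC, stake-weighted QoS, priority fees), and the rates here are invented:

```python
# A minimal token-bucket sketch of per-peer admission control: absorb a
# small burst, then drop excess requests instead of letting a traffic
# spike exhaust validator memory. All parameters are illustrative.
class TokenBucket:
    def __init__(self, rate, burst):
        self.rate = rate      # tokens refilled per second
        self.burst = burst    # maximum bucket size (burst allowance)
        self.tokens = burst
        self.last = 0.0

    def allow(self, now):
        """Return True if one request may pass at time `now` (seconds)."""
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # shed load at the edge rather than crash
```

A validator would keep one bucket per peer; a bot sending millions of requests per second simply has its excess dropped.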
Late 2022 – Duplicate Block Production
A backup “hot-spare” validator began producing duplicate blocks at the same height. A bug in fork selection logic prevented other validators from correctly resolving the fork, halting consensus. This was a classic distributed systems bug – the kind that only manifests in specific operational configurations that are difficult to test for.
2023: Stability Improves, But…
February 2023 – Oversized Block Propagation
A malfunctioning validator broadcast an unusually large block that overwhelmed Solana’s Turbine block propagation protocol. This cascaded into a network-wide outage. The fix required protocol-level changes to how blocks are shredded and distributed. This one is particularly relevant to the Firedancer discussion because Firedancer’s Turbine implementation is independent – it could have potentially avoided this specific failure mode.
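The missing guardrail amounts to validating block size at the shredding step. A sketch of that check, with invented constants (the real Turbine shred format and limits differ):

```python
# Split a serialized block into fixed-size shreds for propagation, and
# reject blocks that would produce too many shreds. Both constants are
# illustrative, not actual protocol parameters.
SHRED_SIZE = 1228             # bytes of payload per shred (assumed)
MAX_SHREDS_PER_BLOCK = 32768  # cap that an oversized block would violate

def shred_block(block: bytes) -> list[bytes]:
    """Shred a block, refusing oversized blocks up front."""
    n_shreds = -(-len(block) // SHRED_SIZE)  # ceiling division
    if n_shreds > MAX_SHREDS_PER_BLOCK:
        raise ValueError(
            f"block needs {n_shreds} shreds, limit is {MAX_SHREDS_PER_BLOCK}"
        )
    return [block[i:i + SHRED_SIZE] for i in range(0, len(block), SHRED_SIZE)]
```

Rejecting the block at this boundary keeps one malfunctioning leader from overwhelming every downstream node.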
2024: The Last Straw?
February 2024 – LoadedPrograms Bug (5-Hour Outage)
A bug in the LoadedPrograms cache caused validators to crash. The roughly five-hour outage required a coordinated validator restart. This is the most concerning outage from a client diversity perspective because it was a consensus-critical execution bug – exactly the kind of failure that an independent client should catch. But because Frankendancer shares Agave’s execution runtime, it would have been affected too.
What Pattern Do These Outages Reveal?
Looking at the data, Solana’s outages fall into three categories:
| Category | Outages | Would Firedancer Help? |
|---|---|---|
| Networking/Traffic handling | Jan 2022, Apr 2022 | Yes – Firedancer’s kernel bypass and custom QUIC stack directly address these |
| Consensus/Fork selection | Late 2022, Feb 2023 | Partially – Independent implementation might diverge on edge cases |
| Execution runtime bugs | Feb 2024 | No (Frankendancer) / Yes (Full Firedancer) |
This is the critical insight: Firedancer solves the right problems for 2022-era Solana, but the network’s failure modes have evolved. The networking-layer outages have been largely addressed through Agave improvements independent of Firedancer. The remaining risks are in the execution and consensus layers – precisely the components that Frankendancer shares with Agave.
The Reliability Numbers in Context
It is worth noting that Solana’s reliability has improved dramatically. Since the February 2024 outage, the network has maintained continuous uptime through record-breaking transaction volumes. The 2024-2025 period saw:
- Daily transaction counts regularly exceeding 50 million
- Sustained peak throughput above 4,000 TPS
- Zero major outages for over 12 months
This improvement came primarily from Agave client fixes, not from Firedancer adoption. Vote transaction prioritization, memory management improvements, and Turbine protocol hardening were all Agave-side changes.
Why a Second Client Is Still Critical
Despite the improved reliability, a second independent client remains essential for three reasons:
1. Systematic blind spots: Every codebase has bugs that its developers cannot see because they share the same mental model. An independent team implementing the same protocol from scratch will make different assumptions and catch different edge cases.
2. Operational resilience: If a zero-day vulnerability is discovered in Agave, validators need an alternative they can switch to immediately. Without Firedancer, the only option during an Agave critical vulnerability is to shut down the network entirely while a patch is developed.
3. Performance competition: Firedancer’s existence has already motivated Agave performance improvements. The competitive pressure between two client teams produces better software for the entire ecosystem.
The Bottom Line
Solana’s outage history strongly supports the need for client diversity. But the specific kind of diversity matters enormously. Frankendancer provides meaningful networking diversity that would likely have prevented the two networking-layer outages, and possibly some of the consensus failures. Full Firedancer would provide comprehensive diversity that addresses all three categories.
The current 21% Frankendancer stake is progress, but it is not the finish line. The ecosystem needs to push toward full Firedancer adoption and the 33% threshold as quickly as operational safety allows.
Lisa Rodriguez is an L2 scaling engineer and former infrastructure lead at Polygon and Optimism.