
Bittensor's SN3 Bets the Network on a Trillion-Parameter Training Run

11 min read
Dora Noda
Software Engineer

In March 2026, several dozen anonymous miners on home internet connections trained a 72-billion-parameter language model that scored within striking distance of Meta's Llama 2 70B. A month later, the team that led that effort walked out, dumped $10 million worth of TAO, and called Bittensor's decentralization "theatre." Now the surviving community wants to do it again — at fourteen times the scale, in roughly four weeks, with the entire decentralized AI thesis riding on the result.

This is the story of how Bittensor's Subnet 3 — recently rebranded Teutonic after the Covenant AI exit — talked itself into a 1-trillion-parameter training run timed to land squarely in Grayscale's TAO ETF SEC review window. It's a wager that the protocol's incentive layer is more important than the people who built it, and that the same network that survived a governance crisis can ship the "DeepSeek moment" for decentralized AI before regulators decide whether to let Wall Street buy in.

How a 72B model became the high-water mark for permissionless AI

The story starts on March 10, 2026, when Subnet 3 — then operating under the name Templar — announced Covenant-72B, a 72-billion-parameter model trained on roughly 1.1 trillion tokens by more than 70 independent miners coordinating across the public internet. It was, by a wide margin, the largest decentralized LLM pre-training run ever completed.

The benchmark that mattered: an MMLU score of 67.1, putting Covenant-72B in the same neighborhood as Meta's Llama 2 70B — a model produced by one of the best-funded AI labs on the planet. NVIDIA CEO Jensen Huang publicly compared the effort to a "modern folding@home for AI." Templar's subnet token surged, and at peak its market valuation crossed $1.5 billion.

The technical breakthrough wasn't the model architecture. It was the coordination layer. Two pieces did the heavy lifting:

  • SparseLoCo, a communication-efficient training algorithm that reduced inter-node bandwidth requirements by 146x through sparsification, 2-bit quantization, and error feedback (a toy sketch of the compression loop follows this list). Without it, a frontier-scale training run on residential internet would be physically impossible — gradient sync alone would saturate every miner's connection.
  • Gauntlet, Bittensor's blockchain-validated incentive system that scored each miner's contribution via loss evaluation and OpenSkill rankings, paying TAO to the high-quality nodes and slashing the rest.
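
SparseLoCo's actual implementation isn't reproduced here, but the general technique it names is well established: keep only the largest-magnitude gradient entries, quantize them aggressively, and carry everything the compressed message lost forward as an error-feedback residual. A minimal PyTorch sketch, with illustrative shapes and ratios that are assumptions rather than Templar's tuned parameters:

```python
import torch

class CompressedGradSync:
    """Toy sketch of SparseLoCo-style compression: top-k sparsification,
    crude low-bit quantization, and error feedback. Names, ratios, and
    shapes are illustrative assumptions, not Templar's actual code."""

    def __init__(self, numel: int, keep_ratio: float = 0.01):
        self.residual = torch.zeros(numel)       # error-feedback buffer
        self.k = max(1, int(numel * keep_ratio))  # entries kept per sync

    def compress(self, grad: torch.Tensor):
        # Fold in the error left over from the previous step.
        corrected = grad.flatten() + self.residual
        # Sparsify: keep only the k largest-magnitude entries.
        idx = corrected.abs().topk(self.k).indices
        vals = corrected[idx]
        # Crude low-bit quantization: signs plus one shared scale.
        scale = vals.abs().mean()
        q = vals.sign()  # transmit only (idx, q, scale)
        # Error feedback: remember what this message failed to carry.
        decompressed = torch.zeros_like(corrected)
        decompressed[idx] = q * scale
        self.residual = corrected - decompressed
        return idx, q, scale

    @staticmethod
    def decompress(idx, q, scale, numel: int) -> torch.Tensor:
        out = torch.zeros(numel)
        out[idx] = q * scale
        return out
```

Gauntlet's loop is the economic mirror image of this one: validators periodically score each miner's contribution against evaluation loss, feed the results into OpenSkill rankings, and route TAO emissions toward the top of the table while slashing the bottom.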

Together they produced something genuinely new: a permissionless network of anonymous contributors, coordinating only through cryptographic incentives, training a model competitive with billion-dollar lab outputs.

Then it broke.

The Covenant exit: $900 million erased in twelve hours

On April 10, 2026, Sam Dare — founder of Covenant AI, the team behind three of Bittensor's most valuable subnets (SN3 Templar, SN39 Basilica, and SN81 Grail) — announced he was leaving. Within hours he liquidated approximately 37,000 TAO, roughly $10.2 million, and published a parting accusation: that co-founder Jacob Steeves ("Const") wielded centralized control over the protocol, and that Bittensor's decentralization was performance, not architecture.

The market reaction was immediate. TAO crashed 20–28% depending on the measurement window, erasing roughly $650–900 million in market cap inside a 12-hour span. Subnet alpha tokens fared worse — Grail (SN81) was down 67% at the bottom. Around $10 million in long positions were liquidated.

Two facts blunted the panic:

  1. The subnets didn't die. Community miners restarted SN3, SN39, and SN81 from open-source code without a central operator. The infrastructure Covenant built was, in fact, recoverable from the public artifacts — which arguably proves the decentralization thesis Dare disputed.
  2. 70% of TAO supply remained staked through the disruption. Long-term holders didn't follow Dare to the exit.

But the network had a credibility problem. If Covenant — the team that delivered Bittensor's marquee technical achievement — could leave at the top and crater the token, what stops the next subnet operator from doing the same?

The Conviction Mechanism: locking in the people who can leave

Const's response landed on April 20, 2026, ten days after Dare walked. BIT-0011, branded the Conviction Mechanism, proposes a Locked Stake regime that forces subnet owners to time-lock TAO for months or years in exchange for a "conviction score" that maps to voting rights and subnet ownership.

The mechanics (a toy decay model is sketched after the list):

  • The conviction score starts at 100% and decays over 30-day intervals if tokens aren't replenished into the lock-up.
  • Voting power and ownership rights diminish in lockstep with the decay, making sudden capital flight economically expensive rather than just embarrassing.
  • The system targets the mature subnets first — SN3, SN39, and SN81 — exactly the three that Covenant ran.
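
BIT-0011's exact decay schedule isn't spelled out in public materials cited here, so the following is a minimal sketch under stated assumptions: multiplicative decay per 30-day interval, a full reset on replenishment, and voting power proportional to conviction. The decay rate and reset rule are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ConvictionScore:
    """Toy model of BIT-0011's conviction score. The 0.9 decay rate
    and full-reset-on-replenish rule are assumptions for illustration."""
    score: float = 1.0               # starts at 100%
    decay_per_interval: float = 0.9  # hypothetical: lose 10% per 30 days

    def tick_30_days(self, replenished: bool) -> float:
        # Topping up the lock-up restores full conviction; neglect decays it.
        self.score = 1.0 if replenished else self.score * self.decay_per_interval
        return self.score

def voting_power(locked_tao: float, conviction: float) -> float:
    # Voting power and ownership rights diminish in lockstep with decay.
    return locked_tao * conviction

# An owner who stops feeding the lock-up:
cs = ConvictionScore()
for _ in range(6):
    cs.tick_30_days(replenished=False)
print(f"conviction after 6 intervals: {cs.score:.2f}")  # ~0.53 at 0.9/interval
```

Whatever the final constants, the design intent is visible in the shape of the curve: an owner who stops replenishing watches their governance weight bleed out over months rather than vanishing in a single exit.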

The dark joke: BIT-0011 was reportedly drafted by Sam Dare himself before his exit. The departing founder wrote the rules designed to prevent founders from departing.

The proposal addresses a real structural weakness — subnet operators could previously dump positions with no governance penalty — but it also concentrates power in the hands of long-term lockers, which is its own form of centralization. Whether that's the right trade depends on what you think Bittensor's main risk is: founder defection or oligarchic capture.

Teutonic and the trillion-parameter moonshot

Against that backdrop, the rebranded Teutonic subnet (SN3, formerly Templar) has publicly committed to a 1-trillion-parameter decentralized training run targeted for mid-to-late May 2026. That's roughly 14x the scale of Covenant-72B, on the same fundamental architecture, with a community-restored team rather than the original Covenant engineers.

The strategic timing is impossible to miss. Grayscale filed its S-1 amendment for the spot Bittensor Trust ETF (proposed ticker GTAO) on NYSE Arca on April 2, 2026. The SEC's decision window is currently tracked for August 2026. A successful 1T-parameter training run in May would land at the peak of regulator deliberation — exactly when "is this a real technology or a meme?" becomes the load-bearing question. Grayscale already raised TAO's weighting inside its broader AI fund to 43.06% on April 7, the largest single-asset reallocation that fund has ever made.

The bull case writes itself: ship a credible 1T-parameter decentralized model, become the "DeepSeek moment" the ETF approval needs to justify institutional inflow, and reprice the entire decentralized AI category in one quarter.

The bear case is engineering, not marketing.

Why scaling decentralized training is hard in ways frontier labs don't face

Centralized 1T+ models — GPT-5, Claude 4.7 Opus, Gemini 2.5 Ultra — are trained inside facilities where every GPU is wired to every other GPU through purpose-built fabrics like NVLink and InfiniBand, with sub-microsecond latencies and terabit-per-second bandwidth. Even in those conditions, gradient synchronization is the bottleneck. Published research consistently finds that over 90% of LLM training time can be spent on communication rather than compute when scaling is naive.

Teutonic's miners are coordinating across ~100ms WAN latencies on residential internet. The only reason Covenant-72B was possible at all is SparseLoCo's 146x compression of communication volume. Pushing to 1T parameters changes the math in three uncomfortable ways (a back-of-envelope estimate follows the list):

  1. Gradient size scales roughly linearly with parameter count. A 14x model means 14x as much data to synchronize per step, even before considering optimizer state.
  2. Cross-node coordination overhead historically scales super-linearly with worker count. If Teutonic grows its node pool from ~70 to ~256, the all-reduce communication cost rises faster than the node count — by 4–10x depending on topology.
  3. Failure modes compound. A node dropping out mid-step in a 70-node network is a small slashing event. In a 256-node network running 14x larger gradients, the same drop can stall the entire training round.
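
A back-of-envelope estimate makes the pressure concrete. Assuming bf16 gradients, the reported 146x compression ratio, a ring all-reduce, and a 100 Mbit/s residential uplink — all round-number assumptions, not Teutonic's published figures:

```python
# Per-step gradient traffic for a 1T-parameter model.
# All numbers are illustrative assumptions, not Teutonic's actuals.

params = 1e12                    # 1T parameters
bytes_per_grad = 2               # bf16 gradient entries
raw_bytes = params * bytes_per_grad   # 2 TB per full gradient
compressed = raw_bytes / 146          # SparseLoCo's reported 146x
link_bps = 100e6 / 8             # 100 Mbit/s uplink, in bytes/sec

# A ring all-reduce moves roughly 2*(n-1)/n of the payload per node.
n = 256
per_node_bytes = 2 * (n - 1) / n * compressed

print(f"raw gradient:         {raw_bytes / 1e12:.1f} TB")
print(f"compressed:           {compressed / 1e9:.1f} GB")
print(f"per-node traffic:     {per_node_bytes / 1e9:.1f} GB/step")
print(f"sync time at 100Mbps: {per_node_bytes / link_bps / 60:.0f} min/step")
```

Under these assumptions a single naive synchronization takes over half an hour per step even after 146x compression — which is why the research directions discussed next, such as communication-computation overlap and infrequent synchronization, exist at all.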

None of this is unsolvable. There's a body of decentralized training research — heterogeneous low-bandwidth pre-training, FusionLLM, communication-computation overlap, delayed gradient compensation — that targets exactly this regime. But almost all of it has been validated at the 7B–70B scale. A 1T-parameter run on geographically distributed commodity hardware would be a research contribution in its own right, not just a product launch.

The honest read: Teutonic is taking on a research-grade engineering challenge with a marketing-grade deadline. Either it works and becomes the credibility event the entire dTAO ecosystem needs, or it stalls publicly during the SEC's most attentive review window.

The decentralized AI training landscape Teutonic must survive

Teutonic isn't the only project trying to claim the "credible decentralized 1T-param" milestone in 2026. The competitive map is filling out fast:

  • Gensyn launched its mainnet on April 22, 2026 — the same day this article goes out — pairing the launch with Delphi Markets, an AI-driven matching layer for compute jobs. By close of day Gensyn was reporting compute equivalent to 5,000+ NVIDIA H100s. Where Bittensor sells permissionless coordination plus a token-incentive flywheel, Gensyn is positioning as a verifiable AI compute marketplace with cryptographic proofs of correct execution.
  • Ritual has gone in the opposite direction, leaning into inference rather than training. Its Infernet technology lets any smart contract request an AI output and receive cryptographic proof that the specified model was used unmodified. That's the "verifiable AI in DeFi" thesis, not the "train frontier models from scratch" thesis.
  • Ambient and Origins Network are making adjacent bets — different incentive designs, different verification strategies, similar long-term goal of breaking centralized labs' monopoly on frontier training.

These projects don't directly compete on the same milestone, but they all compete for the same finite pool of attention and capital. If Gensyn's mainnet captures the "decentralized AI is here" narrative through commercial workloads, Teutonic's May training run becomes a referendum on whether Bittensor's specific approach — subnet competition plus token-weighted incentives — is the right architecture or the first iteration that gets surpassed.

Why this matters beyond TAO

Three things are getting tested simultaneously over the next four to six weeks:

Whether decentralized training scales. If Teutonic succeeds, the "Bitcoin of decentralized AI compute" thesis survives. If it fails, the Covenant exit reads as the moment subnet-based training peaked — a 72B ceiling rather than a 72B foundation.

Whether the Conviction Mechanism is the right governance fix. Locking in subnet operators prevents another Covenant-style dump but creates a new failure mode where long-term lockers can entrench. Bitcoin Core's distributed maintainer model, Solana Labs' continued centralized core development, and Sui's Mysten Labs concentration are three different answers to the same question — whether protocol complexity demands a strong central maintainer the community must trust. Bittensor is now running its own version of that experiment in real time.

Whether the ETF window forces decentralized AI to ship on TradFi's calendar. The SEC's August decision window is a hard deadline for a narrative that wants to be "DeepSeek moment" rather than "interesting research project." That's a healthy forcing function or a recipe for over-promising — depending on what gets shipped.

For builders watching from the infrastructure side, the underlying signal is simpler: AI agents and decentralized training networks are about to generate a new tier of on-chain query load — model registry lookups, attestation proofs, gradient checkpoint hashes, subnet performance data — that doesn't fit neatly into the human-facing dApp pattern existing RPC infrastructure was built for.
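
One concrete instance of that machine-generated load: posting a checkpoint digest on-chain so third parties can verify which artifact a subnet actually trained. A minimal sketch, with a hypothetical file path and workflow:

```python
import hashlib
from pathlib import Path

def checkpoint_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a (potentially multi-GB) checkpoint file through SHA-256
    so the digest can be posted on-chain as a training attestation.
    The file name and attestation workflow are illustrative assumptions."""
    h = hashlib.sha256()
    with Path(path).open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

# e.g. digest = checkpoint_digest("step_004200/model_shard_00.pt")
```

Each training round can emit thousands of such writes and lookups, with none of the human pacing that consumer-facing RPC traffic usually assumes.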

BlockEden.xyz provides enterprise-grade RPC and indexing infrastructure across 27+ chains for teams building the AI-meets-crypto stack. Explore our API marketplace to build on rails designed for both human and machine traffic.
