Directed Acyclic Graph (DAG) in Blockchain
What is a DAG and How Does it Differ from a Blockchain?
A Directed Acyclic Graph (DAG) is a type of data structure consisting of vertices (nodes) connected by directed edges that never form a cycle. In the context of distributed ledgers, a DAG-based ledger organizes transactions or events in a web-like graph rather than a single sequential chain. This means that unlike a traditional blockchain where each new block references only one predecessor (forming a linear chain), a node in a DAG may reference multiple previous transactions or blocks. As a result, many transactions can be confirmed in parallel, rather than strictly one-by-one in chronological blocks.
To illustrate the difference, if a blockchain looks like a long chain of blocks (each block containing many transactions), a DAG-based ledger looks more like a web of individual transactions. Every new transaction in a DAG can attach to (and thereby help validate) one or more earlier transactions, instead of waiting to be packaged into the next single block. This structural difference leads to several key distinctions:
- Parallel Validation: In blockchains, miners/validators add one block at a time to the chain, so transactions are confirmed in batches per new block. In DAGs, multiple transactions (or small “blocks” of transactions) can be added concurrently, since each can attach to different parts of the graph. This parallelization means DAG networks don’t have to wait for a single long chain to grow one block at a time.
- No Global Sequential Order: A blockchain inherently creates a total order of transactions (every block has a definite place in one sequence). A DAG ledger, by contrast, forms a partial order of transactions. There is no single “latest block” that all transactions queue for; instead, many tips of the graph can coexist and be extended simultaneously. Consensus protocols are then needed to eventually sort out or agree on the order or validity of transactions in the DAG.
- Transaction Confirmation: In a blockchain, transactions are confirmed when they are included in a mined/validated block and that block becomes part of the accepted chain (often after more blocks are added on top). In DAG systems, a new transaction itself helps confirm previous transactions by referencing them. For example, in IOTA’s Tangle (a DAG), each transaction must approve two previous transactions, effectively having users collaboratively validate each other’s transactions. This removes the strict division between “transaction creators” and “validators” that exists in blockchain mining – every participant issuing a transaction also does a bit of validation work.
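The idea that new transactions confirm older ones by referencing them can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transaction IDs, the `approves` mapping, and the `approvers_of` helper are hypothetical, and real systems such as IOTA add signatures, proof-of-work, and conflict checks that are omitted here.

```python
from collections import deque

# Hypothetical toy ledger: each transaction ID maps to the IDs it approves.
approves = {
    "genesis": [],
    "a": ["genesis"],
    "b": ["genesis"],
    "c": ["a", "b"],
    "d": ["b", "c"],
}

def approvers_of(tx_id, approves):
    """Return the transactions that directly or indirectly approve tx_id."""
    # Invert the edges: for each transaction, record who references it.
    referenced_by = {tx: [] for tx in approves}
    for tx, parents in approves.items():
        for parent in parents:
            referenced_by[parent].append(tx)

    # Breadth-first walk "forwards" from tx_id along the inverted edges.
    seen = set()
    queue = deque([tx_id])
    while queue:
        current = queue.popleft()
        for child in referenced_by[current]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

print(sorted(approvers_of("b", approves)))     # ['c', 'd']
print(len(approvers_of("genesis", approves)))  # 4
```

The more later transactions build on a given transaction, the larger this set grows, which is the intuition behind treating heavily referenced transactions as more strongly confirmed.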
Importantly, a blockchain is actually a special case of a DAG – a DAG that has been constrained to a single chain of blocks. Both are forms of distributed ledger technology (DLT) and share goals like immutability and decentralization. However, DAG-based ledgers are “blockless” or multi-parent in structure, which gives them different properties in practice. Traditional blockchains like Bitcoin and Ethereum use sequential blocks and often discard any competing blocks (forks), whereas DAG ledgers attempt to incorporate and arrange all transactions without discarding any, as long as they’re not conflicting. This fundamental difference lays the groundwork for the contrasts in performance and design detailed below.
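A minimal sketch can make the "blockchain as a special case of a DAG" point concrete. The `DagLedger` class and its method names are hypothetical and leave out everything a real ledger needs (signatures, consensus, conflict resolution); the sketch only tracks parent links and the current tips.

```python
class DagLedger:
    """Toy append-only ledger in which each entry names one or more parents."""

    def __init__(self):
        self.parents = {"genesis": ()}  # entry ID -> tuple of parent IDs
        self.tips = {"genesis"}         # entries that nothing references yet

    def add(self, entry_id, parent_ids):
        if entry_id in self.parents:
            raise ValueError(f"duplicate entry: {entry_id}")
        if not parent_ids:
            raise ValueError("every new entry needs at least one parent")
        unknown = [p for p in parent_ids if p not in self.parents]
        if unknown:
            # Only existing entries may be referenced, so every edge points
            # to something older and a cycle can never form.
            raise ValueError(f"unknown parents: {unknown}")
        self.parents[entry_id] = tuple(parent_ids)
        self.tips.difference_update(parent_ids)
        self.tips.add(entry_id)

# A DAG: two entries extend genesis in parallel, then a third merges them.
dag = DagLedger()
dag.add("tx1", ["genesis"])
dag.add("tx2", ["genesis"])
print(sorted(dag.tips))  # ['tx1', 'tx2']
dag.add("tx3", ["tx1", "tx2"])
print(sorted(dag.tips))  # ['tx3']

# A blockchain: the same structure restricted to exactly one parent per entry.
chain = DagLedger()
chain.add("block1", ["genesis"])
chain.add("block2", ["block1"])
print(sorted(chain.tips))  # ['block2']
```

With one parent per entry there is always exactly one tip, which is the familiar "latest block"; allowing several parents is what lets multiple tips coexist and later be merged.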
Technical Comparison: DAG vs. Blockchain Architecture
To better understand DAGs vs blockchains, we can compare their architectures and validation processes:
- Data Structure: Blockchains store data in blocks linked in a linear sequence (each block contains many transactions and points to a single previous block, forming one long chain). DAG ledgers use a graph structure: each node in the graph represents a transaction or an event block, and it can link to multiple previous nodes. This directed graph has no cycles, meaning that if you follow the links “backwards” you can never loop back to the transaction you started from. The lack of cycles guarantees that a topological ordering exists (a way to sort the transactions so that each one comes after every transaction it references). In short, a blockchain is a single linear chain, while a DAG is a branching graph in which each node may have several parents.
- Throughput and Concurrency: Because of the structural differences, blockchains and DAGs handle throughput differently. A blockchain, even under optimal conditions, adds blocks one by one (often waiting for each block to be validated and propagated network-wide before the next one). This inherently limits transaction throughput: commonly cited figures are roughly 5–7 transactions per second (TPS) for Bitcoin and ~15–30 TPS for Ethereum under its classic proof-of-work design. DAG-based systems, by contrast, allow many new transactions/blocks to enter the ledger concurrently. Multiple branches of transactions can grow simultaneously and later mesh together, substantially increasing potential throughput. Some modern DAG networks claim throughput in the thousands of TPS, which would approach or exceed the capacity of traditional payment networks.
- Transaction Validation Process: In blockchain networks, transactions wait in a mempool and are validated when a miner or validator packages them into a new block, then other nodes verify that block against the history. In DAG networks, validation is often more continuous and decentralized: each new transaction carries out a validation action by referencing (approving) earlier transactions. For example, each transaction in IOTA’s Tangle must confirm two previous transactions by checking their validity and doing a small proof-of-work, thereby “voting” for those transactions. In Nano’s block-lattice DAG, each account’s transactions form their own chain and are validated via votes by representative nodes (more on this later). The net effect is that DAGs spread out the work of validation: rather than a single block producer validating a batch of transactions, every participant or many validators concurrently validate different transactions.
- Consensus Mechanism: Both blockchains and DAGs need a way for the network to agree on the state of the ledger (which transactions are confirmed and in what order). In blockchains, consensus often comes from Proof of Work or Proof of Stake producing the next block and the rule of “longest (or heaviest) chain wins”. In DAG ledgers, consensus can be more complex since there isn’t a single chain. Different DAG projects use different approaches: some use gossip protocols and virtual voting (as in Hedera Hashgraph) to come to agreement on transaction order, others use Markov Chain Monte Carlo tip selection (IOTA’s early approach) or other voting schemes to decide which branches of the graph are preferred. We will discuss specific consensus methods in DAG systems in a later section. Generally, a DAG can sustain higher throughput while the network reaches agreement, but it requires careful design to handle conflicts (like double-spend attempts), since multiple transactions can exist in parallel before a final ordering is agreed.
- Fork Handling: In a blockchain, a “fork” (two blocks mined at nearly the same time) results in one branch eventually winning (the longest or heaviest chain) and the other being orphaned (discarded), which wastes any work done on the orphan. In a DAG, the philosophy is to accept forks as additional branches of the graph rather than waste them. The DAG will incorporate both forks; the consensus algorithm then determines which transactions end up confirmed (or how conflicting transactions are resolved) without throwing away all of one branch. This means mining power or effort is not wasted on stale blocks, contributing to efficiency. For example, Conflux’s Tree-Graph (a PoW DAG) attempts to include all blocks in the ledger and orders them, rather than orphaning any, with the aim of utilizing 100% of produced blocks.
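The topological ordering mentioned under Data Structure can be illustrated with Python's standard-library `graphlib` module (Python 3.9+). The transaction IDs and the `references` mapping are made up for this sketch; it shows only that an acyclic reference graph can always be linearized, not how any particular ledger picks its final order.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical transaction IDs; each maps to the set of transactions it references.
references = {
    "genesis": set(),
    "a": {"genesis"},
    "b": {"genesis"},
    "c": {"a", "b"},
    "d": {"c"},
}

# static_order() yields each transaction only after everything it references.
order = list(TopologicalSorter(references).static_order())
position = {tx: i for i, tx in enumerate(order)}
assert all(position[p] < position[tx]
           for tx, parents in references.items() for p in parents)
print(order[0], order[-1])  # genesis d

# With a cycle, no such ordering exists and graphlib reports it.
try:
    list(TopologicalSorter({"x": {"y"}, "y": {"x"}}).static_order())
except CycleError:
    print("cycle detected")
```

Note that "a" and "b" do not reference each other, so either may come first: the graph defines only a partial order, and choosing one agreed total order among the valid ones is the job of the consensus protocol.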
In summary, blockchains offer a simpler, strictly ordered structure where validation is block-by-block, whereas DAGs provide a more complex graph structure allowing asynchronous and parallel transaction processing. DAG-based ledgers must employ additional consensus logic to manage this complexity, but they promise significantly higher throughput and efficiency by utilizing the network’s full capacity rather than forcing a single-file queue of blocks.
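The fork-handling contrast can be shown with a small sketch. The block IDs and the `ancestors_inclusive` helper are hypothetical, and the example does not reproduce Conflux's or any other project's actual ordering rule; it only compares which blocks remain part of the ledger when the next block may name one parent versus several.

```python
# Hypothetical block IDs mapped to their parent lists. "b2a" and "b2b" were
# produced at about the same time on top of "b1" (a fork).
chain_parents = {
    "b1": [],
    "b2a": ["b1"],
    "b2b": ["b1"],
    "b3": ["b2a"],         # a chain block can name only one parent
}
dag_parents = {
    "b1": [],
    "b2a": ["b1"],
    "b2b": ["b1"],
    "b3": ["b2a", "b2b"],  # a DAG block can reference both fork branches
}

def ancestors_inclusive(tip, parents):
    """Return tip plus every block reachable by following parent links."""
    seen, stack = set(), [tip]
    while stack:
        block = stack.pop()
        if block not in seen:
            seen.add(block)
            stack.extend(parents[block])
    return seen

kept_chain = ancestors_inclusive("b3", chain_parents)
kept_dag = ancestors_inclusive("b3", dag_parents)
print(sorted(set(chain_parents) - kept_chain))  # ['b2b'] is orphaned
print(sorted(set(dag_parents) - kept_dag))      # [] nothing is discarded
```

In the chain case the work spent on "b2b" is lost, while in the DAG case both fork blocks stay in the history and the consensus layer still has to decide how any conflicting transactions inside them are resolved.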