Pipelining in Solana: The Transaction Processing Unit
To get to sub-second confirmation times and the transactional capacity required for Solana to become the world’s first web-scale blockchain, it’s not enough to just form consensus quickly. The team had to develop a way to quickly validate massive blocks of transactions, while quickly replicating them across the network. To achieve this, the process of transaction validation on the Solana network makes extensive use of an optimization common in CPU design called pipelining.
Pipelining is an appropriate process when there’s a stream of input data that needs to be processed by a sequence of steps and there’s different hardware responsible for each. The quintessential metaphor to explain this is a washer and dryer that wash/dry/fold several loads of laundry in sequence. Washing must occur before drying and drying before folding, but each of the three operations is performed by a separate unit.
To maximize efficiency, one creates a pipeline of stages. We’ll call the washer one stage, the dryer another, and the folding process a third. To run the pipeline, one adds a second load of laundry to the washer just after the first load is added to the dryer. Likewise, the third load is added to the washer after the second is in the dryer and the first is being folded. In this way, one can make progress on three loads of laundry simultaneously. Given infinite loads, the pipeline will consistently complete a load at the rate of the slowest stage in the pipeline.
“We needed to find a way to keep all hardware busy all the time. That’s the network cards, the CPU cores and all the GPU cores. To do it, we borrowed a page from CPU design”, explains Solana Founder and CTO Greg Fitzgerald. “We created a four stage transaction processor in software. We call it the TPU, our Transaction Processing Unit.”
On the Solana network, the pipeline mechanism — Transaction Processing Unit — progresses through Data Fetching at the kernel level, Signature Verification at the GPU level, Banking at the CPU level, and Writing at the kernel space. By the time the TPU starts to send blocks out to the validators, it’s already fetched in the next set of packets, verified their signatures, and begun crediting tokens.
The Validator node simultaneously runs two pipelined processes, one used in leader mode called the TPU and one used in validator mode called the TVU. In both cases, the hardware being pipelined is the same, the network input, the GPU cards, the CPU cores, writes to disk, and the network output. What it does with that hardware is different. The TPU exists to create ledger entries whereas the TVU exists to validate them.
“We knew that signature verification was going to be a bottleneck, but also that it’s this context-free operation that we could offload to the GPU,” says Fitzgersald. “Even after offloading this most expensive operation, there’s still a number of additional bottlenecks, such as interacting with the network drivers and managing the data dependencies within smart contracts that limit concurrency.”
Between the GPU parallelization in this four-stage pipeline, at any given moment, The Solana TPU can be making progress on 50,000 transactions simultaneously. “This can all be achieved with an off-the-shelf computer for under $5000,” explains Fitzgerland. “Not some supercomputer.”
With the GPU offloading onto Solana’s Transaction Processing Unit, the network can affect single node efficiency. Achieving this has been the goal of Solana since inception.
“The next challenge is to somehow get the blocks somehow from the leader node out to all the validator nodes, and to do it in a way that doesn’t congest the network and bring throughput to a crawl,” continues Fitzgerald. “For that, we’ve come up with a block propagation strategy that we call Turbine.
“With Turbine, we structure the Validator nodes into multiple levels, where each level is at least twice the size of the one above it. By having this structure, these distinct levels, confirmation time ends up being proportional to the height of the tree and not the number of nodes in it, which is far greater. Every time the network doubles in size, you’ll see a small bump in confirmation time, but that’s it.”
In addition to technological implementations like Pipelining, there are several key innovations that make Solana’s web-scale blockchain functionality possible. For a deeper understanding of them all, you can read about them on the Solana blog:
7 key innovations that make the Solana network possible:
- Proof of History (POH)— a clock before consensus;
- Tower BFT — a PoH-optimized version of PBFT;
- Turbine — a block propagation protocol;
- Gulf Stream— Mempool-less transaction forwarding protocol;
- Sealevel — Parallel smart contracts run-time;
- Cloudbreak— Horizontally-Scaled Accounts Database; and
- Replicators— Distributed ledger store