Ethereum Scaling Solutions

The debate on scaling is not specific to a single blockchain and has come up on multiple occasions regarding Bitcoin in the past. Right now, however, the debate focuses mostly on Ethereum, its upcoming merge and relatively new layer 2 solutions – which we’ll concentrate on in this article.

Given the number of people using Ethereum, and the corresponding increase in transactions and dApp interactions, the blockchain has reached its capacity limit of around 10-20 transactions per second (TPS). In comparison, the Visa network processes around 1,700 transactions per second, with a claimed theoretical capacity of up to 24,000 TPS (link). Because block space is limited, transactions on Ethereum compete for inclusion in a block, and rising demand pushes up the price paid to execute them (referred to as ‘gas fees’). Figure 1 shows the Ether price (in grey) and average transaction fees (in blue). At their peaks, average fees for a single transaction were as high as $40. Fees at that level make small, everyday transactions uneconomical, so broader adoption of the Ethereum blockchain requires increased transaction throughput, lower fees, and faster transaction speeds.

Figure 1: Historical Average Transaction Fees on Ethereum. Source: Messari

The main reason for Ethereum’s low transaction speed and throughput is its focus on decentralization and security. Ethereum co-founder Vitalik Buterin famously described the “scalability trilemma” (see Figure 2): a blockchain usually has to trade off decentralization, security, and scalability against one another. He goes on to describe “sharding” (which we’ll look at below) as the solution to this problem at the blockchain level (link). In addition to scaling the Ethereum blockchain itself, it’s also possible to decrease the number of transactions on the chain, either by bundling them together or by executing them on a separate blockchain altogether. What these techniques have in common is their aim: to let the system handle an exponential increase in usage without sacrificing functionality. That means increasing transaction throughput and speed and reducing transaction fees, while retaining and relying on the security guarantees and decentralization of the base layer (Ethereum).

Figure 2: The Scalability Trilemma. Source: Vitalik.eth

Types of scaling solutions

As mentioned, there are two main categories of scaling solution: ‘layer 1’ (scaling the blockchain itself) and ‘layer 2’ (offloading part of the computation / execution). First, we’ll quickly discuss two main ways of scaling the blockchain itself, then dive into various layer 2 solutions.

Layer 1 scaling

Increasing block size: the most straightforward way to scale any blockchain is simply to increase the permitted block size. Ethereum bounds the size of a block through a gas limit rather than a fixed number of bytes: under EIP-1559, each block targets a set amount of gas (15 million at the time of writing) and may consume at most twice that. This limit is set low on purpose, since it allows almost any computer to participate in the network as a full node, and thus improves decentralization. Increasing the block size would allow miners to include more transactions, and in turn (linearly) increase the transaction throughput of the blockchain. At the same time, however, such an increase would harm decentralization: as block size grows, the chain grows faster and more bandwidth is required to run a full node. In the extreme case, only specialized supercomputers could manage to run a full node. For this reason, the Ethereum community has ruled out larger block sizes as a scaling solution. It’s interesting to note that similar discussions have regularly arisen in the Bitcoin community, even leading to forks such as Bitcoin Cash and Bitcoin SV (link).
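To make the link between block fullness and fees concrete, here is a minimal sketch of the base fee update rule from EIP-1559, referenced above. The two constants come from the EIP; the 15 million gas target and the 100 gwei starting fee are illustrative assumptions, not live network values.

```python
# Constants defined in EIP-1559.
BASE_FEE_MAX_CHANGE_DENOMINATOR = 8  # base fee moves by at most 1/8 (12.5%) per block
ELASTICITY_MULTIPLIER = 2            # a block may use up to 2x its gas target


def next_base_fee(parent_base_fee: int, parent_gas_used: int, parent_gas_target: int) -> int:
    """Return the base fee (in wei) of the next block, given the parent block's usage."""
    if parent_gas_used == parent_gas_target:
        return parent_base_fee
    if parent_gas_used > parent_gas_target:
        # Parent block was fuller than the target: raise the fee (by at least 1 wei).
        gas_delta = parent_gas_used - parent_gas_target
        fee_delta = max(
            parent_base_fee * gas_delta // parent_gas_target // BASE_FEE_MAX_CHANGE_DENOMINATOR,
            1,
        )
        return parent_base_fee + fee_delta
    # Parent block was emptier than the target: lower the fee.
    gas_delta = parent_gas_target - parent_gas_used
    fee_delta = parent_base_fee * gas_delta // parent_gas_target // BASE_FEE_MAX_CHANGE_DENOMINATOR
    return parent_base_fee - fee_delta


GWEI = 10**9
GAS_TARGET = 15_000_000  # assumed target, for illustration only

after_full_block = next_base_fee(100 * GWEI, GAS_TARGET * ELASTICITY_MULTIPLIER, GAS_TARGET)
after_empty_block = next_base_fee(100 * GWEI, 0, GAS_TARGET)
print(after_full_block / GWEI, after_empty_block / GWEI)  # 112.5 87.5
```

A run of completely full blocks therefore raises the base fee by 12.5% per block, which is one way sustained demand turns into the fee spikes described earlier.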

Sharding: in essence, this refers to increasing scalability by executing transactions in parallel instead of sequentially on a single chain (the way Ethereum currently operates; see link). This is achieved by dividing the chain into smaller chains (or ‘shards’). On each shard, a set of collators is responsible for approving sets of transactions concurrently, and the approved sets are subsequently added to the main chain. This mechanism can significantly increase the TPS without sacrificing decentralization or security. Figure 3 provides an overview of sharding; more detail can be found in Vitalik’s blog (link).
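The idea of splitting work across shards can be illustrated with a toy sketch. It is not how Ethereum implements sharding: the shard count, the assignment of transactions by sender hash, and the `process_shard` stand-in for a collator are all assumptions made for illustration, and cross-shard transactions are ignored entirely.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 4  # hypothetical shard count


def shard_of(account: str) -> int:
    """Assign an account to a shard deterministically, based on a hash of its name."""
    digest = hashlib.sha256(account.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SHARDS


def process_shard(txs: list) -> str:
    """Stand-in for a collator: 'approve' a shard's transactions and summarize them in a digest."""
    summary = hashlib.sha256()
    for sender, recipient, amount in txs:
        summary.update(f"{sender}->{recipient}:{amount}".encode())
    return summary.hexdigest()


# Twenty made-up transfers: (sender, recipient, amount).
transactions = [(f"account{i}", f"account{i + 1}", i) for i in range(20)]

# Split the transactions by the sender's shard.
shards = {shard_id: [] for shard_id in range(NUM_SHARDS)}
for tx in transactions:
    shards[shard_of(tx[0])].append(tx)

# Each shard is processed concurrently rather than one transaction after another.
with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
    shard_digests = list(pool.map(process_shard, shards.values()))

# The main chain only records a commitment to each shard's result.
main_chain_entry = hashlib.sha256("".join(shard_digests).encode()).hexdigest()
print(len(shard_digests), "shard digests committed in", main_chain_entry[:16])
```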

Figure 3: One of the many depictions of a sharded version of Ethereum. Source: Original diagram by Hsiao-wei Wang, design by Quantstamp.

Layer 2 scaling

This refers to scaling solutions deriving their security from Ethereum but handling transactions off-chain. Sidechains will be described in this article as part of layer 2 scaling, as they also handle transactions off-chain – even though these usually do not, or only to a limited degree, derive their security from Ethereum.

Most layer 2 solutions rely on nodes (with varying terminology, such as ‘validator’ and ‘sequencer’) that collect the associated transactions, instead of submitting them directly to layer 1 (i.e. Ethereum Mainnet). These transactions are then settled off-chain, or batched together into groups before being anchored on layer 1, which makes them unalterable. There are many implementations of this, each with its own pros and cons and optimal use cases. Whatever their differences, they all increase transaction throughput and reduce network congestion on Mainnet, resulting in a more accessible Ethereum.

Rollups: rollup-based layer 2 solutions move transaction execution outside of layer 1 and combine, or ‘roll up’, several executed transactions into a batch. These batches can consist of hundreds of transactions, and related transaction data is then sent back to Mainnet. This substantially increases transaction throughput, while validity checks and anchoring to layer 1 guarantee security. In addition, rollups usually feature fast transaction confirmations, as they don’t need full inclusion on layer 1 for sufficient finality.
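A rough worked example shows why batching lowers the cost per transaction. The 21,000 gas figure is the intrinsic cost of a plain ETH transfer on layer 1; the batch overhead and per-transaction data cost below are hypothetical round numbers chosen for illustration, not measurements of any particular rollup.

```python
L1_TRANSFER_GAS = 21_000  # intrinsic gas cost of a plain ETH transfer on layer 1

# Hypothetical figures, for illustration only.
BATCH_OVERHEAD_GAS = 200_000  # fixed layer 1 cost of posting one batch
DATA_GAS_PER_TX = 200         # layer 1 cost of the compressed data for one transaction


def rollup_gas_per_tx(batch_size: int) -> float:
    """Layer 1 gas attributable to each transaction when a batch is shared by `batch_size` users."""
    if batch_size < 1:
        raise ValueError("a batch needs at least one transaction")
    return BATCH_OVERHEAD_GAS / batch_size + DATA_GAS_PER_TX


for size in (1, 10, 100, 1000):
    print(size, rollup_gas_per_tx(size))
```

Under these assumptions, a batch of one costs far more per transaction than a direct layer 1 transfer, while a batch of a few hundred costs a small fraction of it: the fixed overhead is spread across everyone in the batch.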

There are two main categories of rollups, which differ on how transactions are computed and anchored to Ethereum:

Optimistic rollups: these are based on the (optimistic) assumption that transactions are valid by default, and thus there is no need to calculate a validity proof for each transaction bundle. To ensure that only valid transactions are executed, they offer the option to challenge any transaction via a fraud proof, which would trigger the respective computation to confirm validity. Ethereum layer 2s using optimistic rollups include Optimism (link) and Arbitrum (link). More information on optimistic rollups can be found here (link).
Zero-knowledge rollups: ZK-rollups are more complex, since they perform an off-chain computation to create a cryptographic proof that the transactions they include are valid. This is done through a Succinct Non-Interactive Argument of Knowledge (SNARK) or a Scalable Transparent Argument of Knowledge (STARK). Because every transaction bundle carries such a proof, validity is established up front rather than assumed, which makes ZK-rollups even more secure, at the cost of increased complexity (resulting in limited functionality) and slower execution. Ethereum layer 2s working with ZK-rollups include zkSYNC (link), Loopring (link), and StarkWare (link). More on ZK-rollups can be found here (link).
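The optimistic approach can be sketched as follows: a batch is accepted together with a claimed resulting state, and the computation is only re-run if someone challenges it. This is a toy model under stated assumptions. The state is a plain dictionary of balances, the ‘state root’ is a hash of its JSON encoding (real rollups commit to state with Merkle structures), and `challenge` stands in for the on-chain dispute process.

```python
import hashlib
import json


def apply_batch(balances: dict, batch: list) -> dict:
    """Execute transfers in order, skipping any that would overdraw the sender."""
    state = dict(balances)
    for sender, recipient, amount in batch:
        if state.get(sender, 0) >= amount:
            state[sender] -= amount
            state[recipient] = state.get(recipient, 0) + amount
    return state


def state_root(balances: dict) -> str:
    """Toy commitment to the state: a hash of its canonical JSON encoding."""
    return hashlib.sha256(json.dumps(balances, sort_keys=True).encode()).hexdigest()


def challenge(pre_state: dict, batch: list, claimed_root: str) -> bool:
    """Re-execute the batch; return True if the claimed result is shown to be wrong."""
    return state_root(apply_batch(pre_state, batch)) != claimed_root


pre_state = {"alice": 10, "bob": 0}
batch = [("alice", "bob", 4)]

honest_root = state_root(apply_batch(pre_state, batch))
fraudulent_root = state_root({"alice": 10, "bob": 100})  # operator invents funds for bob

print(challenge(pre_state, batch, honest_root))      # False: the claim holds up
print(challenge(pre_state, batch, fraudulent_root))  # True: the fraud proof succeeds
```

A ZK-rollup would instead attach a validity proof to every batch, so no challenge step is needed; generating such proofs is beyond what a short sketch can show.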

State channels: these are, essentially, joint multi-signature smart contracts between two parties that hold locked funds and execute only with the approval of the required parties. Each small transaction takes the form of a part-signed transaction, which the other party could countersign and propagate to the network at any time to close the channel. This allows for many bilateral transactions, with only two of them required on-chain: an entry (open state channel) transaction and an exit (close state channel) transaction. A simple way to think of state channels is the concept of an ‘open tab’, where numerous orders can be made before settling the final bill, with that final transaction recorded on layer 1.
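The ‘open tab’ idea can be sketched in a few lines. This is a simplified model, not a real channel protocol: `ToyChannel` is a hypothetical class, both parties sign every update at once, and HMAC with a shared secret stands in for the digital signatures a real channel would use.

```python
import hashlib
import hmac


def sign(secret: bytes, message: bytes) -> str:
    """Stand-in for a digital signature (HMAC is used here only to keep the sketch short)."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


class ToyChannel:
    """Two-party channel: many off-chain updates, two on-chain transactions."""

    def __init__(self, deposits: dict, secrets: dict) -> None:
        self.secrets = secrets
        self.balances = dict(deposits)
        self.nonce = 0
        self.on_chain_txs = 1  # the opening transaction that locks the deposits
        self._sign_state()

    def _message(self) -> bytes:
        # The nonce orders the states, so an old state cannot pass for the latest one.
        return f"{self.nonce}|{sorted(self.balances.items())}".encode()

    def _sign_state(self) -> None:
        message = self._message()
        self.signatures = {party: sign(secret, message) for party, secret in self.secrets.items()}

    def pay(self, sender: str, recipient: str, amount: int) -> None:
        """Off-chain update: adjust balances and have both parties sign the new state."""
        if amount <= 0 or self.balances[sender] < amount:
            raise ValueError("invalid or unfunded payment")
        self.balances[sender] -= amount
        self.balances[recipient] += amount
        self.nonce += 1
        self._sign_state()

    def close(self) -> dict:
        """On-chain exit: check both signatures on the latest state, then settle it."""
        message = self._message()
        for party, secret in self.secrets.items():
            if not hmac.compare_digest(self.signatures[party], sign(secret, message)):
                raise ValueError(f"missing or invalid signature from {party}")
        self.on_chain_txs += 1
        return dict(self.balances)


channel = ToyChannel({"alice": 50, "bob": 50}, {"alice": b"alice-key", "bob": b"bob-key"})
for _ in range(10):
    channel.pay("alice", "bob", 3)  # ten payments, none of them on-chain

final_balances = channel.close()
print(final_balances, channel.on_chain_txs)  # {'alice': 20, 'bob': 80} 2
```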

Well-known projects building on state channels are Celer Network (link), Raiden Network (link), and probably the most famous project using a similar concept of payment channels: the Lightning Network on Bitcoin (link).

Plasma: this approach scales through a concept of ‘child chains’, that is, separate blockchains anchored to the Ethereum Mainnet by a smart contract, which in turn locks up Ether. Child chains are built on fraud proofs, a concept similar to the optimistic rollups covered earlier. The main difference is that the separate blockchain uses its own mechanism for block validation and, to gain security, publishes a cryptographic proof for each block on the Ethereum Mainnet (think of small proofs that can be used to verify information about the included transactions). There are some limitations, however, because transaction data is kept off-chain, unlike with rollups.
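The ‘small proofs’ mentioned above are commonly built as Merkle trees, and the following sketch assumes that construction. It shows how a single hash published per block can later be used to check that a given transaction was included, without publishing the transactions themselves. Plasma implementations differ in their details; the function names here are illustrative.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves: list, index: int) -> tuple:
    """Return the Merkle root of `leaves` and an inclusion proof for `leaves[index]`."""
    if not leaves or not 0 <= index < len(leaves):
        raise ValueError("need at least one leaf and a valid index")
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        sibling = index ^ 1          # the neighbour sharing the same parent
        proof.append((level[sibling], sibling < index))  # (hash, sibling is on the left)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf: bytes, proof: list, root: bytes) -> bool:
    """Recompute the path from `leaf` up to the root using only the sibling hashes."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


child_chain_block = [b"tx0", b"tx1", b"tx2", b"tx3", b"tx4"]
root, proof = merkle_root_and_proof(child_chain_block, 2)  # only `root` would go on Mainnet

print(verify(b"tx2", proof, root))     # True
print(verify(b"forged", proof, root))  # False
```

The sketch also hints at the data availability limitation: the root proves nothing to a user who cannot obtain the underlying transactions and sibling hashes from the child chain's operator.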

Plasma was one of the first concepts for layer 2 scaling, but it has not gained the traction of other solutions. One example project is Gluon Network (link).

Validium: just as Plasma pairs the fraud proofs of optimistic rollups with off-chain data availability, Validium pairs the validity proofs of ZK-rollups with off-chain data availability (instead of publishing everything on-chain). Again, this involves some tradeoffs, due to the reliance on external data providers, but it also leads to higher transaction throughput.

Examples of Validium-based scaling solutions are Immutable X (link) and, again, StarkWare (link), which can operate as either a ZK-rollup or a Validium.

Sidechains: not a layer 2 solution in the strict sense, sidechains are independent blockchains using their own consensus mechanism, block parameters and incentives, while remaining Ethereum Virtual Machine (EVM)-compatible in order to support smart contracts. These chains can therefore interact with Ethereum Mainnet, relying on cross-chain bridges to transfer assets between chains. The biggest difference from layer 2 solutions is that sidechains don’t enjoy any security assurances from Ethereum.
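One common bridge design is ‘lock and mint’: assets are locked on one chain and an equal amount of a wrapped token is issued on the other. The sketch below assumes that design. `ToyBridge` and its methods are hypothetical names, and a real bridge would also need some way for each chain to verify events on the other, which is left out here.

```python
class ToyBridge:
    """Lock-and-mint bridge between Mainnet and a sidechain (illustrative only)."""

    def __init__(self) -> None:
        self.locked_on_mainnet = 0      # funds held by the bridge contract on Mainnet
        self.wrapped_on_sidechain = {}  # user -> wrapped token balance on the sidechain

    def deposit(self, user: str, amount: int) -> None:
        """Lock funds on Mainnet and mint the same amount of wrapped tokens on the sidechain."""
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.locked_on_mainnet += amount
        self.wrapped_on_sidechain[user] = self.wrapped_on_sidechain.get(user, 0) + amount

    def withdraw(self, user: str, amount: int) -> None:
        """Burn wrapped tokens on the sidechain and release the same amount on Mainnet."""
        if amount <= 0 or self.wrapped_on_sidechain.get(user, 0) < amount:
            raise ValueError("insufficient wrapped balance")
        self.wrapped_on_sidechain[user] -= amount
        self.locked_on_mainnet -= amount

    def is_fully_backed(self) -> bool:
        """Every wrapped token should correspond to locked funds on Mainnet."""
        return self.locked_on_mainnet == sum(self.wrapped_on_sidechain.values())


bridge = ToyBridge()
bridge.deposit("alice", 5)
bridge.withdraw("alice", 2)
print(bridge.locked_on_mainnet, bridge.wrapped_on_sidechain, bridge.is_fully_backed())
# 3 {'alice': 3} True
```

In this model nothing on Mainnet checks what happens to the wrapped tokens on the sidechain, which mirrors the point that sidechains do not inherit Ethereum's security.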

Well-known examples include Polygon (link) and SKALE (link). To an extent, independent layer 1 blockchains with EVM-compatibility can also function as Ethereum sidechains. Examples include Binance Smart Chain (link), Avalanche (link), and Fantom (link).

Why layer 2 scaling solutions are needed

Having discussed several ways to scale Ethereum, let’s recap why we need them. As outlined in the introduction, the layer 1 blockchain has limited capacity, and any increase in block size would potentially reduce decentralization. Introducing layer 2 scaling makes it possible to offload transactions from the main chain while still benefiting from the underlying chain’s security guarantees. Ultimately, this results in:

less congestion on the network and lower transaction fees
faster transactions and increased throughput
preserved security, since these solutions build on Ethereum’s guarantees and do not negatively impact decentralization

As we have seen, many projects are working on scalability. This is an advantage: each of them helps reduce congestion, and having multiple approaches avoids a single point of failure. It also allows for tailored, application-specific solutions suited to different types of project.