So You Want to Build a Blockchain
You realize one day that you don’t understand this whole blockchain thing. To rectify this, you decide to build one yourself. Excellent decision! Nothing beats jumping right in, writing some code, and watching important design decisions you’d never considered reveal themselves. But maybe you’re a bit lost, and would like a little guidance. I built a blockchain from scratch not so long ago. While I’m still in no way an expert in anything blockchain-related, I’d like to share my experience and make the beginning of your journey a little easier.
This is not a tutorial
Thinking through everything from first principles, stumbling into past solutions and realizing why they didn’t work, marveling at the ingenuity of current solutions when you get stuck and have to look them up – these are delightful experiences I don’t want to take away from you.
My problem with existing “build a blockchain” tutorials is that they
- Start in the middle,
- Are not blockchain ecosystem agnostic, and
- Send the wrong message that being a blockchain client is a single player game.
They usually tell you that you need to write a Blockchain class, with specific properties. Then they say you need to understand and implement Proof of Work. Maybe they say you need to calculate gas fees. Then you write a mining client. By the end, you would have written something that can generate a blockchain, but I’d argue that you wouldn’t have deepened your understanding substantially.
In this guide there is no solution, no code example, no step-by-step to follow. There are only questions to think about – and some directions that might help you answer them. That said, I do assume a few things: that you are familiar with “how blockchain works” at a high level, you’re familiar with related cryptographic ideas, and you have enough coding experience to code up your own thing given some general design directions.
Shhh… You’re not actually “building a blockchain”
There’s no such thing as only “building a blockchain” – what you’re building is a blockchain client: a node which, like many other nodes in the same blockchain ecosystem, participates in the consensus-seeking process. This node, and other nodes like it, together have to agree on what the “canonical” blockchain – a series of blocks, each containing a list of transactions – is. So how does it do that?
What tech stack should I use?
It’s not too important which language you use; you’re not building a full-blown, performant client. Use what you’re comfortable with.
Transactions could be anything
You might find yourself wondering which transaction model you should use: the Ethereum model, the Bitcoin UTXO model, or something else. The important thing is abstracting this away: having a standard format for your transactions, and knowing how to check whether a transaction is valid. Don’t get hung up on this detail. It’s a very, very small part of the system.
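If it helps to see what “abstracting this away” could look like, here is one minimal sketch in Python. Everything in it is an assumption made for illustration: the `Transaction` interface, the toy `Transfer` type and its fields, and the `tx_id` helper are hypothetical names, and signature checking is left out entirely. Feel free to skip it and design your own.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Protocol


class Transaction(Protocol):
    """Hypothetical interface: all the rest of the client needs to know."""

    def to_bytes(self) -> bytes: ...

    def is_valid(self) -> bool: ...


@dataclass(frozen=True)
class Transfer:
    """Toy account-style transaction (illustrative fields, no signature)."""

    sender: str
    recipient: str
    amount: int

    def to_bytes(self) -> bytes:
        # Sorted keys give one canonical serialization per transaction,
        # so equal transactions always produce equal bytes.
        return json.dumps(asdict(self), sort_keys=True).encode()

    def is_valid(self) -> bool:
        # Stateless checks only; balance checks need blockchain state.
        return self.amount > 0 and self.sender != self.recipient


def tx_id(tx: Transaction) -> str:
    """Identify a transaction by the SHA-256 hash of its serialized form."""
    return hashlib.sha256(tx.to_bytes()).hexdigest()
```

The rest of the client would only ever call `to_bytes`, `is_valid`, and `tx_id`, so swapping in a UTXO-style transaction later shouldn’t touch networking or mining code.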
What would a malicious party do?
It’s not enough to build a client that creates a blockchain in a harmonious fashion. To fully appreciate design decisions, you always have to assume that someone somewhere (or several someones) is malicious and will try to cheat the system. This simple assumption underlies many, if not most, blockchain design decisions. If all nodes were honest, we wouldn’t even need Proof of Work!
So what’s in a blockchain client?
Assume we’re building an all-purpose client: mining, validating, all that. Let’s think about what we need at a high level.
Clients need to talk to each other, over the internet, so you’ll need some networking protocol. This protocol could be anything – you could even create your own proprietary protocol! Networking isn’t the point of this exercise though, so I suggest going with something simple, like HTTP.
Let’s begin by thinking about what happens when you start up a brand new client for the first time.
- The first time a client starts, it’s not connected to any other clients (peers). How does it connect to the first peer? How can a malicious party take advantage of this? Can you bootstrap peer discovery without implicit trust / compromising security?
- How does it discover more peers?
- How many peers are enough peers?
- Should all peers be treated the same? Over time, you do gain information about each peer through what they send you. Should you do anything with this information?
We’ve only talked about blockchain clients, but there’s another group of participants in this ecosystem: blockchain users. Users have “wallets” (sets of public-private key pairs), sign transactions, and send signed transactions to clients so that these transactions can be included in an upcoming block. So now we have two different interfaces: between a user and a client, and between a client and another client.
User and client
At a bare minimum, a user should be able to send a signed transaction to a client. There’s almost definitely a crypto library in your language that can do this with a few lines of code. You can get fancy here and write a very responsive interface. This wasn’t the part I found most interesting, so I didn’t.
Client and client
Before we go further, let’s pause and consider why clients really need to talk to each other. As a client operates, it receives a stream of transactions from blockchain users who, somehow, are aware of its existence. These users might send their transactions to one client, or they might send the same transactions to a few clients – perhaps to make sure that their transactions aren’t “dropped” by any single client. Each client sees only a small subset of all transactions. And then, somehow, collectively, they all have to agree on a global ledger – a set of all transactions ever sent to the network, in a specific order.
Wait, can we just have each client forward all transactions it receives to all of its peers, repeatedly? Can’t we just have a “transaction chain” instead of a blockchain? What’s wrong with that?
I’ll leave the exercise of thinking about what’s wrong with having a “transaction chain” to you. In the meantime, we’re going to skip a few steps ahead in our reasoning, since we know what the “solution” is: to group transactions into blocks, to agree on what the next block is, and then to propagate the appointed new block to the entire network. So what do we need for this interface?
- Clients receive transactions. Should these transactions be passed along to other peers? How does a client know that a transaction it just received, either from a user or from another client, is not a duplicate?
- Clients also form new blocks (in PoW, this is through mining) or receive new blocks from other peers. To propagate a new block, a client needs to send it to its peers.
- But wait, there’s an edge case! When a client starts up for the first time, it has no blocks or transactions.
- It needs to get this information from its peers. But what does it need – the entire blockchain (this is very large)? Or can it get away with only the very last block?
- What if the peer is malicious, and sends it fake blocks? How do we protect against this?
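For the duplicate question in the list above, one common starting point is to identify each transaction by a hash of its serialized bytes and remember which hashes you’ve already seen. The sketch below is only that, a sketch: the `Mempool` class and its `add` method are hypothetical names, it assumes transactions arrive as raw bytes, and it ignores transactions that are already included in a block.

```python
import hashlib


class Mempool:
    """Pending transactions, keyed by the hash of their serialized bytes."""

    def __init__(self) -> None:
        self.pending: dict[str, bytes] = {}

    def add(self, raw_tx: bytes) -> bool:
        """Store raw_tx and return True only if it has not been seen before.

        A True result is the signal that the transaction is worth
        relaying to peers; a False result means it is a duplicate.
        """
        tx_hash = hashlib.sha256(raw_tx).hexdigest()
        if tx_hash in self.pending:
            return False
        self.pending[tx_hash] = raw_tx
        return True
```

Relaying only when `add` returns `True` also keeps a transaction from bouncing between two peers forever, which is worth thinking about before you wire up forwarding.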
Forming a new block
As a client, you gather a set of transactions sent to you by users or other clients. You want to make this the new block. But it’s no good that you and only you think it’s the new block; other clients have to agree.
If you’ve thought through the issues with a chain of transactions, you’ll realize that the same issues apply to a chain of blocks. To make this work, we need most of the network to agree on what the latest block is, before sending them yet another new block to agree on. In other words, we need a way to delay block creation, so that there’s time to reach consensus. Hence, Proof of Work.
We should realize here that the same solution applies to the transaction chain as much as it applies to the blockchain. We could apply Proof of Work to each transaction. But this would be terribly slow, as in one-transaction-per-block-discovery-time slow: one transaction per ten minutes for Bitcoin, or one transaction every thirteen seconds or so for Proof-of-Work Ethereum. In those blockchains, the difficulty of the PoW is set by the system to ensure that one block is mined every X amount of time, on average. We see now that the throughput of the system depends on block size. Why not just increase the max block size to increase transaction throughput?
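To make the throughput arithmetic concrete, here is a small back-of-the-envelope calculation. The ten-minute block interval is Bitcoin’s; the per-block transaction counts are made-up numbers chosen only to show the scaling, and `throughput_tps` is a hypothetical helper.

```python
def throughput_tps(txs_per_block: int, block_interval_s: float) -> float:
    """Average transactions per second for a given block size and interval."""
    return txs_per_block / block_interval_s


BITCOIN_INTERVAL_S = 600  # one block every ten minutes, on average

# Illustrative block sizes, not measurements from any real chain.
for txs_per_block in (1, 1_000, 3_000):
    tps = throughput_tps(txs_per_block, BITCOIN_INTERVAL_S)
    print(f"{txs_per_block:>5} tx/block -> {tps:.4f} tx/s")
```

With the interval fixed by the difficulty adjustment, throughput grows linearly with the number of transactions per block, which is exactly why increasing the block size looks so tempting.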
I actually couldn’t come up with a convincing answer and had to look this up. But with this knowledge, I can instead help guide you in thinking about this. Let’s think through what happens when blocks are bigger:
- Mining is a loop of just incrementing a number, computing a hash, and checking whether the hash meets the difficulty target. How does this change as blocks get bigger?
- We haven’t talked about this yet, but as clients mine new blocks or receive blocks from peers, they typically validate and execute all the transactions in the block. How does a bigger block size affect this?
- New blocks propagate by being sent over the internet from one client to its peers, ad infinitum (and, most likely, getting validated in between). How does a bigger block size affect this?
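If you’d like something to poke at while thinking about the first question, here is one possible shape for the mining loop. It is a simplified sketch under several assumptions: the header layout, the `difficulty_bits` parameter, and all function names are hypothetical, and the transactions are committed with a plain hash of hashes rather than a Merkle tree.

```python
import hashlib


def body_digest(transactions: list[bytes]) -> str:
    """Commit to all transactions with a single digest (hash of tx hashes)."""
    tx_hashes = b"".join(hashlib.sha256(tx).digest() for tx in transactions)
    return hashlib.sha256(tx_hashes).hexdigest()


def block_hash(prev_hash: str, digest: str, nonce: int) -> str:
    """Hash a toy header: parent hash, transaction digest, and nonce."""
    header = f"{prev_hash}:{digest}:{nonce}".encode()
    return hashlib.sha256(header).hexdigest()


def meets_target(hash_hex: str, difficulty_bits: int) -> bool:
    """True if the hash, read as a 256-bit integer, is below the target."""
    return int(hash_hex, 16) < (1 << (256 - difficulty_bits))


def mine(prev_hash: str, transactions: list[bytes], difficulty_bits: int) -> int:
    """Return the first nonce whose header hash meets the target."""
    digest = body_digest(transactions)  # computed once, outside the loop
    nonce = 0
    while not meets_target(block_hash(prev_hash, digest, nonce), difficulty_bits):
        nonce += 1
    return nonce
```

A client receiving the block only needs a single call to `block_hash` and `meets_target` to check the work, so finding a nonce is expensive while verifying one is cheap. Notice what is inside the loop and what is outside it when you think about how block size affects mining.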
As block size goes up, the amount of computing resources required to be competitive also goes up (definitely see for yourself that this is true!). Non-competitive clients can’t profit and don’t stick around. The number of clients therefore decreases, resulting in centralization of the network.
Ethereum processes ~15 transactions per second. This low number is seen by uninformed supporters of other L1 blockchains as proof of the ecosystem’s incompetence and stagnant growth. We see here, however, that Ethereum can in theory “easily” increase throughput by increasing the block size. But they made a philosophical decision not to, to prioritize decentralization. Many critics don’t see (or don’t care about) the trade-off.
How does this affect the simple blockchain you’re building? Well, it doesn’t have to! This train of thought is just where trains of thought naturally wander; you can choose your preferred level of complexity to implement in your project – anywhere between totally ignoring block sizes and writing a detailed simulation of what’d happen to the network when block size and propagation time vary.
Receiving new blocks
Sometimes, you also receive a new block from a peer. We just need to pass it on to other peers, simple! Right?
Not quite. The block could come from a malicious party, containing a transaction sending a billion dollars from some random account to a mysterious address. We don’t want that to happen, so we have to validate that the block is “correct.”
For that matter, how do you know that the transactions you included in your own mined blocks aren’t invalid? You have to validate that too. Or do you? Let’s think about this for a second. You definitely aren’t forced to do it: code is not law here, because you aren’t forced to use any provided blockchain client. You can write your own client with your own mining function that doesn’t care at all about validating transactions. Executing and validating transactions could take a lot of computing resources, so if you can just skip that part and focus your resources on mining, that would be awesome.
Again, I’ll leave the exercise of thinking through this to you. For me, this conceptually changed the way I think about blockchains – from “this is how blockchain works” to “there is no right way to do things, and there are very few things that you ‘must’ do.” But there are things that are game-theoretically advantageous to do – and thus the Nash equilibrium is that everyone does those things (validating being one example).
Validating a block is a bit more involved than just validating its transactions – you also check the PoW, for example. The most complex part, though, is validating transactions.
To check whether a transaction is valid, you’d often run into the problem of determining the latest state of the blockchain. Often, this is the balance of each address. However, it’s worth pointing out that depending on how you design your transactions, you might not need to determine the state (Bitcoin’s UTXO model is an example).
What we have so far is a ledger of transactions. How do we go from that to the state of the blockchain fast enough to be a competitive client?
- We could ask our peers for the state related to every transaction being validated. This is slow, vulnerable to malicious peers, and begs the question (how do these other clients know the state?).
- We could keep track of the state ourselves, locally. What does the memory requirement for this look like? The state also changes with every block; how do we manage that much memory?
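As a baseline for the second option, here is about the most naive way to go from a ledger to state: replay every block and keep a dictionary of balances. It assumes a toy account model; `Payment` and `apply_block` are hypothetical names, and signatures, fees, and mining rewards are all ignored.

```python
from typing import NamedTuple


class Payment(NamedTuple):
    """Toy account-model transaction (illustrative fields only)."""

    sender: str
    recipient: str
    amount: int


def apply_block(balances: dict[str, int], block: list[Payment]) -> dict[str, int]:
    """Return the balances after applying block, leaving the input untouched.

    Raises ValueError if any payment is non-positive or overdraws the sender,
    in which case the whole block is rejected.
    """
    new_balances = dict(balances)  # full copy: simple, but costly for large state
    for tx in block:
        if tx.amount <= 0 or new_balances.get(tx.sender, 0) < tx.amount:
            raise ValueError(f"invalid payment: {tx}")
        new_balances[tx.sender] = new_balances.get(tx.sender, 0) - tx.amount
        new_balances[tx.recipient] = new_balances.get(tx.recipient, 0) + tx.amount
    return new_balances
```

Copying the whole dictionary for every block makes it trivial to discard a bad block or switch to a competing one, but it is exactly the memory problem the question above is pointing at. How would you avoid the copy?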
The Ethereum solution to this is pretty cool. After you’ve attempted solving this problem yourself, I highly suggest looking it up.
The canonical chain
We’ve thought through most major elements of a blockchain now, except for an important one: how the “true” blockchain, or the canonical chain, is determined. This problem is of high interest to us as miners, since work done on an orphaned “fork” is wasted.
So far we’ve assumed that all new blocks a client receives will be canonical, but this isn’t true. Two different blocks can be successfully mined at the same time, resulting in a race to propagate across the network. Maybe some clients are suffering network issues and delays, so the new block you’ve just received is actually a child block of something two blocks back (“uncle block”). In practice, clients are not keeping track of a single blockchain; they’re keeping track of many different blockchains, all competing to be canon. So, which chain should you mine from?
There’s also a practical need for network consensus around the canonical chain. Otherwise, different clients would have different “canons”; each canon incorporates a different set of transactions, and thus has different blockchain states. A user requesting their balance from a node would get a different balance depending on which node answers the request. Then the same user might send transactions that are valid according to one client, but invalid according to another – anyway, total chaos.
Having said all that, I’m not going to offer or discuss any solution in this guide (and solutions are numerous online, anyway!). I simply aim to point out that this is a problem you’ll have to think about.
To build a blockchain (or more accurately, a blockchain client), you need:
- A way for clients to talk to each other
- An interface – who can talk to whom, and what can they say to each other?
- A way to receive new transactions, form new blocks from those transactions, and forward new blocks to the entire network
- A way to verify that blocks and transactions are valid
- Reasonable certainty that the network will “converge” – it’s no good if half of the network does one thing, a quarter another thing, and the remaining quarter yet another thing
- Finally, you need to do all that while being resilient to malicious attacks
What you build might end up looking a lot like existing blockchains (or a simple version thereof). But that’s not the point. The point is to think through designs and trade-offs yourself, from first principles. The journey, not the destination. Enjoy!