Bitcoin, Cryptocurrency, Ethereum: you must have heard these buzzwords, and maybe the stories of people making a quick fortune out of them. And why not? Look at the market capitalization of so-called ‘cryptocurrencies’ and how it has grown in the last year. Blockchain is the idea that drives them all.
Looking at the chart, you might think that you are late to the game, but a Blockchain-101 now could be like figuring out the internet in 2000: not too late.
This article is an attempt to provide an easy yet technical explanation of the blockchain concept. Hopefully, it will help you understand the core fundamentals of this-new-thing-in-town!
“What is money?”
Money is an entity which can be used in exchange for goods and services. Then there is another system to keep track of its ownership and transactions — who owns what, who has what, and who owes how much to whom. That is all that money is.
We need a trusted third party to keep track of money, to record those transactions, and to deal with conflicts when they arise. But that trusted party, such as the government, comes at a cost in terms of inefficiency, potential for corruption, extra fees and so forth.
In simple terms, let’s follow these steps and see how money flowed in one specific scenario, the USA in 2008, where the trust model did not work that well:
This is what happened in the crisis of 2008. Banks giving bad loans was the cause. Printing the money was a mitigation which helped in this specific case.
Six weeks after the crisis, on Nov 01, 2008, something happened that is arguably going to reshape the financial world much more than the crisis itself: a person (or group) using the name Satoshi Nakamoto announced a decentralized (crypto)currency known as Bitcoin, based on a novel concept known as blockchain. The idea was to create a world where no central authority controls all the money.
“I’ve been working on a new electronic cash system that’s fully peer-to-peer, with no trusted third party.” - Satoshi Nakamoto
Bitcoin has its problems, and it still needs to be tested by a crisis like 2008, but it is just one use case (blockchain-as-a-payment); its problems don’t defeat the purpose of blockchain or its potential to cut out middlemen in different aspects of our lives. With the historical aspects covered, let’s get into the technology behind the blockchain.
A blockchain is a decentralized, distributed and incorruptible digital ledger that is used to record transactions across many computers.
A bit complicated? Ok, Blockchain in simple terms is a:
And what is a cryptocurrency? Is it the same as blockchain?
Blockchain and cryptocurrency are not the same, but the terms are often used interchangeably because of the invention that links them, i.e. Bitcoin.
Blockchain is a concept, and cryptocurrencies are applications that use that concept to solve real problems. The complete list of cryptocurrencies by market cap can be found here.
I read a line somewhere that neatly summarizes the relationship:
“Cryptocurrency is to blockchain, what email is to internet”.
We will discuss generic concepts, but once in a while we will take examples from Bitcoin to understand the blockchain better.
So, let’s break down the aspects of blockchain and understand them one by one:
Everything stored on the blockchain is protected by cryptography. In Bitcoin, for example, the ledger itself is publicly readable; it is hashed and signed rather than hidden. BUT cryptographic primitives alone are not complete security and don’t make the ledger incorruptible by any means.
So, what does blockchain do to make itself secure? There are three concepts we need to understand:
1. Hash Functions:
[Figure: A typical hash function]
A hash function needs to have some properties:
Well, looking at the image above, there is no way that a hash function can be collision free, right? Yes, that’s true: there are more possible inputs than outputs, so some inputs must share a hash. The goal is only to ensure that it is practically infeasible for available machines to find such a collision.
This property is used at the core of the blockchain, and we will discuss how in a while.
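As a small illustration of what a cryptographic hash function does, here is a sketch using SHA-256 from Python's standard `hashlib` module. The helper name `sha256_hex` is made up for this example; the two inputs are the first two strings from the proof-of-work example later in this article.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

a = sha256_hex("Hello, world!0")
b = sha256_hex("Hello, world!1")

print(a)
print(b)
print(len(a), len(b))                      # fixed-size output: 64 hex characters each
print(a == sha256_hex("Hello, world!0"))   # deterministic: same input, same digest
print(a == b)                              # a one-character change gives a different digest
```

Changing a single character of the input produces a digest that looks unrelated to the original, which is what makes hashes useful for detecting tampering.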
2. Digital Signatures: Digital signatures are like our normal signatures in digital form. They need to have this basic property:
So, given a message signed with your secret key (technically, your private key), there is a publicly available key (i.e. the public key) which anyone can use to confirm that you wrote this message. No attacker can forge your signature unless your private key is compromised.
This property is used in the blockchain to ensure that only the rightful owner can transfer assets from their account, while anybody on the network can validate the transaction. Bitcoin uses ECDSA for digital signatures.
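To make the sign-with-private-key, verify-with-public-key idea concrete, here is a toy sketch. It is NOT ECDSA and NOT secure: it uses unpadded "textbook" RSA with a tiny key built from two known Mersenne primes, purely because that fits in a few lines of standard-library Python (3.8+). All names (`sign`, `verify`, `digest_as_int`) are assumptions made for this illustration.

```python
import hashlib

# Toy key pair from two known Mersenne primes. Far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)

def digest_as_int(message: str) -> int:
    """Hash the message and reduce it into the range [0, N)."""
    h = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(h, "big") % N

def sign(message: str) -> int:
    """Only the holder of the private exponent D can compute this."""
    return pow(digest_as_int(message), D, N)

def verify(message: str, signature: int) -> bool:
    """Anyone who knows the public pair (N, E) can run this check."""
    return pow(signature, E, N) == digest_as_int(message)

sig = sign("Pay Bob 5 coins")
print(verify("Pay Bob 5 coins", sig))     # True: message and signature match
print(verify("Pay Bob 500 coins", sig))   # False: the message was altered
```

Real systems use much larger keys, padding schemes, or elliptic-curve algorithms such as the ECDSA that Bitcoin uses; the shape of the interface (private `sign`, public `verify`) is the part that carries over.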
3. Hash Pointers:
A hash pointer is another useful data structure that blockchain technologies rely on.
A standard pointer is just the address of the stored data, whereas a hash pointer also lets us check whether the data has been corrupted, because it keeps a cryptographic hash (the hash function that we discussed earlier) of the data itself.
[Figure: Hash pointers]
If someone changes the data in the block, the hash of that block will no longer match the hash stored in the hash pointer, so the corruption is easy to detect.
So, this is how it is used in the blockchain along with a hash function to avoid corruption.
Block 1 (from the left) contains its data and a hash pointer to the previous block. Block 2 contains its own data plus a pointer to, and the hash of, the first block.
The key point is that each block’s hash covers both its own data and the hash of the previous block.
H(this-block) = H(H(previous-block) + data-in-this-block)
Now, what happens if someone plays with the first block?
[Figure: Tampered results after the first block]
If a hacker modifies the content of the first block, the hash stored in block 2 no longer matches and we can easily see that something has changed. So, the attacker also has to modify the hash stored in block 2. But in that case, the hash stored in the third block fails to match, and so on, until the hacker has modified every block in the chain up to the last pointer.
Effectively, altering any unit of information on the blockchain would mean using a huge amount of computing power to override the entire network. And any node can check whether a transaction is valid by following all the previous blocks in the chain.
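The chaining rule H(this-block) = H(H(previous-block) + data-in-this-block) and the tampering scenario can be sketched in a few lines. This is a minimal illustration, assuming blocks are plain dictionaries and an all-zero placeholder stands in for the hash before the first block; the function names are invented for the example.

```python
import hashlib

GENESIS_PREV = "0" * 64   # assumed placeholder hash "before" the first block

def block_hash(prev_hash: str, data: str) -> str:
    """H(this-block) = H(H(previous-block) + data-in-this-block)."""
    return hashlib.sha256((prev_hash + data).encode("utf-8")).hexdigest()

def build_chain(items):
    chain, prev = [], GENESIS_PREV
    for data in items:
        h = block_hash(prev, data)
        chain.append({"prev_hash": prev, "data": data, "hash": h})
        prev = h
    return chain

def first_invalid_block(chain):
    """Return the index of the first block that fails validation, or None."""
    prev = GENESIS_PREV
    for i, block in enumerate(chain):
        if block["prev_hash"] != prev or block_hash(prev, block["data"]) != block["hash"]:
            return i
        prev = block["hash"]
    return None

chain = build_chain(["A pays B 5", "B pays C 2", "C pays D 1"])
print(first_invalid_block(chain))   # None: the untouched chain is consistent

chain[0]["data"] = "A pays B 500"   # tamper with the first block
print(first_invalid_block(chain))   # 0: its stored hash no longer matches

# The attacker "fixes" block 0 by recomputing its hash...
chain[0]["hash"] = block_hash(chain[0]["prev_hash"], chain[0]["data"])
print(first_invalid_block(chain))   # 1: ...but now block 1's pointer is stale
```

Each repair pushes the mismatch one block further down the chain, which is the cascade described above.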
There are better data structures that ensure we don’t have to traverse every single record to check the authenticity of a transaction. The one used by Bitcoin and Ethereum is known as a Merkle tree.
[Figure: Merkle tree]
The idea is that, instead of a purely linear list, the transactions in a block are hashed pairwise into a tree-like structure, so we can quickly validate a transaction by following a single path between the root and a leaf rather than checking every record linearly. The basic concept, though, is similar to what we discussed for the linear structure.
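As a rough sketch of the Merkle tree idea, the code below hashes a list of transactions pairwise up to a single root, then proves that one transaction belongs to the tree using only one sibling hash per level. It is a simplification, not Bitcoin's exact scheme: it uses a single SHA-256 per node, duplicates the last hash when a level has an odd count, and all function names are assumptions for this example.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_levels(leaves):
    """Return every level of the tree, from the leaf hashes up to the root."""
    level = [h(x.encode("utf-8")) for x in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:            # odd count: duplicate the last hash
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def merkle_proof(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1                # neighbour within the same pair
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return proof

def verify_proof(leaf, proof, root):
    """Recompute the root from one leaf and its sibling hashes."""
    current = h(leaf.encode("utf-8"))
    for sibling, sibling_is_left in proof:
        current = h(sibling + current) if sibling_is_left else h(current + sibling)
    return current == root

txs = ["tx-a", "tx-b", "tx-c", "tx-d", "tx-e"]
levels = merkle_levels(txs)
root = levels[-1][0]

proof = merkle_proof(levels, 2)            # prove that "tx-c" is in the tree
print(len(proof))                          # 3 sibling hashes for 5 transactions
print(verify_proof("tx-c", proof, root))   # True
print(verify_proof("tx-x", proof, root))   # False
```

The proof grows with the height of the tree rather than with the number of transactions, which is why a verifier does not need to look at every record.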
There is a network formed by a random set of people who can voluntarily join by putting their machines to help others — just kidding — for the incentives. So, everybody is responsible for keeping the entire set of data and keeping checks on each other (using what we discussed in the “crypto” section). This is a basic peer-to-peer technology.
So, if it is decentralized & distributed, you may ask these obvious questions:
That is the right set of questions, and we will discuss them in the next section, but first let’s focus on the distributed part of the system.
Given that everybody has the entire set of records, you, as a hacker, don’t have a centralized place where you can go and modify the data like you can with a bank. These records are also publicly available. So, you have an idea of how much money I have (kind of, since I am identified only by a long, random-looking string rather than by my name). Can you change it? All you can do is try, fail, and realize that it is not practically possible.
Well, with those benefits of distributed ledger comes the major problem of consensus among those nodes.
Consensus is a very hard problem in computer science because of the failure cases (network failures and delays, system failures, etc.), and a great deal of research has gone into solving something as simple-sounding as “reaching an agreed-upon value”. Blockchain approaches this problem in a very innovative way, i.e. with incentives.
Incentives & Proof-of-work
So, in this distributed system, how do we decide who gets to decide the next transaction and why would anyone do that? And how do we ensure that this is not a bad (or malicious) node which is picking the bad transactions?
In a blockchain, the addition of a new block (or, let’s say, transaction) and the creation of new coins are managed by computers performing a computational activity called Proof-of-Work.
Each block contains multiple transactions, and the first transaction in every block is a special transaction that starts a new coin owned by the creator of the block: whenever a new block is added by an honest computer, that computer gets the reward. This adds an incentive for nodes to support the network, and provides a way to initially distribute coins into circulation, since there is no central authority to issue them. The steady addition of a constant amount of new coins is analogous to gold miners expending resources to add gold to circulation. In this case, it is CPU time and electricity that is expended.
“Can we all add blocks and earn a bunch of money?” Not really.
The system (think of it as software that anyone can install) poses a mathematical problem, and the first machine to solve it gets to add the next block to the chain and receives the monetary benefit for doing so (i.e. bitcoins, in the case of the Bitcoin network). The rest of the machines just replicate the ledger and validate its authenticity.
“What sort of problem?”, you may ask.
It’s a kind of trial-and-error problem where you have to guess an input to a hash function that produces a specific kind of output. In very simplistic terms, you have to guess a word whose hash (SHA-256) value starts with a predetermined prefix.
To take an example from Wikipedia, the blockchain system wants the computing machine to find a variation of the string “Hello, world!” that hashes to a value beginning with ‘0000’.
How do you solve this?
We vary the string by appending an integer value, called a nonce, to the end, incrementing it each time and checking whether the hashed value starts with ‘0000’. Finding a match for “Hello, world!” takes us 4251 tries (nonces 0 through 4250):
Attempt1 => "Hello, world!0" => 1312af178c253f84028d480a6adc1e25e81caa44c749ec81976192e2ec934c64
Attempt2 => "Hello, world!1" => e9afc424b79e4f6ab42d99c81156d3a17228d6e1eef4139be78e948a9332a7d8
Attempt3 => "Hello, world!2" => ae37343a357a8297591625e7134cbea22f5928be8ca2a32aa475cf05fd4266b7
...
Attempt4249 => "Hello, world!4248" => 6e110d98b388e77e9c6f042ac6b497cec46660deef75a55ebc7cfdf65cc0b965
Attempt4250 => "Hello, world!4249" => c004190b822f1669cac8dc37e761cb73652e7832fb814565702245cf26ebb9e6
Attempt4251 => "Hello, world!4250" => 0000c3af42fc31103f1fdc0151fa747ff87349a4714df7cc52ea464e12dcd4e9
This is a simple example and can easily be solved, but the puzzle can be made harder (for example, by requiring more leading zeroes) to ensure that machines take time to solve it.
The suffix that actually solved the puzzle, i.e. the “nonce”, is also stored alongside the data so that the nodes that follow can validate the solution.
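The nonce search described above can be sketched as a simple loop. This is only an illustration of the “Hello, world!” puzzle, not of real Bitcoin mining (which hashes a block header and compares it against a numeric target); the function names `mine` and `is_valid` are invented for the example.

```python
import hashlib

def mine(text: str, prefix: str = "0000"):
    """Try nonces 0, 1, 2, ... until SHA-256(text + nonce) starts with `prefix`."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{text}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

def is_valid(text: str, nonce: int, prefix: str = "0000") -> bool:
    """Checking a claimed solution takes a single hash."""
    digest = hashlib.sha256(f"{text}{nonce}".encode("utf-8")).hexdigest()
    return digest.startswith(prefix)

nonce, digest = mine("Hello, world!")
print(nonce, digest)
print(is_valid("Hello, world!", nonce))
```

If the hashes listed above are accurate, the loop should stop at the same nonce as the last attempt shown. Note the asymmetry: finding the nonce takes thousands of hashes, while verifying it takes one, and each extra required zero multiplies the expected search effort by 16.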
A couple of additional things to note here:
This number ‘x’ for Bitcoin is ~21 million. Different blockchain applications handle this aspect differently: for example, at the time of this writing, Ethereum is not limited to a fixed number, and various other currencies use a different algorithm called Proof-of-Stake.
In the blockchain, every coin (or entity) is defined as a chain of digital signatures. Each owner transfers the entity to the next by digitally signing a hash of the previous transaction and the public key of the next owner and adding these to the end of the coin. Obviously, there are a bunch of other things including the value associated with the transaction but they are not important for the discussion.
A simple transaction can be pictured like this:
[Figure: A simplistic transaction]
In the above transaction, anyone can verify whether the transaction is valid by using the hash pointer to the previous transaction. We can also verify that the payer is legitimate by checking the signature provided in the transaction.
This process continues throughout and end-to-end validation can be performed to verify the chain of ownership.
[Figure: Transactions in the chain]
As you can see in the image above, every receiver can see whether the coin she is getting is valid or not.
The problem, of course, is that the payee can’t verify that one of the previous owners did not double-spend the coin, i.e. spend the same coin with two different parties. For example, I could pay the same coin to buy goods and services from both Amazon and Bestbuy.
To accomplish this without a trusted (or distrusted?) party, transactions must be publicly announced, and we need a system for participants to agree on a single history of the order in which they were received. The payee needs proof that at the time of each transaction, the majority of nodes agreed it was the first received.
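Once everyone agrees on a single ordering of transactions, spotting a double spend is straightforward: the first transfer of a coin wins and any later transfer by the same payer is rejected. The sketch below assumes a deliberately simplified model in which a transaction is just a `(coin, payer, payee)` tuple and signatures are left out; the names are hypothetical.

```python
def replay(history):
    """Apply (coin, payer, payee) transfers in their agreed order.

    payer=None marks the special transaction that creates a coin.
    Returns the accepted and the rejected transfers.
    """
    owner = {}                      # coin id -> current owner
    accepted, rejected = [], []
    for coin, payer, payee in history:
        is_new_coin = payer is None and coin not in owner
        is_valid_spend = payer is not None and owner.get(coin) == payer
        if is_new_coin or is_valid_spend:
            owner[coin] = payee
            accepted.append((coin, payer, payee))
        else:
            rejected.append((coin, payer, payee))
    return accepted, rejected

history = [
    ("coin-1", None, "me"),         # coin created as a block reward
    ("coin-1", "me", "Amazon"),     # first spend: accepted
    ("coin-1", "me", "Bestbuy"),    # same coin again: rejected
]
accepted, rejected = replay(history)
print(rejected)   # [('coin-1', 'me', 'Bestbuy')]
```

The hard part in a real network is not this check but getting all nodes to agree on the order of `history` in the first place, which is what proof-of-work is for.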
Double Spending Attack
Let’s understand double spending attack with this image and how blockchain solves it.
The problem:
The blockchain-way of solving this:
The idea behind this concept is that the majority of computing power is controlled by honest nodes; as long as that is true, the honest chain will grow the fastest and outpace any competing chains. All blocks are cryptographically secure, and to modify a past block, an attacker would have to redo the proof-of-work of that block and all blocks after it (the concept of “hash-pointers” & the “nonce” being added to the block, remember?) and then catch up with and surpass the work of the honest nodes.
So, putting it all together, it looks like this over a network of millions of nodes:
Overall, blockchain technologies are a novel concept and could profoundly change the way we organize our lives. With the great concepts of cryptographic security, incentive-based distributed consensus, and a peer-to-peer network, it would not be an exaggeration to compare it with the dot-com boom of the ’90s, and only time will tell ‘what’s next’.