Consider this: it would have cost $78,887.25 to store just three 1 MB pictures on the Ethereum blockchain recently. The key lesson from Satoshi Nakamoto, the anonymous creator of Bitcoin, is how an incentive system can be designed to motivate various participants to work in a mutually beneficial way. While blockchain is a building block for such game-theory-driven systems, it does not have universal applicability and should be used only as a state transfer machine or where full transparency is needed. This post debunks the "magic" of blockchain and presents it as an extension of time-series databases, as a way to manage trust.
Introduction
I am Prometheus. I am anonymous, so what I am saying must be magical? :)
Well, heck no. I am just one of the thousands of people who have worked in the back-office trading data warehouses of Wall Street, where the concepts of "time" and "audit" are well established. Every trade has a history attached to it, and using bi-temporal data modelling, it is possible to see the full audit trail associated with every single update to that transaction. Blockchain extended that concept into the public domain using cryptography, which opened up new use cases such as Bitcoin, and later a new blockchain altogether, Ethereum, to support even more use cases.
It allowed an amateur like me to dig into the transaction history of one of the ERC-20 based tokens and publish an analysis below:
Cryptocorn: How Whales And YouTube Influencers Drove Kin From $50 Million to $1 Billion Market Cap… "Kik is one of the most pioneering startups out there and is almost never reported by mainstream Canadian press. It was…" (medium.com)
But that's what I have done for over a decade in the world of time-series databases as well. There is little functional difference between the two systems if you have access to both. Both are meant to present an accurate historical record and let you trace back the history. The difference is that centralized time-series databases are maintained with a mix of manual processes, whereas blockchains are maintained by self-driving, game-theory-driven incentives.
For this reason, the "blockchain" concept had to have been born out of that world, by someone who understood time-series databases, was frustrated with the 2008 crash, and, given the nature of the industry, did not want to reveal his identity: a former academic, definitely a PhD, who worked as a quant and maybe got laid off in the 2008 financial crisis. There is no "we". Satoshi had to be a single person, born out of that financial world.
But he or she forgot to tell people not to use blockchain instead of databases, that it's not a Swiss Army knife, and that it in fact has a very specific use case: maintaining trust. This is that missing user guide. Don't use blockchain unless it makes sense, and please don't talk about "putting things on the blockchain" when you have never worked with databases before.
Chapter 1: Blockchain Recap
Blockchain is normally viewed as a distributed, decentralized public ledger, made up of "blocks" (or batches) of transactions "chained" together (each block pointing to the previous one by its hash). It functions by replicating every validated transaction across every node on the network. There is no central authority: even if a few nodes go down, the data remains available.
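To make the "chained by hashes" idea concrete, here is a minimal Python sketch. It is a toy, not how Bitcoin or Ethereum serialize blocks: the block layout, the JSON encoding, and the helper names (`block_hash`, `append_block`, `is_valid`) are all assumptions made for illustration.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's contents deterministically with SHA-256."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, transactions: list) -> None:
    """Append a batch of transactions that points back at the previous block."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append(
        {"index": len(chain), "prev_hash": prev_hash, "transactions": transactions}
    )


def is_valid(chain: list) -> bool:
    """Check that every block still points at the hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
append_block(chain, ["alice pays bob 5"])
append_block(chain, ["bob pays carol 2"])
append_block(chain, ["carol pays dave 1"])
print(is_valid(chain))  # True

chain[0]["transactions"] = ["alice pays bob 500"]  # tamper with history
print(is_valid(chain))  # False
```

Changing any old block changes its hash, so the pointer stored in the next block no longer matches. That is the whole trick behind the audit trail.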
As the network grows, and as the number of transactions grows, the overhead on each node to maintain a state of the blockchain grows significantly as well.
In Bitcoin, nodes running mining software validate newly created transactions, batch them into a block, and then compete to solve a cryptographic puzzle for the right to add that block to the blockchain, earning reward tokens (bitcoin, in this case) for the "work" they performed. The resources required to solve this puzzle (which keeps getting tougher) are measured in terms of electricity and hardware consumed, which is what underpins the scarcity and value of bitcoin.
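The puzzle itself can be sketched in a few lines of Python. This is a simplified illustration, not Bitcoin's actual scheme (which double-hashes a block header against a numeric target): here the "work" is finding a nonce so that a single SHA-256 hash starts with a given number of zero hex digits. The function name `mine` and the `difficulty` parameter are assumptions for the example.

```python
import hashlib


def mine(block_data: str, difficulty: int) -> tuple:
    """Find a nonce whose SHA-256 hash starts with `difficulty` zero hex digits."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(target):
            return nonce, digest
        nonce += 1  # no shortcut exists: keep guessing


nonce, digest = mine("batch of validated transactions", difficulty=4)
print(nonce, digest)
```

Each extra zero digit multiplies the expected number of guesses by 16, which is how a network can keep making the puzzle tougher. Checking a proposed answer still takes a single hash.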
There are other blockchains which work on other principles, but the core idea of a decentralized, self-driven, reward-token-based system which uses cryptography underpins all of them.
Chapter 2: Sample costs of an "Airbnb on a Blockchain" venture from a user point of view
These are based on a median gas price of 30 gwei and the all-time-high ETH/USD price of 1 ETH = 1,402.44 USD, modelled using Danny Ryan's spreadsheet (post and link below).
It would have cost $78,887.25 to store 3 pictures (1 MB each) on the Ethereum blockchain recently. Ethereum really shines as a state transfer machine, and that is its only financially viable use case. For all other purposes which involve data storage or computation, those tasks are best left "off-chain", to regular centralized databases/servers.
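The headline figure can be reproduced with a short calculation. The sketch below is a back-of-the-envelope estimate, not the spreadsheet's own formula. It assumes 20,000 gas to write one new 32-byte storage word (a figure not stated in this post) and 1 MB = 1,000,000 bytes. The gas price and ETH/USD rate are the ones quoted above.

```python
from decimal import Decimal

GAS_PER_32_BYTE_WORD = 20_000   # assumption: gas to write one new storage slot
GAS_PRICE_GWEI = Decimal(30)    # median gas price used in this post
ETH_USD = Decimal("1402.44")    # ETH/USD price used in this post
GWEI_PER_ETH = Decimal(10) ** 9


def storage_cost_usd(num_bytes: int) -> Decimal:
    """Estimate the USD cost of writing `num_bytes` into contract storage."""
    words = -(-num_bytes // 32)  # ceiling division into 32-byte words
    gas = words * GAS_PER_32_BYTE_WORD
    eth = gas * GAS_PRICE_GWEI / GWEI_PER_ETH
    return eth * ETH_USD


three_pictures = storage_cost_usd(3 * 1_000_000)  # assumes 1 MB = 1,000,000 bytes
print(three_pictures.quantize(Decimal("0.01")))   # 78887.25
```

Under these assumptions the three pictures need 1,875,000,000 gas, or 56.25 ETH, which is where the $78,887.25 comes from.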
$0.02 to "tweet" using a blockchain-driven app?
Source: https://blocksplain.com/2018/04/17/peepeth-decentralized-twitter/
For example, consider a version of Twitter on the blockchain, where people have to pay transaction costs to post messages: ~$0.02 per "tweet".
How do such products, which now charge for what people get for free elsewhere, break through beyond the initial crypto community?
Don't use Ethereum for computation or storage unless necessary
The Ethereum blockchain is financially unviable for consumers/end users purely as a datastore or computation engine. Consider the regular storage and computation scenarios below, comparing what they cost on the blockchain with what Airbnb charges for the same data entry (hint: it's $0). Compare these costs against AWS as well.
Here are the core functions which such a service needs to support:
- Create a profile
- List a room (with pictures, and description)
- Bring viewers to that posting ("create the market")
- Set availability
- Chat
- Facilitate payment
- System to leave a review and view reviews left by others
- Incentive to share and refer
Upload 3 pictures of your property as a host (1MB each): $78,887.25
This costs $0 to the host on Airbnb. But with such staggering costs on Ethereum, this only makes sense if this was the room Mona Lisa slept in prior to the day her painting was created.
Write a brief, 50-word host review: $50
This costs $0 to either the host or the guest on Airbnb. If it costs money to write a review, why would anyone write reviews? How does the system work without reviews?
Fill out (save) your bio (100 words): $100
This costs $0 to either the host or the guest on Airbnb. Why fill this out if it costs money? And if nobody does, how do you build trust on the system?
Compute reservation cost (price per night x number of nights): $0.0002
This costs $0 to either the host or the guest on Airbnb. Please do this on the client side. Ethereum blockchain should not be used/abused for such use cases.
Submit a booking request: the "creation" of a booking request alone will cost $1.35
This costs $0 to either the host or the guest on Airbnb. But hey, if you are doing this on the blockchain, remember to add a "Blockchain fees" line to the below table:
Every time a transaction is sent (host accepts or declines): $0.88
This costs $0 to either the host or the guest on Airbnb. If someone declines your request, you still have to pay close to $1 for that in the blockchain world.
Compare this with Airbnb, where each of the above costs is $0 to both the host and the guest. Some could make the case that hashes of user data could be stored on the blockchain instead of the data elements themselves, but what's the point? The cost of that will still be greater than what Airbnb charges for such data entry: $0.
From a user point of view, what difference does it make where or how this data is stored, as long as data reads and writes are instantaneous and the data is backed up and always available? The user is trusting the brand of Airbnb to manage this. It is marginally better for the user to have full control over his or her dataset, but it is not essential for the value needed from the service.
An alternate world where the bulk of this data is hosted on a blockchain would mean an exceptionally slow user experience. The user would also have to trust the brand of the underlying blockchain (say, Ethereum). Not to mention that the core startup activity of actually building out a beautiful product experience and driving distribution would still need to be done; that is step 1 in any case. The competition with Airbnb is not on price, but on the overall experience.
Chapter 3: DApp it: the trust dilemma of storing user data on other users' computers
Filecoin architecture
As shown in the previous section, using the Ethereum blockchain for either computation or storage in these use cases doesn't make sense.
Blockchain startups get this. At the end of the day, regardless of what their service is called, how safely they encrypt the data, and how they manage access to it, their alternate proposal calls for storing this data on other users' computers.
If you were explaining to your grandma, this is what you would say.
Airbnb stores users' data on databases it manages and controls. If there is an issue with your data, it is responsible.
"Airbnb on the blockchain" stores your data on other users' computers. It does it efficiently and safely, but that is what decentralization means. It's not cheaper, but it's a question of choice.
Does it sound okay to your grandma Alice that her pictures, reviews, bio, and everything she writes or uploads are stored on Bob's computer? Yes, it is behind a lock and key, and Bob can't read it, but in principle, does that sound like a good idea to your grandma, not to your inner crypto enthusiast? Here is the question in grandma's head:
Do I trust Bob or do I trust Airbnb?
That is the key trust issue of our times, which users will vote on with their wallets.
Do they trust centralized brands and access, or are they OK with decentralized storage? I don't care either way. "Where" data is stored is irrelevant; what matters is the value the user gets.
At the end of the day, if the service is not as cheap, fast and efficient as what Airbnb already provides, any "Airbnb on the blockchain" startup is a no-go in any case. The service has to be 10x better for users to switch. So it will be fun to watch these blockchain startups raise tens of millions and then realize they still have to build an overall product and value proposition which users want. The vast majority of users do not care "where" the data is stored, and in fact if you were to tell them that it is stored on Bob's computer and not with the company itself, they would not be able to trust you.
Chapter 4: Blockchain does work exceptionally well as a system to transfer state (such as money)
On an example $350 total reservation cost, the guest and the host together end up paying about $63 in fees to Airbnb (let's assume a maximum of 15% charged to the guest and 3% to the host).
If the transaction was done only in ETH, where both the host and the guest had an ETH wallet, then the transaction cost would have been around $0.88 to the guest (based on the cost to send 1 transaction on Ethereum with the above assumptions). To get it done a little faster, more transaction cost would have been paid.
If the guest paid USD to purchase ETH for his wallet, transferred the ETH to the host's wallet, and the host then converted the ETH back to USD, there would be additional transaction costs mixed in.
But this shows where the real utility of the Ethereum blockchain lies: as a state maintainer, which in this case happens to be the state of the guest's account and the state of the host's account. For this use case, using the blockchain is far preferable.
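The fee comparison above is simple arithmetic, sketched here in Python using only the numbers already given in this chapter. The variable names are illustrative, and the ETH figure ignores the USD/ETH conversion costs mentioned above.

```python
from decimal import Decimal

RESERVATION_USD = Decimal("350")
GUEST_FEE_RATE = Decimal("0.15")    # assumed maximum guest fee from this post
HOST_FEE_RATE = Decimal("0.03")     # assumed host fee from this post
ETH_TRANSFER_USD = Decimal("0.88")  # this post's estimate for one Ethereum transaction

platform_fees = RESERVATION_USD * (GUEST_FEE_RATE + HOST_FEE_RATE)
savings = platform_fees - ETH_TRANSFER_USD

print(platform_fees)  # 63.00
print(savings)        # 62.12
```

On these assumptions, moving only the payment step on-chain saves roughly $62 on a $350 booking, before any exchange fees.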
As a subtle reminder, users don't care about the underlying tech; they care about what they get out of it, the totality of their experience with the service.
Hybrid really means 99% off-chain + 1% on-chain for state management only
The Future of Hybrid Centralized-Decentralized Apps: "Early last week, it was reported that the messaging app Telegram is working on raising capital for an audacious…" (www.theinformation.com)
An Airbnb on blockchain would really be a mix of:
- 99% centralized service, where profiles, reviews, pictures, etc. are saved for cost and efficiency. Maybe this is the part which transitions to a decentralized storage architecture, if parity is maintained for speed and cost.
- 1% value transfer between the guest and the host, done on the Ethereum blockchain, which provides tremendous savings in cost along with an immutable, verified record of the value transfer which occurred.
Disrupting a well-established, centralized company like Airbnb is thus practically infeasible for a purely decentralized blockchain-driven startup. It will have to compete on product UI/UX, network effects, insurance/guarantee, and other factors. The overall product experience is what matters, and everything else being equal, Airbnb is significantly cheaper to use than a pure "Airbnb on a blockchain" play.
Chapter 5: Reward tokens, for example, what if Medium had a "clapcoin"
Reward tokens can be used to structure the economy of a product, creating suitable incentives for the participants to engage in behaviors which are mutually rewarding for them and for the system. Think of Airmiles, but as tokens enabled by a blockchain. In the Ethereum world, those are ERC-20 tokens. An ERC-20 token is basically a contract with a small set of functions with very basic capabilities (check balance, transfer, and so on), plus a few values: the token's name, its total supply, and the number of decimals to be used for display.
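To show how small that surface is, here is a toy in-memory model in Python. It is not a real contract and not Solidity. The class name `SimpleToken` and the account names are made up for illustration, and the ERC-20 functions for delegated spending (approve, allowance, transferFrom) are left out to keep it short.

```python
class SimpleToken:
    """Toy in-memory model of the basic ERC-20 surface (not a real contract)."""

    def __init__(self, name: str, symbol: str, decimals: int,
                 total_supply: int, owner: str):
        self.name = name
        self.symbol = symbol
        self.decimals = decimals          # display precision only
        self.total_supply = total_supply  # counted in the smallest unit
        self._balances = {owner: total_supply}

    def balance_of(self, account: str) -> int:
        """Return the balance of an account (0 if it has never held tokens)."""
        return self._balances.get(account, 0)

    def transfer(self, sender: str, recipient: str, amount: int) -> bool:
        """Move `amount` from sender to recipient, refusing overdrafts."""
        if amount < 0 or self.balance_of(sender) < amount:
            raise ValueError("insufficient balance")
        self._balances[sender] = self.balance_of(sender) - amount
        self._balances[recipient] = self.balance_of(recipient) + amount
        return True


token = SimpleToken("ClapCoin", "CLAP", 18, 1_000_000, owner="medium")
token.transfer("medium", "author", 250)
print(token.balance_of("author"))  # 250
print(token.balance_of("medium"))  # 999750
```

On a real chain the `sender` would be whoever signed the transaction rather than a plain argument, and every `transfer` call would cost gas.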
This article is a good primer on ERC-20 tokens:
Understanding ERC-20 token contracts: "Looking at Ethereum tokens and explaining the features and functions of ERC-20 to provide an understanding of token…" (medium.com)
Drive certain user behaviour with well-designed, game-theory driven systems across a range of domains
A core use case for blockchain is the reward token, which determines how the economy of your product is defined. In bitcoin's case, sufficient incentive was created for miners by leveraging cryptography. However, a reward-token-based system doesn't have to use a cryptography-driven reward; it could be something else altogether. For instance, Medium could introduce a clapcoin, which would serve the dual function of signalling good posts and rewarding the author automatically. It could also require a certain number of tokens to comment on posts, again building out a self-driven incentive system. There is no need for dedicated manual moderation. Pair an A.I.-driven moderation agent with other humans moderating content because of the built-in incentives, and suddenly you have a safe space on the Internet to communicate with others and have good discussions.
Maximize database, minimize blockchain usage
The best way to use the blockchain is to use it only for the absolute minimum need, paired with a relational database which does the bulk of the heavy lifting. Writing transactions to the chain costs significantly more time and money than simple database writes. An intelligent blockchain strategy would involve utilizing the database for as much as possible and handing off transactions to the blockchain only when absolutely needed.
In this example there is no need for Medium to store the posts, comments, etc. on a blockchain. Even storing every clap on the blockchain right away would not make sense, as it would be terribly inefficient given blockchain's fundamental design. Instead, a single month-end aggregate of the clapcoins collected in the database could be written to the blockchain.
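A rough Python sketch of that month-end batching idea follows. Everything here is hypothetical: `aggregate_claps` stands in for a database GROUP BY, and `settle_on_chain` only builds the list of payouts that would be sent, since the actual on-chain write depends on the token contract and client library used.

```python
from collections import Counter


def aggregate_claps(clap_events: list) -> dict:
    """Collapse individual off-chain clap rows into one total per author."""
    totals = Counter()
    for author, claps in clap_events:
        totals[author] += claps
    return dict(totals)


def settle_on_chain(totals: dict) -> list:
    """Hypothetical stand-in for the on-chain write: one payout per author."""
    return [
        {"to": author, "clapcoins": amount}
        for author, amount in sorted(totals.items())
    ]


# Claps recorded in the regular database over the month: (author, claps).
month_of_claps = [("alice", 3), ("bob", 1), ("alice", 10), ("carol", 7), ("bob", 4)]

payouts = settle_on_chain(aggregate_claps(month_of_claps))
print(len(month_of_claps), "database writes ->", len(payouts), "on-chain transactions")
# 5 database writes -> 3 on-chain transactions
```

The ratio only gets better with volume: millions of claps in the database still settle as one transaction per author per month.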
Chapter 6: The truth is out there, bringing full transparency and accountability
It's hard to imagine a future where governments in democratic countries do not move towards widespread adoption of blockchain across a range of government services, from maintaining land registries to other functions.
Or how about voting? Shouldn't every vote be tied to a single eligible voter, whose identity is not known but is validated in the system?
It's also hard to imagine a future where blockchain is not used across the non-profit sector as well. Let's say every "donatecoin" you gave could be tracked through where and how it was allocated by the charity you gave it to. If too much is going to administrators, then it would be known without requiring an internal whistleblower to leak such secrets.
There is significant overhead involved with using a blockchain, and someone has to pay for that, but where the need for trust and accountability is higher, it would be justified.
In the private sector though, the question worth asking is: if data can be stored in centrally managed databases and exposed via APIs, why is blockchain, with its associated overhead, better? Let's debunk the magic of blockchain a bit.
Chapter 7: Blockchain is an extension of existing, widely-used concepts in the financial data world
Trading data warehouses are basically modelled like what would be "centralized blockchains". All the concepts, except the part about storing data on multiple distributed peers, extend from bitemporal modelling, which has been done in relational databases for a long time. This is the fundamental basis of reporting data warehouse systems, especially in the financial world, and is a well-known best practice. These are the key features of such systems:
1. Immutable records along with a timestamp, blocked together, providing a view into state "as of" that time
"What was the state of this data at any point in time in the past?"
The time every transaction is entered into the database is recorded, along with a unique identification number for the batch in which all such transactions were processed. This allows one to query the state of the system at that point in time. The records are immutable due to process, because *no one* is supposed to alter that fundamental unit of information about a transaction. A hash of the transaction is stored, and it is used to compare and track changes.
Transactions are blocked together (held by a unique block ID every time a data load process runs), with their timestamps recorded and immutable (due to established procedures). Given these block IDs are sequential, you can trace backwards through transaction blocks if needed.
2. Correct historical records with a complete new row instead of overwriting prior fields
"What is the most accurate representation of this data right now?"
A second feature of such systems is that other transaction data can be corrected at a later point in time, which is useful for financial reporting systems.
This correction is not done by updating a field of a particular record directly, but instead by inserting a brand new transaction record which has the updated value along with all the prior unchanged fields, and a new timestamp recorded along with a new unique block identification number. The earlier, incorrect record is made unavailable for further queries; and at each point in time there is only one true, correct representation of data.
Using #1 above however, you can still go back in time to trace the history of updates.
An example: finding the core truth about a transaction
What bitemporal modelling looks like in standard relational databases
In the above example, we can ask the system:
- What was the state of this particular trade as of Jan 10, 2016 at 9:00pm EST (when it was "10"), and who entered it?
- What was the state of this particular trade as of Jan 10, 2016 at 10:00pm EST (when it was "11"), who entered it, and who updated it?
- What is its current state (where it is "12")?
Each such record also carries the history of which other transactions it was grouped with during the data load process (the unique, system-generated block ID; a single data load process might load 100 such transactions, for instance), its unique transaction ID, exactly when the record was inserted, by whom, and so on. These form the "core truth" about each such transaction.
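The three questions above can be sketched against an append-only table using Python's built-in `sqlite3`. This is a simplified illustration that tracks only the insert-time axis, not a full bitemporal schema. The table and column names, the trade ID, the user names, and the timestamp of the third row are all assumptions, and time zones are ignored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE trade_history (
        trade_id    TEXT NOT NULL,
        value       INTEGER NOT NULL,
        block_id    INTEGER NOT NULL,   -- batch (data load) that wrote the row
        entered_by  TEXT NOT NULL,
        inserted_at TEXT NOT NULL       -- ISO-8601 timestamp of the insert
    )
    """
)

# Corrections are new rows; earlier rows are never updated or deleted.
rows = [
    ("T-1", 10, 1, "alice", "2016-01-10T21:00:00"),
    ("T-1", 11, 2, "bob", "2016-01-10T22:00:00"),
    ("T-1", 12, 3, "carol", "2016-01-11T09:00:00"),
]
conn.executemany("INSERT INTO trade_history VALUES (?, ?, ?, ?, ?)", rows)


def state_as_of(trade_id: str, as_of: str):
    """Return (value, entered_by, block_id) of the latest row inserted by `as_of`."""
    return conn.execute(
        """
        SELECT value, entered_by, block_id
        FROM trade_history
        WHERE trade_id = ? AND inserted_at <= ?
        ORDER BY inserted_at DESC
        LIMIT 1
        """,
        (trade_id, as_of),
    ).fetchone()


print(state_as_of("T-1", "2016-01-10T21:00:00"))  # (10, 'alice', 1)
print(state_as_of("T-1", "2016-01-10T22:00:00"))  # (11, 'bob', 2)
print(state_as_of("T-1", "9999-12-31T23:59:59"))  # (12, 'carol', 3)
```

Nothing in the database engine stops someone from running an UPDATE on an old row. Here the immutability comes from process, which is exactly the gap a blockchain closes by design.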
These systems predate the blockchain and are used widely in the financial world. However, such time-series databases carry a significant efficiency overhead, which limits their use cases. A blockchain is not only "time-series" in spirit, it is also mass-distributed and replicated, which adds significantly to the overhead.
Everything which can be stored in a centralized relational database can be stored in a blockchain, with the added benefit of linking them to historically verifiable tokens which can then be traded on exchanges and used as in-product reward tokens.
Chapter 8: Manual processes are simply automated in a blockchain based on system design
1. From centralized databases -> decentralized blockchains
The blockchain concept, as popularized by Satoshi, did away with feature #2 above (corrected records) altogether, as it is more useful for reporting systems. It also provided the core invention: a way to have these transactions recorded and verified by a distributed set of nodes, with a built-in incentive mechanism for them to do the work. That incentive mechanism uses cryptography and certain rules to reward people for building and running such a network.
In a centralized database, the record history is immutable by process: an unauthorized update would lead to the person getting fired, sued or both! In a blockchain, the record history is immutable by design, so the record cannot be changed even deliberately. The tracing history exists as it does in bi-temporal relational database design, along with the concept of block IDs, among others.
2. From stored procedures -> smart contracts
In a relational database like Oracle, code can be written in stored procedures and made to run automatically when a certain set of conditions is met. This code can query and update records as appropriate. Stored procedures are the boring workhorses of a relational database, normally full of convoluted code.
"Smart contracts" as enabled by the Ethereum blockchain replicate that functionality. They could have easily been called stored procedures, but that wouldn't have been as exciting. They are "smart", and they are "contracts"! Well, all code is. It doesn't go out for a coffee break when you run it. Let's not worship it; it's not suitable for all use cases either.
Regular database stored procedures don't need to be "paid" to run (apart from whoever is paying for the server costs), but smart contracts on a blockchain do, as they tap into the resources of the nodes.
3. From in-app currencies -> blockchain-based tokens
A lot of apps and organizations run their own reward programs. The main difference with blockchain tech has been that now that reward point is called a token, and is needed for the underlying functioning of the smart contract and the blockchain.
For example, it's as if you collected 100 airmiles, and then had to pay 5 airmiles just to make a booking on a website, or wanted to send 80 airmiles to grandma on her birthday: now you can do that with a blockchain-based token. And if for some reason you also needed an open audit trail associated with your use of these airmiles, then this is the perfect solution.
What if speculators bought and sold such airmiles, and your holdings increased in value over time? That then turns into another asset class for you, another way to build wealth: something which was completely locked into its own database, sitting in obscurity, is now something you can trade.
Chapter 9: Why not use a blockchain?
1. Blockchains are incredibly inefficient: slow and ridiculously small per second transaction ability
Users have come to expect relational database level read/write speeds. To ask them to wait because of underlying technology would simply be a no-go. Unless blockchains are able to provide that same level of performance, their real-world use cases for end-consumers would remain theoretical; pumped up visions only used to sell tokens to unsuspecting investors.
Consider that Ethereum supports just 15 transactions per second! From the scaling article below: "It depends on a network of 'nodes', each of which stores the entire Ethereum transaction history and the current 'state' of account balances, contracts and storage". It's designed to be inefficient.
Why would you want your consumer-oriented product to be built on such a network? Why would you choose it over a relational database hosted on a cloud platform like AWS or Google Cloud? Do you still download movies and songs on BitTorrent, or do you prefer the much faster service provided by Netflix or iCloud?
It's worth asking the question: why does using the blockchain in this case make any sense from a user perspective? Does it make their life better or worse when they use your product?
How Will Ethereum Scale? (CoinDesk): "Like other public blockchains, ethereum intends to support as many users as it can. The problem is that, today, we don…" (www.coindesk.com)
Even though centralized databases are expensive and difficult to maintain, for enterprises the logical migration path is moving to the cloud. They need speed, consistent uptime and support, and they have been building systems that provide an accurate statement of truth about each transaction based on internal processes, without a blockchain.
2. Privacy for certain use cases: no one wants their full balance and transaction history for everybody to see
How would regular users react when they find out that their entire transaction history and their wallet balance are publicly available for anyone to search through? Even though it is not linked to their name, at the end of the day does anyone want their bank account number, with balances and transaction history, published online?
And since knowledge of the private key is the only thing holding back a transfer of funds, bad actors can exploit or threaten you remotely to gain access and transfer your funds, unlike with a bank, which would at least put up a fight for you as its reputation depends on it.
This also means enterprises would go for a private blockchain at best. There would eventually be an "Oracle of blockchains": a company big enough and fully committed to providing optimized deployment of a private blockchain, including support and consulting services. It might help with inter-company connections as well.
Blockchain-as-a-Service: Losing the plot?
Such deployments, provided by leading cloud providers, would help to alleviate some pain points while providing benefits. But then you are running a decentralized network on a centralized one, and you have to ask: what is the point, and where does it add value? Why not just open up your relational database with suitable access controls in that case?
Conclusion
We are in an era of "crypto-driven" business model disruption. However, blockchain for the heck of it is a road to nowhere.
We don't ever need to know who Satoshi is. What we do know is that the core idea pioneered with the deployment of Bitcoin just makes a lot of sense, and that idea is: how can you design a well-rounded, self-driven incentive system for your product?
This is as good a time as ever to design intelligent blockchain-driven ecosystems which make use of reward tokens and bring greater transparency and smarter systems. Humans and A.I. would fit into this world of smarter systems. However, blockchain is a hammer, not a Swiss Army knife, so please use with caution :)
Thank you for reading
Appendix: Source for Ethereum usage cost
Calculating Costs in Ethereum Contracts: "GAS PRICE PSA (2017-08-23): The median gas price at the time of writing this article was 28 Gwei, and continues to be…" (hackernoon.com)
Spreadsheets shared by Danny
https://docs.google.com/spreadsheets/d/1KeWKkn0BYhOt1p6lM6BDQAWLin-2JQmGpwswU3kPw9c/edit#gid=0
https://docs.google.com/spreadsheets/d/1n6mRqkBz3iWcOlRem_mO09GtSKEKrAsfO7Frgx18pNU/edit#gid=0
Breakdown
- Adding two integers: 3 gas units
- Multiplying two integers: 5 gas units
- Base transaction: 21,000 gas units
- If you are interacting with a contract which multiplies two integers: 21,000 gas + 5 = 21,005 gas units
- 1 Gwei = 0.000000001 ETH
- Gas price specified in gwei/gas; current median 30 gwei; some pay less, some pay more for faster transaction processing
- Total fee paid per transaction = Gas price * Gas units used
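The breakdown above can be turned into a small Python calculator. It is a sketch that uses only the figures quoted in this post (30 gwei median gas price, 1 ETH = 1,402.44 USD). The function name `tx_fee_usd` is made up for illustration.

```python
from decimal import Decimal

GWEI_PER_ETH = Decimal(10) ** 9   # 1 gwei = 0.000000001 ETH
ETH_USD = Decimal("1402.44")      # ETH/USD price assumed in this post


def tx_fee_usd(gas_units: int, gas_price_gwei: int = 30) -> Decimal:
    """Total fee = gas price * gas units used, converted from gwei to USD."""
    fee_eth = Decimal(gas_units) * gas_price_gwei / GWEI_PER_ETH
    return fee_eth * ETH_USD


print(tx_fee_usd(21_000))  # base transaction: about 0.88 USD
print(tx_fee_usd(5))       # multiplying two integers: about 0.0002 USD
print(tx_fee_usd(21_005))  # contract call that multiplies two integers
```

The first two results line up with the $0.88 per transaction and $0.0002 per multiplication quoted in Chapter 2.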