11000:1 Ethereum Query to Transaction Ratio! What’s going on?
During the February 20, 2018 event organized by the SF Ethereum Developers with Grant Hummer and Ken Fromm as hosts, Eric Tu from ConsenSys gave a presentation on Infura. Infura is a project of ConsenSys and acts like an Ethereum supernode for thousands of decentralized applications to access Ethereum. It was started to support internal projects within ConsenSys and then launched publicly at Devcon2 in Shanghai.
A data point shared by Eric in his presentation impressed and surprised me. Infura processes 6B queries per day (QPD) and continues to grow quickly. I was particularly struck by this query volume in comparison to Google, which does 3.5B QPD. Admittedly, the query use case is different between Google and Infura, but there are definitely similarities in the tech capabilities needed to handle massive query volumes. Infura has become the runaway market leader in this space, so I believe that a fair estimate is that Infura handles 70–90% of all Ethereum queries and 10–20% of Ethereum transactions.
Let’s analyze the data further to try to make some sense of it. If Infura handles 90% of all queries, its 6B QPD implies a total query volume of about 6.7B QPD, or roughly 77,000 QPS (queries per second). According to Fred Ehrsam, Ethereum processes about 7 TPS (transactions per second). So the query-to-transaction ratio on Ethereum is about 11000:1, which is 77,000 QPS divided by 7 TPS.
If Infura is closer to 70% of query volume, that ratio is even worse, at about 14000:1. Since Infura handles only 10–20% of Ethereum transactions while serving its 6B QPD, the ratio on Infura itself is between 50,000:1 and 100,000:1.
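For readers who want to check the arithmetic, here is a small sketch in Python. The inputs are the figures quoted above (6B QPD on Infura, about 7 TPS on Ethereum). The market-share values are my own estimates rather than measurements, and the function names are made up for this illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

INFURA_QPD = 6e9    # queries per day reported for Infura
ETHEREUM_TPS = 7    # approximate transactions per second on Ethereum


def network_ratio(infura_query_share):
    """Query-to-transaction ratio for all of Ethereum.

    infura_query_share is the assumed fraction of all Ethereum
    queries that Infura serves (an estimate, not a measured value).
    """
    total_qps = INFURA_QPD / infura_query_share / SECONDS_PER_DAY
    return total_qps / ETHEREUM_TPS


def infura_ratio(infura_tx_share):
    """Query-to-transaction ratio on Infura alone.

    infura_tx_share is the assumed fraction of Ethereum
    transactions that are submitted through Infura.
    """
    infura_qps = INFURA_QPD / SECONDS_PER_DAY
    return infura_qps / (ETHEREUM_TPS * infura_tx_share)


if __name__ == "__main__":
    for share in (0.9, 0.7):
        print(f"Infura at {share:.0%} of queries: {network_ratio(share):,.0f}:1")
    for share in (0.2, 0.1):
        print(f"Infura at {share:.0%} of transactions: {infura_ratio(share):,.0f}:1")
```

The results land close to the rounded figures in the text: roughly 11,000:1 and 14,000:1 for the network, and roughly 50,000:1 to 100,000:1 for Infura alone.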
So what can we take away from that ratio?
First, we should all give a round of applause to the awesome Infura team! With a small team of just 10 people, Infura has done a fantastic job supporting 6B QPD. As an ex-Googler, I know that the Google team supporting search is at least 10x larger than the Infura team and requires a fair amount of hardware. The ability to handle that query volume is a testament to Infura’s technical capabilities. Can you imagine what Ethereum would look like without Infura?!
Second, why is the query-to-transaction ratio so unreasonably high? What’s going on with Dapp implementations? There is no reason for a Dapp to query the Ethereum blockchain 11,000 times before it makes a transaction, but there is also no forcing function to encourage a better ratio. Is the ratio caused by poor Dapp implementations, which could be improved by caching, since most of the data is not dynamic? Or is it caused by the current Ethereum architecture, which forces Dapps to keep querying Ethereum every second even though a transaction may not be mined until the next block? Are there particular Dapps or types of Dapps that generate the most query traffic? This feels like an area ripe for optimization, to ensure that query volume doesn’t become an even larger bottleneck.
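To make the caching idea concrete, here is a minimal sketch in Python. It is not how any particular Dapp or Infura works. It assumes that read results only change when a new block is mined, that blocks arrive about every 15 seconds, and that the Dapp already knows the current block number (learning it is itself a query or a subscription, which this sketch does not count). `BlockCache`, `fake_fetch` and the key `"balance:alice"` are all hypothetical names.

```python
from typing import Any, Callable, Dict, Optional


class BlockCache:
    """Reuse read results until the chain head moves to a new block."""

    def __init__(self, fetch: Callable[[str], Any]) -> None:
        self._fetch = fetch                 # the expensive backend query
        self._block: Optional[int] = None   # block the cached values belong to
        self._values: Dict[str, Any] = {}

    def read(self, key: str, head_block: int) -> Any:
        # A new block may have changed state, so drop everything cached.
        if head_block != self._block:
            self._block = head_block
            self._values.clear()
        # Only hit the backend the first time a key is read in this block.
        if key not in self._values:
            self._values[key] = self._fetch(key)
        return self._values[key]


BLOCK_TIME_S = 15   # assumed average block time, in seconds
fetches = []        # records every query that reaches the backend


def fake_fetch(key: str) -> int:
    """Stand-in for a real node query; returns a dummy balance."""
    fetches.append(key)
    return 100


cache = BlockCache(fake_fetch)
for second in range(60):                # a Dapp polling once per second
    head = second // BLOCK_TIME_S       # simulated current block number
    cache.read("balance:alice", head)

print(len(fetches))  # 4 backend queries instead of 60
```

In this toy setup, one minute of once-per-second polling reaches the backend once per block rather than once per poll. Whether real Dapps could see a similar saving depends on how much of their query traffic is repeated reads of unchanged data, which is exactly the data we don’t have.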
Third, why does Ethereum have such a HIGH volume of query traffic only three years in? And does this affect Ethereum scalability? Fred Ehrsam’s comparison between Facebook requests and Ethereum transactions in his article is an apples-to-oranges comparison, since most Facebook requests are reads, not writes. Infura has proven that the query (i.e. read) side of Ethereum is scalable, even though Infura could make it more decentralized by open-sourcing the code and encouraging others to offer similar services. (According to Michael Wuehler from Infura, they have plans for doing this.) For the transaction (i.e. write) side of Ethereum, could Dapps take an approach similar to the one ConsenSys took in starting Infura: fork Ethereum, run their own chain first, and migrate the data to the Ethereum mainchain once Ethereum solves the scalability issue for transactions? The truth is that at the moment, consumers don’t really care whether the backend of a decentralized application is decentralized or centralized. They simply want the Dapp to work.
Fourth, Infura’s dominance helps us assess this query-to-transaction ratio. Just like Google’s dominant market share in search provides it enough data to generate useful predictions, such as Google Flu Trends, Infura’s market leadership allows us to better understand Dapp behavior on the Ethereum blockchain. Although we don’t know exactly how many total queries happen and can only estimate Infura’s market share of those queries, it raises the question of the value that a centralized service like Infura provides that a fully decentralized service cannot provide.
I believe the whole Ethereum community will benefit a lot from having more data. Today it seems nobody has a good estimate of the total volume of Ethereum queries, which makes it harder to figure out what’s really going on with the network. So if you want to make the Ethereum community better, please submit your data via this form. I will compile the submissions and share the results back with the community.
I cannot answer all the questions above, but I believe it’s worthwhile to make people aware of the data and start a discussion. Hopefully, the whole Ethereum community will benefit from the conversation.
Thanks to Michael Wuehler, Eric Tu, Daniela Osorio, Joseph Chow, Chris Peel, Ken Fromm, and Brandon Bidlack for their reviews, comments, and suggestions.