Under the hood of crypto price aggregators

by THE SILICOIN, January 15th, 2019

Cryptocurrency price aggregators are black boxes — no one really knows how they calculate average prices.

However, the available information is enough to show that they do it incorrectly.

We uncover the problems and propose a professional pricing tool that is 100% transparent.

It is frustrating that the cryptocurrency movement relies on completely opaque pricing data. How can anyone make a decision based on data from a black box?

Over a year ago we started developing our own crypto price aggregator, THE SILICOIN, to fix this problem. We solved numerous methodological and technical issues along the way, and now we want to share our findings:

  1. Aggregators pool trading pairs from all exchanges into one bucket to calculate average prices. This approach distorts prices. We should first find the prices of coins on each exchange and then find averages across exchanges (see Section 2 of this article).
  2. There are many ways to find the USD price of coins that are not traded directly against the US dollar. Most of them are inappropriate. We discuss the available options and explain why step by step pricing is optimal.

We discuss these findings below. They gave birth to the pricing methodology used by THE SILICOIN. Our methodology is described in maximum detail in Section 4.

This material is lengthy. However, we are sure that it is worth investing 15 minutes to understand once and forever how data that we look at every day is calculated.

If you have questions about this article, you are welcome to ask them in the comments.

Contents

  1. Grey areas and problems of CoinMarketCap’s methodology
  2. Why pooling all trading pairs in one bucket is wrong
  3. How CMC may be calculating reference prices
     3.1. Option 1 — step by step propagation
     3.2. Option 2 — iteratively recalculate prices for skipped pairs
     3.3. Option 3 — use previously calculated prices as reference
     3.4. Which option CMC is likely to use
  4. THE SILICOIN coin price calculation methodology
     4.1. Methodology summary
     4.2. Methodology results
     4.3. Detailed methodology
     4.4. Why use the most liquid trading pair to calculate coin price on an exchange
     4.5. How to observe all inputs used for calculation of average prices
  5. Conclusion

1. Grey areas and problems of CoinMarketCap’s methodology

There are major unknowns and problems in CoinMarketCap’s methodology. We are talking about CMC not because its methodology is particularly bad, but because it is a market standard.

All crypto price aggregators that we know use some variation of CoinMarketCap’s black box methodology.

At first sight CMC’s methodology seems straightforward:

Step 1. Receive data for all trading pairs from exchanges, apply cleaning and verification algorithm

Step 2. Pool all trading pairs and find USD price and volume for each pair

Step 3. Calculate volume weighted average price of coins across all trading pairs


Methodology for Step 2 is as follows: “The price for each individual market pair is calculated by taking the unconverted price reported directly from the exchange and converting it to USD using CoinMarketCap’s existing reference prices.”



There are two major issues with CMC’s methodology:

**1. It is incorrect to aggregate trading pairs from all exchanges into one pool.** As we show in Section 2 below, this approach yields ambiguous and abstract prices.

**2. How are reference prices used in Step 2 calculated?** We don’t even know what trading pairs are used to price Bitcoin. Calculation options are described in Section 3.

On top of that we have no idea about what “data cleaning and verification algorithms” are used. It is ridiculous, but it is the least important problem.

2. Why pooling all trading pairs in one bucket is wrong

As discussed above CoinMarketCap pools trading pairs from all exchanges and calculates averages across all of them regardless of what exchange a pair belongs to.

From what we could find about methodologies of other crypto price data providers, they also pool trading pairs (example 1 — Nomics, example 2 — CoinPaprika, example 3 — Bitwise).

Approach of pooling all trading pairs is flawed. It outputs incorrect prices.


The alternative to pooling all trading pairs is to group them for each exchange. Every coin traded on an exchange should receive a unique price. Correct: Bitcoin costs $4,000 on Binance. Incorrect: Bitcoin costs $3,990 on BTC/USD and $4,010 on ETH/BTC trading pairs on Binance.

A bit of theory:

If there are no market frictions, arbitrage opportunities do not exist. That means that assets have the same price across all markets where they are traded.

There are severe market frictions when moving funds from one exchange to another. That is why it makes sense to calculate average price of cryptocurrencies across exchanges.

On the contrary, within one exchange frictions are minimal. Therefore, if we take trading pairs with some coin on a particular exchange — price of this coin in each pair should be the same.

Consider an example:

  • Assume there are 2 trading pairs on exchange “XYZ”: MKR/BTC=0.092 and MKR/ETH=3.60

  • First we calculate USD price of BTC and ETH using trading pairs from other exchanges where they are traded to fiat. Let BTC=$3,400 and ETH=$90 (these are infamous reference prices — to be discussed below)

  • Let’s calculate Maker’s price in both pairs. From MKR/BTC price is $3,400*0.092=$312.8, from MKR/ETH price is $90*3.60=$324.0

  • We arrived at two different prices for the same crypto on an almost frictionless market. In reality this is not possible! Traders immediately discover any deviations and eliminate them.

  • Why did that happen? We introduced 2 outside prices to a single exchange. Traders on this exchange value BTC relative to ETH differently than the average trader implied by the reference BTC and ETH prices
  • To eliminate this problem we should find one best price for Maker on this exchange. It should be derived from the most liquid pair that has a USD price already established
  • Assume the trading volume of the MKR/BTC pair is higher than that of the MKR/ETH pair. Then we should take the BTC price of $3,400. MKR=$3,400*0.092=$312.8, ETH=$312.8/3.6=$86.89 on exchange “XYZ”
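The example above can be sketched in a few lines of Python. This is only an illustration: the prices and reference rates are the ones used above, while the pair volumes are hypothetical (the example only assumes that MKR/BTC is the more liquid pair).

```python
# Reference USD prices derived from other exchanges (values from the example).
reference_usd = {"BTC": 3400.0, "ETH": 90.0}

# Pairs on exchange "XYZ": (base, quote) -> last price and USD volume.
# The volumes are hypothetical; the text only assumes MKR/BTC is more liquid.
pairs = {
    ("MKR", "BTC"): {"price": 0.092, "volume_usd": 500_000.0},
    ("MKR", "ETH"): {"price": 3.60, "volume_usd": 200_000.0},
}

# Pooling approach: each pair is converted with an outside reference price,
# so MKR ends up with two different USD prices on the same exchange.
pooled = {quote: p["price"] * reference_usd[quote] for (_, quote), p in pairs.items()}

# Per-exchange approach: price MKR from its most liquid pair only.
(_, best_quote), best = max(pairs.items(), key=lambda kv: kv[1]["volume_usd"])
mkr_usd = best["price"] * reference_usd[best_quote]

# The local ETH price is then derived from MKR/ETH instead of being imported.
eth_usd_local = mkr_usd / pairs[("MKR", "ETH")]["price"]

print("pooled MKR prices:", pooled)
print("MKR on XYZ:", round(mkr_usd, 2), "ETH on XYZ:", round(eth_usd_local, 2))
```

With one anchor price per exchange, every coin on the exchange receives a single USD price, and the cross rates between coins stay those actually traded on that exchange.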

This problem can be easily illustrated. As per CoinMarketCap, Tether (USDT) price on BITTREX differs by over 3% between trading pairs, while transaction fee is only 0.25% and trading volume is substantial. That is clearly not possible — such deviation would be immediately exploited by arbitrageurs.

Source: CoinMarketCap. Price difference may fluctuate

3. How CMC may be calculating reference prices

When all trading pairs are pooled, reference prices are the major input: they have a huge effect on the prices of all other coins.

Although we have already shown that pooling is wrong, let’s discuss how reference prices may be calculated. That helps in grasping the intricacies of pricing cryptos.

3.1. Option 1 — step by step propagation

Step 1. First calculate volume weighted average USD price of Bitcoin across BTC/fiat trading pairs (Bitcoin has the largest number of pairs to fiat currencies)

Step 2. Calculate USD prices for trading pairs to BTC or fiat (coin/BTC or coin/fiat). BTC’s price is the reference price. Then calculate volume weighted average prices for these coins. Not all coins have trading to BTC or fiat. Therefore, not all coins are priced so far

Step 3. Calculate USD prices for coins that have trading pairs to coins priced in Step 2. Coins that were priced in Step 2 provide reference prices in Step 3

Step 4. Continue such iterations until all cryptos receive USD price

Note that far from all BTC trading pairs are used to price Bitcoin under this option. The pair sample is limited for all coins except those priced in the last step.

Consider an example for this procedure. Assume we pooled the following pairs from all exchanges: BTC/USD, ETH/BTC, ETH/USD, LTC/BTC, LTC/USD, LTC/ETH, MKR/ETH, MKR/LTC, ADA/MKR

  1. Find BTC price in USD using pair BTC/USD
  2. Find ETH and LTC prices in USD using ETH/BTC, ETH/USD and LTC/BTC, LTC/USD
  3. Find MKR price using MKR/ETH, MKR/LTC
  4. Find ADA price using ADA/MKR

Option 1 — step by step propagation

Pair LTC/ETH is not used at all. Pairs ETH/BTC and LTC/BTC are not used to price Bitcoin. Pairs LTC/ETH and MKR/LTC are not used to price Litecoin, etc.
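The propagation order can be traced with a small Python sketch. It only tracks which pairs would price which coin at each step (no prices are involved), and it assumes that a pair is usable only when its quote currency was priced in an earlier step. The function name `propagate` is ours, not something taken from CMC.

```python
PAIRS = ["BTC/USD", "ETH/BTC", "ETH/USD", "LTC/BTC", "LTC/USD",
         "LTC/ETH", "MKR/ETH", "MKR/LTC", "ADA/MKR"]

def propagate(pairs, fiat=("USD",), anchor="BTC"):
    """Return the pairs used at each step and the pairs never used."""
    split = [tuple(p.split("/")) for p in pairs]

    # Step 1: the anchor coin is priced from its fiat pairs only.
    first = [f"{b}/{q}" for b, q in split if b == anchor and q in fiat]
    steps = [{anchor: first}]
    priced = set(fiat) | {anchor}
    used = set(first)

    # Next steps: price coins whose quote currency was priced earlier.
    while True:
        step = {}
        for base, quote in split:
            if base not in priced and quote in priced:
                step.setdefault(base, []).append(f"{base}/{quote}")
        if not step:
            break
        steps.append(step)
        priced |= set(step)  # newly priced coins become references next round
        for used_pairs in step.values():
            used.update(used_pairs)

    unused = [p for p in pairs if p not in used]
    return steps, unused

steps, unused = propagate(PAIRS)
for number, step in enumerate(steps, start=1):
    print(number, step)
print("unused:", unused)
```

Running it on the pooled pairs reproduces the four steps listed above and leaves LTC/ETH unused.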

3.2. Option 2 — iteratively recalculate prices for skipped pairs

Step 1. Similar to Step 1 of option 1

Step 2. Similar to Step 2 of option 1

Step 3. Recalculate prices of coins from Step 2 using all available pairs (only BTC price is fixed), find updated reference prices

Step 4. Calculate prices of remaining coins

Step 5. Recalculate again

Step 6. Continue such iterations until all cryptos receive USD price and all pairs are used

Under Option 1 many pairs are not accounted for. Option 2 tries to fix the problem. Let’s consider the example described above using Option 2 approach:

  1. Similar to Step 1 of option 1
  2. Similar to Step 2 of option 1
  3. Recalculate ETH and LTC price using LTC/ETH pair not used in Step 2. One can fit ETH dollar price found in Step 2 into the equation and get LTC dollar price and LTC/ETH dollar volume. Now we can recalculate LTC average dollar price using three pairs. Then one can fit LTC dollar price found in Step 2 (or even a recently recalculated LTC price) into LTC/ETH equation and get ETH dollar price and new trading volume from this pair
  4. Price MKR
  5. Recalculate LTC and ETH price again
  6. Price ADA

Option 2 — iteratively recalculate prices for skipped pairs

In Step 3 we generated a wonderful randomness by arriving at two different USD trading volumes for the LTC/ETH trading pair, which is colored red. The first is obtained by using ETH to find dollar values of the LTC/ETH pair, the second by using LTC to find dollar values of this pair.
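A minimal numeric sketch of this effect, with entirely hypothetical numbers: the same LTC/ETH pair gets two different dollar volumes depending on which leg is used as the reference.

```python
# All numbers are hypothetical and only illustrate the mechanics.
ltc_eth_price = 0.30          # ETH paid per 1 LTC on the LTC/ETH pair
ltc_eth_volume_ltc = 1_000.0  # 24-hour volume of the pair, in LTC

eth_usd_ref = 90.0  # ETH reference price found in Step 2
ltc_usd_ref = 25.6  # LTC reference price found in Step 2

# Leg 1: ETH is the reference, so the pair implies an LTC dollar price.
ltc_usd_via_eth = ltc_eth_price * eth_usd_ref
volume_usd_via_eth = ltc_eth_volume_ltc * ltc_usd_via_eth

# Leg 2: LTC is the reference, so the pair implies an ETH dollar price.
eth_usd_via_ltc = ltc_usd_ref / ltc_eth_price
volume_usd_via_ltc = ltc_eth_volume_ltc * ltc_usd_ref

print(round(volume_usd_via_eth, 2), round(volume_usd_via_ltc, 2))
```

The two volumes coincide only if the cross rate on the pair exactly matches the ratio of the two reference prices, which is generally not the case.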


We can make an infinite number of iterations. After each iteration we will arrive at different prices. We can pray for convergence, but even if that happens, the resulting prices are anything but a market average.

We can also modify options 1 and 2 by first fixing USD prices for several top-tier coins (not only BTC) or try iterating calculations in any step for any number of times. However, we will always either omit some pairs or generate randomness by iterating.

3.3. Option 3 — use previously calculated prices as reference

This is the worst option because prices become sticky and change slowly.



Assume we have only two pairs: BTC/USD and ETH/BTC. Let previous average price of BTC=$4,000, previous average price of ETH=$100. Current BTC/USD quote is $3,000, volume is $9,000 and ETH/BTC quote is 0.027 BTC, volume is 1 BTC.

From ETH/BTC=0.027 → BTC=$100/0.027=$3,703.7

Then the new volume weighted average BTC price is (3,000*9,000+3,703.7*(1*3,703.7))/(9,000+1*3,703.7)=$3,205.16

Average ETH price is 0.027*4,000=$108

This algorithm outputs BTC price of $3,205.16, while it should be $3,000. Outputs will eventually converge to actual prices, but that will take a lot of time.
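The arithmetic of this example can be checked with a short Python sketch. The variable names are ours; the inputs are exactly the figures given above.

```python
# Previously calculated average prices, used as reference.
prev_btc_usd, prev_eth_usd = 4000.0, 100.0

# Current market data from the example.
btc_usd_quote, btc_usd_volume_usd = 3000.0, 9000.0
eth_btc_quote, eth_btc_volume_btc = 0.027, 1.0

# BTC price implied by ETH/BTC when the stale ETH price is the reference.
implied_btc_usd = prev_eth_usd / eth_btc_quote
implied_volume_usd = eth_btc_volume_btc * implied_btc_usd

# Volume weighted average over the two pairs.
new_btc_usd = (
    btc_usd_quote * btc_usd_volume_usd + implied_btc_usd * implied_volume_usd
) / (btc_usd_volume_usd + implied_volume_usd)

# ETH price implied by ETH/BTC when the stale BTC price is the reference.
new_eth_usd = eth_btc_quote * prev_btc_usd

print(round(new_btc_usd, 2), round(new_eth_usd, 2))
```

The stale $100 ETH reference pulls the BTC average above the only directly observed quote of $3,000.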

3.4. Which option CMC is likely to use

We tried to reverse engineer CMC’s average BTC price from the data provided in the markets section of CMC’s Bitcoin page. We experimented with several approaches:

  • Include all markets except those marked as excluded from average price
  • Include only fiat markets
  • Combinations of these

None of the approaches gave us average BTC price that we saw on CoinMarketCap.

We did the same with a smaller coin — Maker. In this case average price appeared to be just volume weighted average as promised.

Thus, CMC’s approach looks like some variation of Option 2.

A Google spreadsheet with our calculations can be found here. Please try making your own calculations and let us know if you find any remarkable results.

4. THE SILICOIN coin price calculation methodology

(or how to calculate average coin prices properly)

Below we lay out methodology for calculating average prices that cures the problems described in Section 2 and Section 3 above.

This methodology is 100% transparent, robust and uses real transaction prices.

Input data for calculations is available on THE SILICOIN, so you can always verify our calculations. For example, see the Exchanges section of the Ethereum page for details of the Ethereum average price calculation.

4.1. Methodology summary



  1. **For each exchange calculate price of each coin traded on this exchange**
     1.1. Price of a coin on a particular exchange is established from the best trading pair, i.e. the pair that has the minimum number of steps to fiat and the maximum transaction volume
     1.2. For exchanges that do not have fiat trading use one top-tier coin to establish a link to fiat

  2. Calculate total trading volume of a coin on particular exchange as sum of volumes across all trading pairs with this coin on this exchange
  3. Calculate volume weighted average coin prices across all exchanges

4.2. Methodology results

Benefits of this methodology are clearly visible.

Let’s compare Tether (USDT) prices on CoinMarketCap and on THE SILICOIN.

There are only 2 exchanges that offer liquid USDT trading to USD: Kraken and Bittrex. USDT/USD prices on these exchanges indicate Tether’s true market price.

CMC arrives at inflated USDT prices on most trading pairs except USDT/USD markets that are priced directly from fiat. See screenshots below.

As of 18 Dec 2018 Source: CoinMarketCap

**With our methodology we arrive at stable prices across all exchanges.** Moreover, users can always verify that the average price of a cryptocurrency is calculated by multiplying Price by Adjusted share (see full methodology below for details).

As of 18 Dec 2018 Source: THE SILICOIN

4.3. Detailed methodology

We encourage you to be patient and invest your time in reviewing detailed methodology.

Step 1:

Retrieve raw data from exchanges

From each exchange receive the last price and 24-hour trading volume for available trading pairs (e.g. BTC/USD, XRP/ETH, etc.).

Step 2:

Find USD price and trading volume of coins on each exchange

Procedure of finding USD price of a coin on an exchange depends on whether this exchange offers trading to fiat or not.

This part is important, please pay attention to it. We provide calculation example below for more clarity.

2.1. If liquid fiat trading is available on the exchange — price all coins directly

2.1.1. Find USD prices of coins traded directly to fiat. Fiat currencies other than USD are converted to USD using Open Exchange Rates data

In case there are several trading pairs that can establish coin’s USD price in one step, pair with the highest trading volume is selected. See example below

2.1.2. Find USD prices of coins traded to coins priced at a previous step but not traded to fiat

2.1.3. Such step by step iterations continue until all coins on the exchange receive USD price

2.2. If no liquid trading to fiat is available on the exchange, use a base coin

2.2.1. We use USD price of an applicable base coin to find USD prices of other coins on this exchange

Base coins are top-tier coins that are extensively traded to fiat on other exchanges. Base coins are used to find USD price on exchanges with no fiat trading — base coins serve instead of fiat currency in this case.

Base coin is selected for each exchange separately — coin with the largest trading volume is selected. So far Bitcoin offers maximum trading volume on all exchanges with no fiat trading that are integrated by THE SILICOIN.

For example, Bitcoin serves as base coin on Binance.

Only one base coin may be selected for an exchange to assure price homogeneity as explained in Section 2 of this article.

2.2.2. The same algorithm as described in 2.1. is applied to find the most efficient path to the base coin for other coins

2.3. Calculate total trading volume for each coin on the selected exchange

Calculation example for Step 2 of THE SILICOIN methodology

Assume that exchange “XYZ” has the following trading pairs:

Trading data retrieved from imaginary exchange “XYZ”. USD/EUR retrieved from Open Exchange Rates


1. Price coins traded directly to fiat (item 2.1.1. of detailed methodology)

There are 3 pairs to fiat: BTC/USD, BTC/EUR, ETH/USD. Therefore, we can price BTC and ETH at this step

For BTC we have 2 pairs to fiat. Dollar volume of BTC/USD=$15,000,000; of BTC/EUR=10,000,000/0.88=$11,363,636. We select the pair with higher volume → BTC/USD. Last price of BTC/USD=$3,200 → that is BTC price on this exchange

For ETH we have only 1 pair to fiat. Therefore, ETH costs $85 on this exchange

We explain why we use the most liquid pair approach instead of averaging in Section 4.4 below.

2. Price coins that are not traded to fiat but traded to coins priced previously (item 2.1.2.)

For LTC we have 2 applicable pairs: LTC/BTC and LTC/ETH. Dollar volume of LTC/BTC=160*$3,200=$512,000, of LTC/ETH=3,500*$85=$297,500. We use pair with higher volume. Therefore, LTC costs 0.008*$3,200=$25.6

3. Calculate total USD trading volume for all coins (item 2.3.)

  • For BTC we have 3 pairs with following volumes: BTC/USD=$15mn, BTC/EUR=$11.36mn, LTC/BTC=$0.512mn. Total BTC volume is $26.872mn
  • For ETH we have 2 pairs with following volumes: ETH/USD=$6mn, LTC/ETH=$0.2975mn. Total ETH volume is $6.2975mn
  • For LTC we have two pairs with following volumes: LTC/BTC=$0.512mn, LTC/ETH=$0.2975mn. Total LTC volume is $0.8095mn
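The calculation example can be reproduced with the following Python sketch of Step 2. It is a simplified illustration, not our production code. Volumes are taken in the quote currency, as in the example. The last prices of BTC/EUR and LTC/ETH are not stated in the text, so the values below are hypothetical; they do not affect the result because neither pair is selected for pricing.

```python
FIAT_TO_USD = {"USD": 1.0, "EUR": 1 / 0.88}  # USD/EUR = 0.88 as in the example

# (base, quote, last price, 24-hour volume in quote currency)
# BTC/EUR and LTC/ETH last prices are hypothetical (not given in the text).
PAIRS = [
    ("BTC", "USD", 3200.0, 15_000_000.0),
    ("BTC", "EUR", 2816.0, 10_000_000.0),
    ("ETH", "USD", 85.0, 6_000_000.0),
    ("LTC", "BTC", 0.008, 160.0),
    ("LTC", "ETH", 0.30, 3_500.0),
]

def price_exchange(pairs, fiat_to_usd):
    usd = dict(fiat_to_usd)  # USD value of everything priced so far
    pricing_pair = {}
    while True:
        # For each unpriced coin keep the highest-volume pair whose quote
        # currency was priced in an earlier step.
        best = {}
        for base, quote, last, vol_quote in pairs:
            if base in usd or quote not in usd:
                continue
            vol_usd = vol_quote * usd[quote]
            if base not in best or vol_usd > best[base][0]:
                best[base] = (vol_usd, quote, last)
        if not best:
            break
        for base, (_, quote, last) in best.items():
            usd[base] = last * usd[quote]
            pricing_pair[base] = f"{base}/{quote}"

    # Total USD volume of a coin: sum over all pairs that contain it.
    volume = {}
    for base, quote, _, vol_quote in pairs:
        if quote not in usd:
            continue
        vol_usd = vol_quote * usd[quote]
        for coin in (base, quote):
            if coin not in fiat_to_usd:
                volume[coin] = volume.get(coin, 0.0) + vol_usd

    prices = {c: p for c, p in usd.items() if c not in fiat_to_usd}
    return prices, pricing_pair, volume

prices, pricing_pair, volume = price_exchange(PAIRS, FIAT_TO_USD)
print(prices, pricing_pair, volume, sep="\n")
```

The BTC total computed this way is slightly higher than the figure in the text, because the code keeps the unrounded EUR leg rather than $11.36mn.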

Illustration for calculation example:


Step 3:

Calculate market average price of coins

We combine data from all exchanges to find market average price of coins. Exchanges may be excluded from calculating average price for some coins under rules laid out below.

3.1. Market average price of a coin calculated as volume weighted average price (VWAP) across exchanges integrated by THE SILICOIN and where this coin is traded

The VWAP formula:

VWAP = Sum(P_i_ × WA_i_)

WA_i_ = W_i_ × adj_i_, rescaled so that all WA_i_ sum up to 100%

W_i_ = V_i_ / Sum(V_i_)

Where:

  • P_i_ — price of a coin on exchange i as determined in Step 2
  • V_i_ — total trading volume of a coin on exchange i as determined in Step 2
  • WA_i_ — weight of an exchange i after adjustments
  • adj_i_ — adjustments applied to raw weight of exchange i
  • W_i_ — raw weight of an exchange i
  • **Sum(V_i_)** — total trading volume of a coin across all integrated exchanges that are not excluded. In other words, it is the sum of individual V_i_
  • Exchange i — exchange integrated by THE SILICOIN and not excluded from calculation of totals for the selected coin

3.2. Exchange may be partially or fully excluded from calculation of VWAP and Sum(V_i_) of a coin under following conditions

3.2.1. Automated full exclusion from VWAP if coin serves as a base coin on this exchange (WA_i_ in the formula above is set to 0)

3.2.2. Automated full exclusion from both VWAP and Sum(V_i_) if 24-hour trading volume increased over 300% day-over-day and exceeds $100k (exchange is excluded from Sum(V_i_), therefore W_i_ and WA_i_ are set to 0)

3.2.3. Automated partial or full exclusion from VWAP if price of a coin on an exchange has large deviation from VWAP


In this case we adjust the weight of an exchange (W_i_ in the formula above). The adjustment applies linearly to price deviations from 50% to 100%. For a price deviation of 80%, WA_i_=W_i_*(1-(80%-50%)/(100%-50%))=W_i_*0.4. For price deviations above 100% WA_i_=0; for price deviations below 50% WA_i_=W_i_

3.2.4. Manual full exclusion. Automated exclusion algorithms are set to cut off extreme deviations. Smaller deviations may be detected manually and exchanges may be excluded from VWAP or from both VWAP and Sum(V_i_) for some coins

Note that WA_i_ are always rescaled to sum up to 100%


**Note on 3.2.2.** This mechanism protects against exchanges sending us erroneous volume and distorting the price. The $100k threshold is set in order not to cut off coins with low trading volume that tend to exhibit high volume volatility.
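A minimal sketch of the volume outlier check from item 3.2.2. follows. We assume here that an "increase over 300%" means the day-over-day change (current minus previous, divided by previous) is above 3.0, and that the $100k threshold applies to the current 24-hour volume. The function name, parameters, and the handling of a missing previous volume are illustrative assumptions.

```python
def is_volume_outlier(prev_volume_usd, curr_volume_usd,
                      max_increase=3.0, min_volume_usd=100_000.0):
    """Return True if the exchange should be excluded for this coin."""
    if prev_volume_usd <= 0:
        # Assumption: without a baseline we do not flag the exchange.
        return False
    increase = (curr_volume_usd - prev_volume_usd) / prev_volume_usd
    return increase > max_increase and curr_volume_usd > min_volume_usd

print(is_volume_outlier(200_000.0, 1_000_000.0))  # +400%, above $100k
print(is_volume_outlier(10_000.0, 90_000.0))      # +800%, but below $100k
print(is_volume_outlier(200_000.0, 500_000.0))    # +150%
```

Only the first case meets both conditions. The second shows why the $100k floor matters: a thinly traded coin can jump many times over without being cut off.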


**Note on 3.2.3.** We first calculate VWAP using W_i_. Then we compare prices on each exchange with VWAP and implement adjustments to W_i_ according to the formula. Next we recalculate VWAP using WA_i_. Exchanges that dominate by trading volume are unlikely to be penalized. This procedure occurs only once.
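The one-pass procedure from this note can be sketched in Python as follows. We assume that the price deviation is measured as the absolute difference from the first-pass VWAP divided by that VWAP. The exchange prices and volumes in the usage example are hypothetical, and the function names are ours.

```python
def weight_factor(deviation, lo=0.5, hi=1.0):
    """Linear penalty: 1 below `lo`, 0 above `hi` (0.8 gives 1 - 0.3/0.5 = 0.4)."""
    factor = 1.0 - (deviation - lo) / (hi - lo)
    return min(1.0, max(0.0, factor))

def adjusted_vwap(prices, volumes):
    """Return (VWAP, raw weights W_i, adjusted weights WA_i)."""
    total = sum(volumes)
    raw_w = [v / total for v in volumes]

    # First pass: VWAP with raw weights.
    vwap0 = sum(p * w for p, w in zip(prices, raw_w))

    # Penalize exchanges whose price deviates strongly from the first pass.
    # Assumption: deviation = |P_i - VWAP| / VWAP.
    adj_w = [w * weight_factor(abs(p - vwap0) / vwap0)
             for p, w in zip(prices, raw_w)]

    # Rescale adjusted weights so that they sum up to 100%.
    scale = sum(adj_w)
    if scale == 0:
        raise ValueError("all exchanges were excluded")
    adj_w = [w / scale for w in adj_w]

    # Second and final pass: VWAP with adjusted weights.
    vwap = sum(p * w for p, w in zip(prices, adj_w))
    return vwap, raw_w, adj_w

# Hypothetical coin traded on three exchanges; the third one is off-market.
vwap, raw_w, adj_w = adjusted_vwap([100.0, 102.0, 400.0], [900.0, 900.0, 20.0])
print(round(vwap, 2), [round(w, 4) for w in adj_w])
```

In this example the small off-market exchange deviates by more than 100% from the first-pass VWAP, so its adjusted weight drops to zero and the final price is set by the two large exchanges.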

4.4. Why use the most liquid trading pair to calculate coin price on an exchange

As described in Step 2 of the methodology we use trading pair with the highest volume to determine price of a coin on an exchange at each step.

Alternatively, we can use average price across all pairs available in one step (both BTC/USD and BTC/EUR in the example above).

Below we explain why using the most liquid pair is better than averaging.

Price of a coin within one exchange may vary across trading pairs for the following reasons:

  1. Bid-ask spread
  2. Transaction fees (Arbitrageurs may be unable to eliminate price deviations because of fees. Such deviations move stochastically and should have mathematical expectation of zero)
  3. No trading

These inefficiencies obscure the true price, i.e. the price at which users can transact on the selected exchange.

The most liquid pair provides more precise price than averaging:


1. The top volume pair has the narrowest bid-ask spread. By accounting for lower volume pairs we will widen the spread

We use the last price for calculations, and it may fall on either the buy or the sell side of the spread. Averaging may theoretically yield the mid price. However, when the market is dominated by buying or by selling, the last price for each pair will be on the same side of the bid-ask interval most of the time. Therefore, averaging will widen the spread

2. Lower volume pairs are more likely to exhibit inefficiencies. On lower volume pairs, an off-market price may hold for some time. Assume arbitrageurs see mispricing on a low volume pair and try to eliminate it, but no one is buying or selling on the other side. The highest volume pair is less prone to such distortions

3. It is just more straightforward, and people are more likely to be able to transact at prices of the most liquid pairs

4.5. How to observe all inputs used for calculation of average prices

  1. Contribution of each exchange to average price and total trading volume of a coin:
  • You would find exchange weights for each coin in “Exchanges” section of the dedicated page of selected coin. For example, Bitcoin page or Ethereum page
  • We provide W_i_ from the VWAP formula in “Market share” column of the table and WA_i_ in “Adjusted share” column
  • If Market share = 0, then exchange is excluded from both VWAP and Sum(V_i_) calculation for selected coin (footnotes 3 or 5, details below)
  • If Adjusted share = 0, then exchange is included in Sum(V_i_) but excluded from VWAP for selected coin (footnotes 1, 2 or 4)

2. Adjustments and outlier cleaning mechanisms applied to exchanges:

  • If any conditions described in Section 3.2. of the methodology apply to an exchange, a footnote will appear near the name of this exchange in the “Exchanges” section of the coin page. Below is the legend:
  • No footnote — fully included in calculations of aggregates
  • 1 — Base coin (see item 3.2.1. of the methodology)
  • 2 — Price outlier (see 3.2.3.)
  • 3 — Volume outlier (see 3.2.2.)
  • 4 — Manually excluded from average price (see 3.2.4.)
  • 5 — Manually excluded from average price and total volume (see 3.2.4.)

3. What trading pair was used to establish USD price of a coin on each exchange

  • In “Exchanges” section of any coin page you can find “Pricing pair” column that contains trading pairs used to establish USD price of the selected coin on each exchange and a link to historical trading data of this pair
  • Base coin in Pricing pair column means that selected coin was used as a base coin on this exchange. Adjusted share will be automatically set to 0% as described in item 3.2.1 of the methodology. You can observe it on Bitcoin page

Did we miss something you’d like to know?

5. Conclusion

  • Most if not all crypto price aggregators use incorrect and nontransparent price calculation methodology
  • We propose a step by step methodology that has no grey areas and yields reliable average prices
  • All inputs and outputs used in our methodology are readily available on THE SILICOIN

With this article we aim to increase transparency of the crypto space.

It is the first article in the series devoted to financials of cryptocurrency ecosystem. Parts 2 and 3 will follow.

We’d like to reiterate the advice we gave in our post about cryptocurrency news sources — be critical about all information you consume.


Open discussion is the best way to achieve consensus. Please address your questions and express your thoughts in the comments below.


Transparently yours,
THE SILICOIN team