Since I'm going to be writing a whole bunch of articles about CDN speeds, I wanna consolidate my testing methodology into a single article. I'm going to be linking back to this article a lot.
Questions to be answered in this article:
- What numbers are measured for CDN download speed?
- Why were those numbers selected?
- How are the numbers measured?
- Where is the measurement done? Cloud server or home computer?
The three important measurements to make
At the end of the day, a CDN is really just a massively distributed caching layer. That's all it is.
And if you're familiar at all with how caching layers work, there are 3 all-important metrics to measure:
- Cold cache latency: item is not in cache at all
- Hot cache latency: item is in the fastest response layer of cache
- Warm cache latency: item is at a slower response layer of cache
To illustrate this in hardware terms: the server responding from RAM is the fastest possible latency. That's a hot cache. Getting it from disk (SSD) is the second-fastest possible latency. And you can add more layers (HDD). Finally, the cold cache is when the server doesn't have the item at all.
With regard to a CDN, how do we measure hot, cold, and warm cache?
Cold cache latency: You just fetch a file that the CDN has never seen before. This is going to take the longest time. On CloudFront, for example, I'm seeing a drop from 1.2 seconds on the cold cache to 0.232 seconds on the hot cache.
Hot cache latency: You can fetch the same file 10 times in a row and take the average.
Warm cache latency: Trickier. Much trickier. We must wait some amount of time for the file to leave the hot cache. For now, we are sticking with 30 minutes, but we may adjust this as we learn how CDNs move items from hot cache to warm cache. The warm cache result is also much more ambiguous, because it has to be measured some time after the cold and hot cache tests, when network conditions may have changed.
You can see the cold cache is much slower than the hot cache by running this curl:
curl -w "@curl-format.txt" -o tmp -s "https://d3va53q3li7xt1.cloudfront.net/wp-content/uploads/2021/05/shoeb-1024x576.png"
This example uses CloudFront. There is a massive jump from 1+ seconds to 0.1 to 0.2 seconds.
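To make the three measurements concrete, here is a minimal Python sketch of the cold, hot, and warm sequence described above. It is not the actual test script: the names `measure_cache_tiers`, `timed_download`, and the `fetch` callable are assumptions made up for illustration, and `timed_download` uses the standard library instead of curl, so its numbers will not match curl's exactly.

```python
import time
import urllib.request
from statistics import mean
from typing import Callable, Dict


def timed_download(url: str) -> float:
    """Download `url` once and return the elapsed seconds (hypothetical helper)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        response.read()  # read the full body so the download really happens
    return time.perf_counter() - start


def measure_cache_tiers(
    fetch: Callable[[], float],
    hot_requests: int = 10,
    warm_wait_seconds: float = 30 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, float]:
    """Return cold, hot, and warm timings in seconds.

    `fetch` downloads the file once and returns the elapsed time.
    """
    # Cold: the very first request for a file the CDN has never seen.
    cold = fetch()
    # Hot: repeat the request right away and average the timings.
    hot = mean(fetch() for _ in range(hot_requests))
    # Warm: wait for the file to age out of the fastest layer, then fetch once.
    sleep(warm_wait_seconds)
    warm = fetch()
    return {"cold": cold, "hot": hot, "warm": warm}
```

A call would look like `measure_cache_tiers(lambda: timed_download(url))`, where `url` points at a file the CDN has not served before. The `sleep` parameter is only there so the 30 minute wait can be swapped out when trying the sketch.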
How to measure the 3 measurements: my open-sourced Python script (check GitHub)
I used a Python script to automate the measurements. This keeps the measurements consistent between CDNs and removes human error.
It is open-sourced here: https://github.com/speedtestdemon/speed-tests/blob/master/test.py.
There is a massive drop in CDN download time from the cold cache to the hot cache. CloudFront gave me 1.09 seconds for cold cache, then 0.112 seconds for hot cache, then 0.226 for the warm cache.
Here is the full Python output. The cool thing is that it prints out the headers it got for the cold cache and the warm cache. The headers are used to sanity check that the cold cache and warm cache tests are doing the correct thing.
-------------------------------------------------------------
Testing "**Cold cache speed**"
-------------------------------------------------------------
Got headers:
HTTP/2 200
**content-type** : image/png
**content-length** : 719983
**date** : Sun, 20 Jun 2021 21:47:06 GMT
**last-modified** : Mon, 07 Jun 2021 00:16:21 GMT
**etag** : "52ae2ff2354d4a68e680b77b4da58985"
**accept-ranges** : bytes
**server** : AmazonS3
**x-cache** : Miss from cloudfront
**via** : 1.1 c39432c353feb02b03735f3850e19107.cloudfront.net (CloudFront)
**x-amz-cf-pop** : IAH50-C1
**x-amz-cf-id** : NgkCqcrwb3K65LeGu7uhebFNODrNI9s8wVeHZ93lq2XKrE3q9PMm-A==
**time\_namelookup** : 0.06327399999999999691
**time\_connect** : 0.01762099999999999778
**time\_appconnect** : 0.05910300000000001663
**time\_pretransfer** : 0.00010199999999999099
**time\_redirect** : 0.00000000000000000000
**time\_starttransfer** : 0.95064099999999995827
**time to download** : 0.00008300000000005525
**time\_total** : 1.09082400000000001583
-------------------------------------------------------------
Testing "**Hot cache speed**"
-------------------------------------------------------------
10 requests done.
Average:
**time\_namelookup** : 0.00190460000000000000
**time\_connect** : 0.02373660000000000006
**time\_appconnect** : 0.06306589999999999419
**time\_pretransfer** : 0.00019590000000000024
**time\_redirect** : 0.00000000000000000000
**time\_starttransfer** : 0.02317680000000000434
**time to download** : 0.00007860000000000089
**time\_total** : 0.11215840000000001919
-------------------------------------------------------------
Testing "**Warm cache speed**"
-------------------------------------------------------------
Sleeping for 0.5 hr to move cache from hot to warm
Got headers:
HTTP/2 200
**content-type** : image/png
**content-length** : 719983
**last-modified** : Mon, 07 Jun 2021 00:16:21 GMT
**accept-ranges** : bytes
**server** : AmazonS3
**date** : Mon, 21 Jun 2021 00:09:46 GMT
**etag** : "52ae2ff2354d4a68e680b77b4da58985"
**x-cache** : Hit from cloudfront
**via** : 1.1 9b59bfec44582f64d3d8dac9fb7d27b7.cloudfront.net (CloudFront)
**x-amz-cf-pop** : DFW50-C1
**x-amz-cf-id** : hzwVoHfaHen2TR3cCNRsnwniXMc3_BaOWk7oa2DiQaWkioXqwSGRrg==
**time\_namelookup** : 0.11359600000000000253
**time\_connect** : 0.01640300000000000091
**time\_appconnect** : 0.07292699999999999183
**time\_pretransfer** : 0.00030900000000000372
**time\_redirect** : 0.00000000000000000000
**time\_starttransfer** : 0.02220300000000000051
**time to download** : 0.00011099999999999999
**time\_total** : 0.22554899999999999949
Please note that these metrics, like time_namelookup, do not have the same meaning as curl's time_namelookup. curl reports cumulative times, so each value is always larger than the one before it. I want to see how long each stage took on its own, so curl's cumulative, ever-increasing timestamps were not helpful. The numbers above are the time spent in each stage.
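The conversion from curl's cumulative checkpoints to per-stage durations can be sketched like this. It is an illustration, not the code from the script: `per_stage_durations` and `STAGES` are hypothetical names, time_redirect is left out for brevity, and the handling of a zero time_appconnect (plain HTTP, no TLS step) is my own assumption.

```python
from typing import Dict

# curl's cumulative checkpoints, in the order they occur during a request.
STAGES = [
    "time_namelookup",
    "time_connect",
    "time_appconnect",
    "time_pretransfer",
    "time_starttransfer",
]


def per_stage_durations(cumulative: Dict[str, float]) -> Dict[str, float]:
    """Turn curl's cumulative timings into the time spent in each stage."""
    durations = {}
    previous = 0.0
    for name in STAGES:
        value = cumulative.get(name, 0.0)
        if value > 0.0:
            # Time spent in this stage is the gap since the last checkpoint.
            durations[name] = value - previous
            previous = value
        else:
            # A checkpoint curl never reached, e.g. no TLS handshake on HTTP.
            durations[name] = 0.0
    total = cumulative["time_total"]
    # Download time: from the first byte received until the transfer ends.
    durations["time to download"] = total - previous
    durations["time_total"] = total
    return durations
```

Apart from time_total, the per-stage values add back up to curl's time_total, which is a handy sanity check on the conversion.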
Common pitfalls of speed testing CDNs
These are some mistakes I made while speed-testing CDNs. These are easy mistakes to make, so I decided to write about them.
Mistake #1: cold cache tests should be done where the CDN does not have the file cached.
Why it's an easy mistake to make: Most people think files are cached for only 24 hours. That's not necessarily true. Some CDNs like Jetpack CDN evidently cache it for more than 1 day (based on my tests). Additionally, most people's curl calls do not print the response headers, which carry the warning signs that the file could be cached.
How to fix: Check the headers returned. You should see the keyword "MISS" somewhere in there. If you see a "HIT," that's a warning sign. This is also why the Python script prints out the headers for the cold cache test and warm cache test.
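The header check can be automated with a small helper. This is a sketch under assumptions: `cache_status` is a made-up name, and it only looks at headers whose name contains "cache" (such as the x-cache header in the CloudFront output). Other CDNs may report cache status under different header names, so treat a `None` result as "unknown", not as a miss.

```python
import re
from typing import Dict, Optional


def cache_status(headers: Dict[str, str]) -> Optional[str]:
    """Return "HIT", "MISS", or None when no cache header gives a verdict."""
    for name, value in headers.items():
        # Only inspect cache-related headers to avoid accidental matches.
        if "cache" not in name.lower():
            continue
        match = re.search(r"\b(hit|miss)\b", value, flags=re.IGNORECASE)
        if match:
            return match.group(1).upper()
    return None
```

For a cold cache test you would expect `cache_status(headers) == "MISS"`, and anything else is a reason to pick a fresh file and try again.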
Mistake #2: you need to actually download the file.
Why it's an easy mistake to make: for some reason, it is extremely easy to accidentally make curl skip the download of the actual file. There are at least 3 ways of doing this.
- If you set the "NOBODY" option to curl via pycurl, it does not download the file. I made this mistake in my Python script.
- If you set the "-O /dev/null" flag via curl in the command line, it does not download the file. You would think it would just download the file and dump it to /dev/null. No. curl is "smarter" than that and just skips the download altogether.
- If you set the "-I" flag via curl in the command line, it does not download the file.
Why this is a serious mistake to avoid: CloudFront does not cache the file if you do not download the file! And since CloudFront is an industry-standard CDN, it is likely other CDNs have the same behavior.
How you can detect this mistake: Look at the download time (time_total minus time_starttransfer). If it is in the hundreds of microseconds, that's a problem. The speed of light is only about 0.186 miles per microsecond. If you got only 200 microseconds, that is about 37 miles... roundtrip, or under 19 miles one way. It is highly unlikely there's a data center that close to where you are. It means that there was no network transfer.
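The arithmetic can be written out as a quick calculation. This sketch uses the vacuum speed of light, roughly 0.186 miles per microsecond; the function name `max_one_way_miles` is made up for illustration. Real signals in fiber travel slower than this, so the true distance limit is even shorter.

```python
# Roughly 186,282 miles per second, expressed per microsecond.
SPEED_OF_LIGHT_MILES_PER_US = 0.186


def max_one_way_miles(download_seconds: float) -> float:
    """Farthest a server could be if the round trip took `download_seconds`."""
    microseconds = download_seconds * 1_000_000
    round_trip_miles = microseconds * SPEED_OF_LIGHT_MILES_PER_US
    return round_trip_miles / 2


if __name__ == "__main__":
    # 200 microseconds of "download" time, as in the example above.
    print(f"{max_one_way_miles(0.0002):.1f} miles one way")
```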
How to fix: In pycurl, write the downloaded contents to a BytesIO or StringIO variable. To see how it's done, see my Python CDN speed test script. Or if you're using curl from the command line, avoid the "-O /dev/null" and "-I" flags and make sure the download time is at least several milliseconds.
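Here is a minimal sketch of the pycurl fix, assuming pycurl is installed. It is not copied from the speed test script: `download_with_pycurl` and `check_download` are hypothetical names, and the 1 millisecond threshold is just the rule of thumb from this article.

```python
from io import BytesIO
from typing import Dict, List


def download_with_pycurl(url: str) -> Dict[str, float]:
    """Download `url` into memory and report status, size, and download time."""
    import pycurl  # third-party; imported here so the sketch loads without it

    buffer = BytesIO()
    curl = pycurl.Curl()
    try:
        curl.setopt(pycurl.URL, url)
        # Writing the body into a buffer forces curl to really download it.
        curl.setopt(pycurl.WRITEDATA, buffer)
        curl.perform()
        total = curl.getinfo(pycurl.TOTAL_TIME)
        start_transfer = curl.getinfo(pycurl.STARTTRANSFER_TIME)
        status = curl.getinfo(pycurl.RESPONSE_CODE)
    finally:
        curl.close()
    return {
        "status": status,
        "bytes": len(buffer.getvalue()),
        "download_seconds": total - start_transfer,
    }


def check_download(result: Dict[str, float], min_download_seconds: float = 0.001) -> List[str]:
    """Return a list of warning signs that no real download took place."""
    problems = []
    if result["bytes"] == 0:
        problems.append("no body bytes received")
    if result["download_seconds"] < min_download_seconds:
        problems.append("download time is suspiciously short")
    return problems
```

An empty list from `check_download(download_with_pycurl(url))` means neither warning sign showed up. It does not prove the measurement is right, only that the body was received and took a plausible amount of time.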
Mistake #3: The input URL is invalid.
Why it's an easy mistake to make: Your curl speed test doesn't report any 404 error. So you get back the speed test results without realizing there's an error.
How to catch the mistake: One red flag you wanna look at is the "download" time. If it's less than 1 millisecond, that's way too fast. Do the math. The speed of light is about 0.186 miles per microsecond, and there are 1000 microseconds in a millisecond. If you're getting, let's say, 200 microseconds, that's only about 37 miles roundtrip, or under 19 miles one way. So 200 microseconds is way too fast and indicates there's probably no network transfer (i.e., no download) happening at all.
Of course, the other thing you could do is actually download the file... use wget. You'll see an error message pretty easily.
How to fix the mistake: Use the right URL.
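One way to catch a bad URL automatically is to have curl print the HTTP status code and check it before trusting the timings. The sketch below is an assumption-laden illustration, not part of the speed test script: the function names are made up, and it relies on curl's `%{http_code}` write-out variable and on curl being installed.

```python
import subprocess
from typing import List


def build_curl_command(url: str, output_path: str) -> List[str]:
    """curl command that saves the body and prints only the HTTP status code."""
    # -s: silent, -o: write the body to a file, -w: print the status code
    return ["curl", "-s", "-o", output_path, "-w", "%{http_code}", url]


def status_is_ok(status_text: str) -> bool:
    """True only for a 2xx status code as printed by curl."""
    text = status_text.strip()
    return text.isdigit() and 200 <= int(text) < 300


def url_is_valid(url: str, output_path: str = "tmp") -> bool:
    """Run curl once and report whether the URL answered with a 2xx status."""
    completed = subprocess.run(
        build_curl_command(url, output_path),
        capture_output=True,
        text=True,
        check=False,
    )
    return completed.returncode == 0 and status_is_ok(completed.stdout)
```

Running `url_is_valid(url)` once before the timed requests would flag a 404 instead of silently timing an error page.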
Mistake #4: The input URL is HTTP, not HTTPS.
Why it's an easy mistake to make: People are careless or perhaps not technical enough. This is actually an important thing to get right, as I'm noticing the SSL exchange normally takes 50 milliseconds. That is a big chunk, given that a fast curl time is only around 120 milliseconds (from a home computer, not a cloud provider).
How to fix the mistake: Be consistent in your testing. Use HTTPS for all or HTTP for all. And you should probably use HTTPS since that's the standard (everybody uses HTTPS).
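A quick guard against mixing HTTP and HTTPS is to check every test URL's scheme before running anything. This is a small sketch with a made-up function name, `urls_with_wrong_scheme`, and the second URL in the usage note is a placeholder, not a real test target.

```python
from typing import Iterable, List
from urllib.parse import urlsplit


def urls_with_wrong_scheme(urls: Iterable[str], required: str = "https") -> List[str]:
    """Return the URLs whose scheme is not `required`."""
    return [url for url in urls if urlsplit(url).scheme.lower() != required]
```

For example, passing the CloudFront URL from this article together with `http://cdn.example.com/image.png` returns only the second one, so you know which test URL to fix.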
Why I chose not to test on a normal web server like DigitalOcean
It turns out it's not a good idea to use DigitalOcean to run these CDN speed tests.
I initially thought it would be a good idea to run these CDN speed tests on a standardized server location like DigitalOcean. Since I travel a lot, I can't guarantee my internet will always be the same quality, or the location could be drastically different, such as different countries.
So that's what I did. I tried it out... and noticed DigitalOcean's internet speed is fast. REALLY fast. Like, way faster than my home internet speed. I was a little shocked. I didn't think download speeds could ever get that fast. It makes me wonder what I need to do to make my home internet as good as that because I really thought I had the best possible internet that money could buy (I basically use my computer 100 hours a week, OK. It's important to me).
If you wanna see an example of the drastic difference in CDN speed... well, here's one. It's using the CloudFront example that we're using in the "Free CDN" blog series. When I ran my speed test Python script with:
python3 test.py https://d3va53q3li7xt1.cloudfront.net/wp-content/uploads/2021/05/shoeb-1024x576.png
I got back a download time of a blazing fast 0.0225 seconds or 23 milliseconds. Holy cow. And then doing the same thing on my home internet, I only get 0.2315 seconds or 232 milliseconds. What the hell.
Out of curiosity, I ran an internet speed test on the DigitalOcean server. Like, how the hell is it that much faster? That's gotta get anybody's interest up. And would you believe the numbers???
# speedtest-cli
Retrieving speedtest.net configuration...
Testing from DigitalOcean (128.199.187.118)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by NewMedia Express (Singapore) [13.17 km]: 2.16 ms
Testing download speed................................................................................
Download: 1740.52 Mbit/s
Testing upload speed......................................................................................................
Upload: 1420.29 Mbit/s
LOL, holy crap, download speeds of 1740.52 Mbit/s and upload speeds of 1420.29 Mbit/s? How much money did DigitalOcean pay for that? I wanna buy it.
That is more than 10x as fast as my internet, which tops out at 160 Mbit/s (using fast.com).
So there are several reasons why I decided not to run these speed tests on DigitalOcean after all:
- It's hard to tell the difference between good and bad CDNs.
- It's not a real-world usage of how CDNs actually get used. Almost nobody will have internet speeds that fast, where it's nearly 10x faster than the most premium internet you can get.
BTW - a relevant article that also talks about the distorted internet speeds of cloud providers: how accurate are CDNPerf's numbers? Answer: not very accurate. TODO: I will link to this essay (rant) once I've written it!
Why the CDN speed tests need to be done at almost the same time
This is probably obvious to most internet users, but because I'm trying to measure CDNs as methodically as possible, I'll explain it briefly.
Internet speeds usually vary greatly depending on the time of the day. For example, during the evenings, there is typically massive internet congestion as people get off work and get leisure time. Video is particularly notorious for hogging a lot of bandwidth. For example, did you know Netflix alone hogs 40% of the internet traffic during the evenings? Google it. And that's just Netflix. Imagine if you added YouTube too. The evenings just get really congested with internet traffic.
And since we are focused on CDNs as the variable, we should make internet conditions invariant. This means running the CDN speed test for 2 CDNs at basically the same time (letās say within 1 minute is fine).
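The "same time" rule can be enforced in code by running the measurements back to back and refusing results that drifted too far apart. This is a sketch with hypothetical names (`measure_back_to_back`, the `measurements` mapping); each callable is assumed to run one measurement and return a time in seconds, and the 60 second default mirrors the "within 1 minute" rule of thumb above.

```python
import time
from typing import Callable, Dict


def measure_back_to_back(
    measurements: Dict[str, Callable[[], float]],
    max_spread_seconds: float = 60.0,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, float]:
    """Run one measurement per CDN and fail if they started too far apart."""
    results = {}
    started_at = {}
    for name, measure in measurements.items():
        started_at[name] = clock()
        results[name] = measure()
    # Gap between the first and last measurement start times.
    spread = max(started_at.values()) - min(started_at.values())
    if spread > max_spread_seconds:
        raise RuntimeError(
            f"measurements started {spread:.0f}s apart, "
            f"more than the allowed {max_spread_seconds:.0f}s"
        )
    return results
```

Raising an error instead of returning the numbers is a deliberate choice here: a comparison made under different network conditions is easy to mistake for a real difference between CDNs.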
Summary (TL;DR)
- I measure three CDN download speeds: cold, hot, and warm cache.
- The measurement is automated with a Python script (it's open-sourced, see it here: https://github.com/speedtestdemon/speed-tests/blob/master/test.py)
- CDN speed measurements on 2 or more CDNs are to be conducted at almost the same time (within about a minute of each other).
- CDN speed measurement is conducted on a home computer with home internet, not on the cloud where internet speeds are 10x home internet speeds (easily).
Questions? You can ask me on social media https://twitter.com/SpeedTestDemon
Also published on https://speedtestdemon.com/testing-methodology/.