CloudFlare's customers: Same old power law
This just in: Little guy gets screwed!
August 11, 2012
Ten years ago, the tubes were all atwitter about
traffic and the power law. Google's PageRank from 0 to 10, displayed
on the Google toolbar, used a log base that averaged between 4 and 6.
This meant that you needed about five times the number of links to
increase your PageRank by one digit, if everything else was equal. The
power law is not a
new idea; it has been observed in various natural phenomena. Look
at income distribution in dog-eat-dog capitalism, for example. The Occupy
Wall Street demonstrators are not exaggerating when they compare the one
percent to the 99 percent.
Nevertheless, it was new in cyberspace.
The Long Tail was hyped by the chief editor of Wired. He seemed to
argue that you should be delighted that you cannot sell anything from your
website due to low traffic. You are the wave of the future, so let's see a smile!
Along with Google's PageRank, the traffic-tracking site Alexa also used
a log scale. High traffic domains get ranked with lower numbers, so
that the number one domain on the web has the highest traffic. Years ago,
reports on actual traffic from various webmasters resulted in a plot of
Alexa rank against daily unique visitors (Alexa itself did not offer this
information). The formula for the curve that best fit these plots took
Alexa's traffic rank number, raised it to the power of -0.732, and
multiplied the result by 7 million. Even though this data is old, its age
mainly affects the multiplier, and the nasty curve remains the same. This
graph is plotted on a linear scale. If you were to re-plot it using an
optimized log base, it would be a straight line from upper left to lower right.
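To make the formula concrete, here is a minimal sketch in Python. The multiplier and exponent are the ones quoted above; the function name is made up for this illustration, and the old fit should not be read as a statement about current Alexa data.

```python
import math

# Constants quoted in the text; everything else here is illustrative.
MULTIPLIER = 7_000_000
EXPONENT = -0.732


def estimated_daily_visitors(alexa_rank):
    """Estimate daily unique visitors from an Alexa rank using the fitted curve."""
    if alexa_rank < 1:
        raise ValueError("Alexa rank starts at 1")
    return MULTIPLIER * alexa_rank ** EXPONENT


if __name__ == "__main__":
    # On a log scale, each tenfold step in rank moves the estimate by the
    # same amount (0.732), which is why the log-log plot is a straight line.
    for rank in (1, 10, 100, 1_000, 10_000, 100_000, 1_000_000):
        visitors = estimated_daily_visitors(rank)
        print(f"{rank:>9,}  {visitors:>12,.0f}  log10 = {math.log10(visitors):.3f}")
```

Under this fit, a rank ten times worse leaves a site with less than a fifth of the traffic, at every point along the curve.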
Here is a current graph from Alexa that compares cloudflare.com traffic
to whitehouse.gov traffic. Note that the 'Y' axis is plotted on a logarithmic
scale. You have to do this when plotting power-law phenomena in order to
make the graph readable, as a linear plot would squash much of the data to
one edge of the graph. One must keep in mind that it is not your usual
graph. For the casual reader, it presents a massive distortion.
Over the past two months, this site has collected 130,000 domain names that
have used CloudFlare at some point since late 2010. Within these, we found
86,846 with a direct-connect IP address. The numbers change weekly as our
database expands. This page will not update, however. Even when the
multiplier changes, with a large initial sample the curve will look the
same. If and when CloudFlare begins to enforce its terms of service or
adopts new admission policies, then it is worth another look. CloudFlare
itself claims nearly 500,000 domains, which means that our current sample
The idea to create this page came on August 6. I noticed in our daily
haul of domains showing DNS activity on CloudFlare's servers that 140
different domain names on that single day had the same direct-connect IP
address (220.127.116.11). This IP geolocates to Thailand. I recalled from
the graph in Prolexic Quarterly Global DDoS Attack Report - Q2 2012,
that Thailand has recently shown an amazing increase in activity. I also
knew from our own websites, which continue to deal with bots (about 35,000
per day) inherited from the days when Scroogle was functioning, that
lately the IP addresses from Thailand were showing the largest daily
numbers, much larger than China's! We finally blocked the entire country.
My first instinct was to look more closely at that one IP address. Then
I decided that this is not my problem. (It ought to be something that
CloudFlare does whenever customer activity exceeds certain parameters.) The
purpose of this website is to look at CloudFlare as a whole. That's when
I thought to look at the distribution of direct-connect IP addresses
across all 86,846 unique domains. Does the power law operate on
CloudFlare's customers? I thought it might, knowing that the hype was
for free DDoS protection, and that they accepted all comers.
Our homebrew software ranked the 37,017 unique IPs attached to the 86,846
unique domain names. Here is the top of the new list. The number in front
is the number of domains controlled by that IP:
1079 18.104.22.168 USA
633 22.214.171.124 USA
621 126.96.36.199 USA
603 188.8.131.52 VIRGIN ISLANDS, BRITISH
600 184.108.40.206 BAHAMAS
357 220.127.116.11 USA
339 18.104.22.168 USA
248 22.214.171.124 USA
247 126.96.36.199 USA
228 188.8.131.52 USA
199 184.108.40.206 THAILAND
184 220.127.116.11 USA
183 18.104.22.168 CANADA
161 22.214.171.124 USA
156 126.96.36.199 USA
139 188.8.131.52 USA
137 184.108.40.206 USA
132 220.127.116.11 USA
124 18.104.22.168 USA
115 22.214.171.124 UNITED KINGDOM
You can see that the top 0.054 percent of the unique IPs control 6,485
domains, which is 7.5 percent of all the domains. Here are the stats
at the long-tail end of CloudFlare:
Domains per IP in CloudFlare's long tail
1 domain per IP = 69 percent of IPs and 29 percent of domains
2 or more = 31 percent of IPs and 71 percent of domains
3 or more = 17 percent of IPs and 59 percent of domains
4 or more = 11 percent of IPs and 51 percent of domains
5 or more = 8 percent of IPs and 46 percent of domains
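For anyone who wants to reproduce this kind of tally, the counting can be sketched in a few lines of Python. This is not our homebrew software, only an illustration of the same idea. The function names and the sample data are made up, and it assumes each domain maps to exactly one direct-connect IP.

```python
from collections import Counter


def domains_per_ip(pairs):
    """Count how many distinct domains point at each IP.

    `pairs` is an iterable of (domain, ip) tuples; duplicates are ignored.
    """
    return Counter(ip for _domain, ip in set(pairs))


def share_at_least(counts, k):
    """Return (percent of IPs, percent of domains) for IPs with k or more domains."""
    total_ips = len(counts)
    total_domains = sum(counts.values())
    big = [n for n in counts.values() if n >= k]
    return 100 * len(big) / total_ips, 100 * sum(big) / total_domains


if __name__ == "__main__":
    # Made-up sample using documentation-only addresses and names.
    sample = [
        ("a.example", "192.0.2.1"),
        ("b.example", "192.0.2.1"),
        ("c.example", "192.0.2.1"),
        ("d.example", "192.0.2.2"),
        ("e.example", "192.0.2.3"),
    ]
    counts = domains_per_ip(sample)
    # Ranked list: number of domains first, then the IP.
    for ip, n in counts.most_common():
        print(n, ip)
    # Cumulative rows in the style of the table above.
    for k in (2, 3):
        ip_pct, dom_pct = share_at_least(counts, k)
        print(f"{k} or more = {ip_pct:.0f} percent of IPs and {dom_pct:.0f} percent of domains")
```

The "1 domain per IP" row is simply what is left over after subtracting the "2 or more" row from 100 percent.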
Yes, this is the power law. Does it matter? It wouldn't be important if
CloudFlare wasn't hyped to the hilt by Matthew Prince and friends. The
curve would tend to flatten out if CloudFlare regulated new admissions,
and especially if that meant charging a monthly rate for every domain.
Instead, it's a free service. This attracts domain squatters and affiliate
farmers on one end, and bloggers with cat pictures on the other end. Those
aren't necessarily bad, but it also seems that each end sports a high
incidence of riffraff and criminal activity. Good luck if you try to
complain to CloudFlare. They will tell you that they are not the hosting provider.
That's not quite true, and I doubt that a judge or jury would agree with
CloudFlare. They host the DNS and they cache content. If they yanked the
DNS on a domain, the traffic would drop to almost zero within a few
minutes. That would at least get the owner's attention.
Mr. Prince is much too tolerant of abuse, and it's the little guy who
gets screwed, whether he uses CloudFlare or not.
If you want to see the domains and IP addresses, our list is available at the
bottom of this page.
A zip file there can be downloaded and used with grep to research specific IPs or netblocks.
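Grep works well for a single IP or for a netblock that ends on a dot boundary. For other netblocks, a short Python sketch like the one below may be easier. It assumes the unzipped list has one entry per line with the domain first and the IP second, separated by whitespace. That layout is a guess for illustration, so adjust the field positions to match the actual file. The function name is hypothetical.

```python
import ipaddress


def domains_in_netblock(lines, cidr):
    """Yield (domain, ip) pairs whose IP falls inside the given CIDR block.

    Assumes each line looks like "domain ip"; this layout is an assumption.
    """
    network = ipaddress.ip_network(cidr, strict=False)
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        domain, ip_text = fields[0], fields[1]
        try:
            ip = ipaddress.ip_address(ip_text)
        except ValueError:
            continue  # second field is not an IP address
        if ip in network:
            yield domain, ip_text


if __name__ == "__main__":
    # Made-up sample lines using documentation-only addresses.
    sample = [
        "a.example 192.0.2.10",
        "b.example 198.51.100.7",
        "c.example 192.0.2.200",
    ]
    for domain, ip in domains_in_netblock(sample, "192.0.2.128/25"):
        print(domain, ip)
```

To run it against a real file, pass the open file object in place of the sample list.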