Those Damned Statistics

One of my biggest pet peeves in life is the misuse of statistics. I am a math guy and I sometimes tackle math problems just for the fun of it. I understand statistics pretty well and my firm performs surveys. I think I disappoint a lot of my clients when I try to stop them from interpreting the results of a survey to prove something that the responses really don’t prove. Surveys are a really useful tool, but too often I see survey results used to support untruthful conclusions.

A week ago the NTIA (National Telecommunications and Information Administration) released their latest poll looking at broadband usage in the US. The survey asked a lot of good questions and some of the results are very useful. For example, they show that overall broadband penetration in the US is up to 72% of households. But even that statistic is suspect, as I will discuss below.

The problem with this survey is that they didn’t ask the right questions, and this largely invalidates the results. The emphasis of this particular survey was to look at how people use cellphones for data access. And so they asked about the various activities that people now use their phones for, such as browsing the web or reading email. And as one would expect, more people are using their cellphones for data, largely due to the widespread introduction of smartphones over the last few years.

There is nothing specifically wrong with any of the individual results. For example, the report notes that 42% of phone users browse the web on their phone compared to 33% in 2011. I have no doubt that this is true. It’s not the individual statistics that are a problem, but rather the way the statistics were used to reach conclusions. In reading this report one gets the impression that cellphone data usage is just another form of broadband and that using your cellphone to browse the web is more or less the same as browsing off a wired broadband connection.

The worst example of this is in the main summary where the NTIA concluded that “broadband, whether fixed or mobile, is now available to almost 99% of the U.S. population”. This implies that broadband is everywhere and with that statement the NTIA is basically patting themselves on the back for a job well done. But it’s a load of bosh and I expect better from government reports.

As I said, the main problem with this report is that they didn’t ask the right questions, and so the responses can’t be trusted. Consider data usage on cellphones. In the first paragraph of the report they conclude that the data usage on cellphones has increased exponentially and is now deeply ingrained in the American way of life. The problem I have with this conclusion is that they are implying that cellphone data usage is the same as the use of landline data – and it is not. The vast majority of cell phone data is consumed on WiFi networks at work, home or at public hot spots. And yes, people are using their cellphones to browse the web and read email, but most of this usage is carried on a landline connection and the smartphone is just the screen of choice.

Cellular data usage is not growing exponentially, or maybe just barely so. Sandvine measures data usage at all of the major Internet POPs and they show that cellular data is growing at about 20% a year, or doubling every five years, while landline data usage is doubling every three years. I trust the Sandvine data because they look at all of the usage that comes through the Internet and not just at a small sample. The cell carriers have trained us well to go find WiFi. Sandvine shows that on average a landline connection today uses almost 100 times more data than a cellphone connection. This alone proves that cellphones are no substitute for a landline.
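
The relationship between an annual growth rate and a doubling time is simple arithmetic, and it is worth checking whenever growth figures get quoted. The sketch below assumes steady compound growth, which is my assumption and not something stated about the Sandvine figures, and the function names are made up for illustration.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years to double at a constant compound annual growth rate (0.20 = 20%)."""
    if annual_growth <= 0:
        raise ValueError("annual_growth must be positive")
    return math.log(2) / math.log(1 + annual_growth)


def annual_growth_for_doubling(years: float) -> float:
    """Constant compound annual growth rate that doubles usage in `years`."""
    if years <= 0:
        raise ValueError("years must be positive")
    return 2 ** (1 / years) - 1


if __name__ == "__main__":
    print(f"20% a year doubles in {doubling_time_years(0.20):.1f} years")
    print(f"Doubling in 5 years is {annual_growth_for_doubling(5):.1%} a year")
    print(f"Doubling in 3 years is {annual_growth_for_doubling(3):.1%} a year")
```

Under compounding, 20% a year doubles in a little under four years, and doubling in five years works out to roughly 15% a year; with simple, non-compounded growth, 20% a year does double in five. Either way, the cellular figure is slower than landline usage doubling every three years, which is about 26% a year.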

I have the same problem with the report when it quantifies the percentage of households on landline broadband. The report assumes that anybody with a cable modem or DSL has broadband, and we know that in large parts of the country having a connection is not the same thing as having broadband. The report does not count dial-up as broadband, but when it says that 72% of households have landline broadband, what it really means is that 72% of homes have a connection that is faster than dial-up.

I just got a call yesterday from a man on the eastern shore of Maryland. He lives a few miles outside of a town and he has a 1 Mbps DSL connection. The people a little further out than him have even slower DSL or can only get dial-up or satellite. I get these kinds of calls all of the time from people wanting to know what they can do to get better broadband in their community.

I would challenge the NTIA to go to rural America and talk to people rather than stretching the results of a survey to mean more than it does. I would like them to tell the farmer that is trying to run a large business with only cellphone data that he has broadband. I would like them to tell the man on the eastern shore of Maryland that he and his neighbors have broadband. And I would like them to tell all of the people who are about to lose their copper lines that cellular data is the same as broadband. Because in this report that is what they have told all of us.

Should You Be Peering?

Google-branded refrigerator (Google Refrigerator) (Photo credit: Aray Chen)

No, this is not an invitation for you to become peeping toms, dear readers. By peering I am talking about the process of trading Internet traffic directly with other networks to avoid paying to transport all of your Internet traffic to the major Internet POPs.

Peering didn’t always make a lot of sense, but there has been a major consolidation of web traffic to a few major players that has changed the game. In 2004 there were no major players on the web and internet traffic was distributed among tens of thousands of websites. By 2007 about 15,000 networks accounted for about half of all of the traffic on the Internet. But by 2009 Google took off and it was estimated that they accounted for about 6% of the web that year.

And Google has continued to grow. At the beginning of this year a number of industry experts estimated that Google carried 25% to 30% of all of the traffic on the web. But on August 16 Google went down for about 5 minutes and we got a look at the real picture. A company called GoSquared Engineering tracks traffic on the web worldwide, and when Google went down they saw an instant 40% drop in overall web traffic. [Graph: Google’s downtime caused a 40% drop in global traffic]

And so, when Google went dead for a few minutes, they seem to have been carrying about 40% of the web traffic at the time. Of course, the percentage carried by Google varies by country and by time of day. For example, in the US, Sandvine, a company that sells Internet tracking systems, estimates that Netflix uses about 1/3 of the US Internet bandwidth between 9 P.M. and midnight in each time zone.

Regardless of the exact percentages, it is clear that a few networks have grabbed enormous amounts of web traffic. And this leads me to ask my clients whether they should be peering. Should they be trying to hand traffic directly to Google, Netflix or others to save money?

Most carriers have two major cost components to deliver their Internet traffic: transport and Internet port charges. Transport is just that, a fee that is often mileage-based and that pays for getting across somebody else’s fiber network to get to the Internet. The port charges are the fees that are charged at the Internet POP to deliver traffic into and out of the Internet. For smaller ISPs these two costs might be blended together in the price you pay to connect to the Internet. So the answer to the question is that anything that can produce a net lowering of one or both of these charges is worth considering.

Following is a short list of ways that I see clients take advantage of peering arrangements to save money:

  • Peer to Yourself. This is almost too simple to mention, but not everybody does this. You should not be paying to send traffic to the Internet that goes between two of your own customers. This is sometimes a fairly significant amount of traffic, particularly if you are carrying a lot of gaming or have large businesses with multiple branches in your community.
  • Peer With Neighbors. It also sometimes makes sense to peer with neighbors. These would be your competitors or somebody else who operates a large network in your community, like a university. Again, there is often a lot of traffic generated locally because of local commerce. And the amount of traffic between students and a university can be significant.
  • Peering with the Big Data Users. Finally, there is the question of whether you should try to peer with Google, Netflix or other large users you can identify. There are several ways to peer with these types of companies:
    • Find a POP they are at. You might be able to find a Google POP or a data center somewhere that is closer than your Internet POP. You have to do the math to see if buying transport to Google or somebody else costs less than sending it on the usual path.
    • Peer at the Internet POP. The other way to peer is to go ahead and carry the traffic to the Internet POP, but once there, split your traffic and hand the traffic bound for somebody like Google directly to them rather than pay to send it through the Internet port. If Google is really 40% of your traffic, then this would reduce your port charges by as much as 40%, and that would be offset by whatever charges there are to split and route the traffic to Google at the POP.
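
The math behind the two options above can be sketched in a few lines of Python. Every price and traffic figure here is hypothetical and only stands in for your own transport, port and peering quotes, and the function names are mine.

```python
def savings_at_remote_pop(peered_mbps, usual_transport_per_mbps,
                          port_per_mbps, peer_transport_per_mbps):
    """Monthly savings from hauling peered traffic to a closer POP.

    The peered traffic avoids the usual transport and port charges but
    pays for transport to the peer's POP instead.
    """
    avoided = peered_mbps * (usual_transport_per_mbps + port_per_mbps)
    added = peered_mbps * peer_transport_per_mbps
    return avoided - added


def savings_at_internet_pop(total_mbps, peer_share, port_per_mbps,
                            split_cost_per_month):
    """Monthly savings from splitting off peered traffic at the Internet POP.

    Transport is unchanged; only the port charges on the peered share are
    avoided, offset by whatever it costs to split and route the traffic.
    """
    avoided = total_mbps * peer_share * port_per_mbps
    return avoided - split_cost_per_month


if __name__ == "__main__":
    # Hypothetical ISP: 10,000 Mbps of traffic, 40% of it bound for one peer.
    # Prices are made-up dollars per Mbps per month.
    print(savings_at_internet_pop(10_000, 0.40, 0.50, 1_000))
    print(savings_at_remote_pop(4_000, 1.00, 0.50, 1.25))
```

A positive result means the peering arrangement saves money each month; a negative one means the usual path is cheaper.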

I don’t think you have to be a giant ISP any more to take advantage of peering. Certainly make sure you are peeling off traffic between your own customers, and investigate local peering if you have a significant amount of local traffic. It just takes some investigation to see if you can do the more formal peering with companies like Google. Whether peering will save you money is mostly a matter of math, but I know of a number of carriers who are making peering work to their advantage. So do the math.