I know that when the public hears that their ISP is engaging in data mining, they assume this means that the ISP is reading their emails and monitoring their website viewing. ISPs do have the ability to do those things, although I don’t know of any that spy on their customers in that way.
I can certainly understand why data mining scares the average consumer. Supermarkets get you to sign up for their loyalty programs so that they know everything you buy from them. And I know I get a spooky feeling when I express an interest in some product in one place on the Internet and then see ads for that product pop up on Facebook or in my Google search results.
But data mining is a valuable tool and every ISP should be using it, just not in the same way that the supermarkets and Facebook do. In fact, we probably need to come up with a better term for the things I am suggesting below.
There are a number of tools around that let you look at data about customer usage and these tools allow an ISP to do the following:
- Spambots. There is a wide array of spambots and other malware on the web that can infect customers’ computers. The worst of these, from a network perspective, are spambots, which take over your customers’ computers and use them to send out spam. Most ISPs monitor email usage from their own domain and can spot when one of their users has been taken over by a spambot. But most customers these days do not use the email names and domains assigned by their ISP. Instead they use web email addresses such as Gmail or even the older AOL. And some spambots create new email addresses that the customer doesn’t even know about. So data mining can be used to look for customers with unusual upload traffic. No customer is going to be offended if you ask them whether they are uploading traffic 24 hours per day if in the process you help eliminate Trojan horses and spambots from their computer.
- Web servers. Most ISPs do not want a customer to use a residential account to run a commercial web server. A web server is a device used to run a website or service that drives a large amount of traffic as visitors download content from it. Such a website might be used for e-commerce, for example. But far too often web servers are used to run porn sites. ISPs are not against web servers, but they do expect people who operate them to buy the proper business-level service. A web server can be busy 24 hours per day, and that is generally not the level of usage intended for a shared residential product. Data mining can be used to identify web servers, and the customer can then be directed to a more appropriate (and appropriately priced) service.
- Data caps. Most ISPs have set some cap on the amount of data that a customer can download in a month. And these caps do not have to be small. I have one client that has a 2 terabyte cap each month for residential downloads. But there is no sense in having a data cap if you can’t actually measure how much bandwidth each customer is using. Data mining tools are the way to measure customers’ usage.
- File sharing. Most ISPs have terms of service that prohibit customers from sharing copyrighted materials with others. But realistically an ISP is not going to know what customers are sharing with each other unless it gets a complaint from a copyright holder. Many ISPs still like to get a handle on file sharing because such traffic can eat up a lot of system bandwidth. Data mining can help you identify customers who are probably running one of the common file sharing programs. An awful lot of file sharing is done by teenagers. I have clients who send out friendly reminders to customers who they think are file sharing that say something like: “We notice by your internet usage that you are probably running a file sharing program. We would just like to remind you that it is illegal to share copyrighted material and that there have been cases where copyright owners have gotten significant settlements by suing people who were sharing their property.” Such notices cut down on a lot of file sharing traffic as parents pressure kids into doing the right thing.
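For readers who want to see what two of these checks might look like in practice, here is a minimal Python sketch: one function flags customers who upload heavily in nearly every hour of the day (the spambot and web server pattern), and another totals monthly downloads against a cap. The record layout, the thresholds, and the function names are all my own assumptions for illustration and are not taken from any particular vendor’s tool. Note that it only looks at traffic volumes, never at what the traffic contains.

```python
from collections import defaultdict

# Hypothetical thresholds, chosen only for illustration; tune for a real network.
MIN_UPLOAD_MB_PER_HOUR = 50   # an hour counts as "active" at or above this upload volume
ACTIVE_HOURS_FOR_FLAG = 22    # flag customers active in at least this many hours of a day
MONTHLY_CAP_GB = 2000         # a 2 terabyte cap, expressed in decimal gigabytes


def flag_sustained_uploaders(records):
    """Return sorted customer ids that uploaded heavily in nearly every hour.

    `records` is an iterable of (customer_id, hour_of_day, upload_mb, download_mb)
    tuples covering a single day. This layout is an assumption, not a standard.
    """
    active_hours = defaultdict(set)
    for customer_id, hour, upload_mb, _download_mb in records:
        if upload_mb >= MIN_UPLOAD_MB_PER_HOUR:
            active_hours[customer_id].add(hour)
    return sorted(
        customer_id
        for customer_id, hours in active_hours.items()
        if len(hours) >= ACTIVE_HOURS_FOR_FLAG
    )


def total_download_gb(records):
    """Sum download volume per customer in gigabytes (1 GB = 1000 MB)."""
    totals = defaultdict(float)
    for customer_id, _hour, _upload_mb, download_mb in records:
        totals[customer_id] += download_mb / 1000
    return dict(totals)


def over_cap(totals_gb, cap_gb=MONTHLY_CAP_GB):
    """Return sorted customer ids whose total download exceeds the cap."""
    return sorted(c for c, gb in totals_gb.items() if gb > cap_gb)


# Made-up sample day: "alice" streams video, "bob" uploads around the clock.
sample_day = [("alice", h, 5, 400) for h in range(24)]
sample_day += [("bob", h, 120, 30) for h in range(24)]

print(flag_sustained_uploaders(sample_day))  # ['bob']
```

A customer flagged this way is only a candidate for a friendly phone call, not proof of anything; a nightly cloud backup can look similar, which is why I would check the pattern over several days before contacting anyone.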
So you should be data mining. But perhaps the things I have described could all better be classified as network management, a term that would not dismay your customers.