Is the Universal Translator Right Around the Corner?

We all love a race. There is something about seeing somebody strive to win that gets our blood stirring. But there is one big race going on right now that you have likely never heard of: the race to develop deep learning.

Deep learning is a specialized field of Artificial Intelligence research that seeks to teach computers to learn by structuring them to mimic the neurons in the neocortex, the portion of the brain that does our thinking. The field has been around for decades with limited success, and has needed faster computers to make any real headway.

The race is between a few firms that are working to be the best in the field. Microsoft and Google have gone back and forth with public announcements of breakthroughs, while other companies like Facebook and China’s Baidu are keeping their results quieter. It’s definitely a race, because each new breakthrough is measured against the competitors’ latest results.

The current public race deals with pattern recognition. The various teams are trying to get a computer to identify objects in a defined data set of millions of pictures. In September Google announced that it had the best results on this test, and just this month Microsoft said its computers not only beat Google but did better on the test than people do.

All of the companies involved readily admit that their results are still far below what a human can do naturally in the real world, but they have made huge strides. One of the best known demonstrations was done last summer by Google, which had its computer look at over 10 million YouTube videos and asked it to identify cats. The computer did twice as well as any previous test, which was particularly impressive since the Google team had not pre-defined for the computer what a cat was.

There are some deep learning techniques in IBM’s Watson, the computer that beat the best champions on Jeopardy. Watson is currently being groomed to help doctors make diagnoses, particularly in the third world where there is a huge shortage of doctors. IBM has also started selling time on the machine to anybody, and there is no telling all of the ways it is now being used.

Probably the most interesting current research is in teaching computers to learn on their own. This is done today by enabling multiple levels of ‘neurons’. The first layer learns a basic concept, like recognizing somebody speaking the letter S. Several first-layer outputs are fed to a second layer of neurons, which can then recognize more complex patterns. The process is repeated until the computer is able to recognize complex sounds.
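The layering described above can be sketched in a few lines of code. This is only a toy illustration with random, untrained weights; the layer sizes and the ReLU activation are my own assumptions for the sketch, not details from any of the teams mentioned here. It just shows how each layer's outputs become the next layer's inputs:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(inputs, weights, biases):
    # One layer of 'neurons': a weighted sum followed by a nonlinearity (ReLU).
    return np.maximum(0.0, inputs @ weights + biases)

# Toy dimensions: 64 raw inputs -> 16 simple features -> 4 complex patterns
x = rng.random(64)                                  # e.g. a tiny audio frame
w1, b1 = rng.normal(size=(64, 16)), np.zeros(16)
w2, b2 = rng.normal(size=(16, 4)), np.zeros(4)

simple_features = layer(x, w1, b1)                  # first layer: basic patterns
complex_patterns = layer(simple_features, w2, b2)   # second layer: combinations

print(complex_patterns.shape)
```

In a real system the weights would be learned from data rather than drawn at random, and there would be many more layers and neurons, but the stacking principle is the same.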

The computers being used for this research are already getting impressive. The Google computer that learned to recognize cats had a billion connections and was 70% better at recognizing objects than any prior computer. For now, breakthroughs in the field are being accomplished by applying brute computing force; the cat-test computer used over 16,000 computer processors, something that only a company like Google or Microsoft has available.

Computer scientists all agree that we are probably still a few decades away from a time when computers can actually learn and think on their own. We need a few more turns of Moore’s Law for the speed of computers to increase and the size of the processors to decrease. But that does not mean that there are not a lot of current real life applications that can benefit from the current generation of deep learning computers.

There are real-world benefits of the research today. For instance, Google has used this research to improve the speech recognition in Android smartphones. But what is even more exciting is where this research is headed for the future. Sergey Brin says that his ultimate goal is to build a benign version of HAL from 2001: A Space Odyssey. It’s likely to take multiple approaches in addition to deep learning to get to such a computer.

But long before a HAL-like computer we could have some very useful real-world applications from deep learning. For instance, computers could monitor complex machines like electric generators and predict problems before they occur. They could be used to monitor traffic patterns and change traffic lights in real time to eliminate traffic jams. They could be used to enable self-driving cars. They could produce a universal translator that lets people with different languages converse in real time. In fact, in October 2012, Microsoft researcher Rick Rashid gave a lecture in China. The deep learning computer transcribed his spoken lecture into written text with a 7% error rate. It then translated it into Chinese and spoke to the crowd while simulating his voice. It seems like with deep learning we are not far away from having that universal translator promised to us by science fiction.

Deceptive Billing Practices

In case you haven’t looked closely at your cable bill lately, there are likely a number of mysterious charges on it that look to be for something other than cable TV service. There was a day not too many years ago when a cable bill was simple. The bill would list the cable package you purchased as well as some sort of local franchise tax. There also might have been some line-item purchases if you bought pay-per-view movies or watched wrestling or other pay-per-view events.

But cable bills have gotten a lot more complicated because cable companies have been slyly introducing new charges on their bills in an effort to disguise the actual price of their basic cable packages. Here are a few of the charges I have heard about or seen on recent cable bills:

  • Broadcast TV Fee. This is a new fee where cable companies put some of the increases they are having to pay for access to the broadcast networks of ABC, CBS, Fox and NBC. You can sympathize some with the cable operators on this fee, since a decade ago cable companies got to carry these networks for free. But the network owners finally woke up to the fact that they could charge retransmission fees, and since then the rates for carrying these networks have grown to roughly $2 per network, per customer, per month. Still, these fees ought to be part of basic cable, which is the smallest package that includes the core channels and that must then be carried with every other cable package.
  • Sports Programming Fees. It’s debatable whether sports programming or local retransmission fees have grown the most over the last decade. Certainly there was a day when there was only ESPN and a handful of other minor sports channels. But now cable systems are packed full of sports channels and each of them raises rates significantly every year to pass on the fees they pay to sports leagues to carry their content. The problem with starting a new fee to cover some of the increases in sports programming is that it clearly foists the cost of sports programming on everybody, when surveys show that a majority of customers are not very interested in sports outside of maybe the NFL.
  • Public Access Fee. In many cities the cable companies are required to carry channels that cover local government meetings and other local events. Other than having to reserve a slot on the cable system there is normally not much actual cost associated with these channels. So it’s incredibly cynical for a cable company to invent a fee to charge people to watch a channel that the cable company has agreed to carry, and for which they have very little cost.
  • Regulatory Recovery Fee. This one has me scratching my head since most cable companies are lightly regulated and pay very few taxes other than franchise fees, which they already put directly onto people’s bills. This fee seems to be pure deception to make people think they are paying taxes, when instead this is a fee that the cable company pockets.

Additionally, cable companies have recently really jacked up the cost of both settop boxes and cable modems. Interestingly, the actual cost of a settop box, at $80 – $100, has dropped over the last decade and continues to drop. It’s the same with cable modems. It’s hard to justify paying a monthly fee of up to $9 for a cable modem box that probably costs $80. Customers can theoretically opt out of both of these charges, but the large cable companies make it really hard to do so.

The idea of misnamed fees has been around for a while and started with telephone service. Starting back in 1984 the FCC allowed the telcos to migrate some of the charges that they used to bill to long distance companies for using the local loop into a fee assessed directly on customers. Since then, telcos have had a separate fee called a Subscriber Line Charge, an Access Fee, or sometimes an FCC Fee on their bills. But this was never a tax, as most customers assume, and the telco simply pockets this money as part of local rates. When the cable companies got into the voice business they largely copied this same fee, even though they never had to make the shift of access revenues that created the charge. The FCC ought to do away with this fee entirely and require it be added to local rates where it belongs.

I think perhaps one of the reasons that the cable companies are so against Title II regulation is that these kinds of billing practices would then come under FCC scrutiny. It’s hard to think of these various fees as anything other than outright deception and fraud. The companies that charge them want to be able to say in advertising that their rates are competitive, when in fact, by the time you add on the various ‘fees’, the actual cost of their products is much higher than what they advertise. I’m also surprised that the FTC has not gone after these fees, since they are clearly intended to deceive the general public about what they are buying.

You might sympathize with the cable companies a little in that they have been bombarded year after year with huge increases in the cost of programming. But my sympathy for them evaporates once I look at the facts. When their programming costs go up each year they always raise their rates considerably more than the increased cost of programming and they use rate increases to increase their profit margin. Additionally, for the largest cable companies, part of those rate increases are for programming they own, such as the local sports networks.

We all know that the cost of cable is going to drive a lot of households to find a cheaper alternative, and when that happens the cable companies have to shoulder a lot of the blame. People might not understand the line items on their bill, but they know that the size of the check they write each year gets a lot bigger, and that is all that really matters.

The Battle of the Routers

There are several simultaneous forces tugging at companies like Cisco that make network routers. Cloud providers like Amazon and CloudFlare are successfully luring large businesses to move their IT functions from local routers to large data centers. Meanwhile, other companies like Facebook are pushing small, cheap routers using open source software. But Cisco is fighting back with its push for fog computing, which would place smaller, function-specific routers near the source of data at the edge.

Cloud Computing.

Companies like Amazon and CloudFlare have been very successful at luring companies to move their IT functions into the cloud. It’s incredibly expensive for small and medium companies to afford an IT staff or outsourced IT consultants, and the cloud is reducing both hardware and people costs for companies. CloudFlare alone last year announced that it was adding 5,000 new business customers per day to its cloud services.

There are several trends driving this shift to data centers. First, the cloud companies have been able to emulate with software what formerly took expensive routers at a customer’s location. This means that companies can get the same functions done for a fraction of the cost of doing IT in-house. The cloud companies are using simpler, cheaper routers that offer brute computing power and are becoming more energy efficient. For example, Amazon designs all of the routers used in its data centers and doesn’t buy boxes from the traditional router manufacturers.

Businesses are also using this shift as an opportunity to unbundle from the traditional large software packages. Businesses historically have signed up for a suite of software from somebody like Microsoft or Oracle and would live with whatever those companies offered. But today there is a mountain of specialty software that outperforms the big software packages for specific functions like sales or accounting. Both the hardware and the new software are easier to use at the big data centers and companies no longer need to have staff or consultants who are Cisco certified to sit between users and the network.

Cheap Servers with Open Source Software.

Not every company wants to use the cloud and Cisco has new competition for businesses that want to keep local servers. Just during this last week both Facebook and HP announced that they are going to start marketing their cheaper routers to enterprise customers. Like most of the companies today with huge data centers, Facebook has developed its own hardware that is far cheaper than traditional routers. These cheaper routers are brute-force computers stripped of everything extraneous and that have all of their functionality defined by free open source software; customers are able to run any software they want. HP’s new router is an open source Linux-based router from their long-time partner Accton.

Cisco and the other router manufacturers today sell a bundled package of hardware and software and Facebook’s goal is to break the bundle. Traditional routers are not only more expensive than the new generation of equipment, but because of the bundle there is an ongoing ‘maintenance fee’ for keeping the router software current. This fee runs as much as 20% of the cost of the original hardware annually. Companies feel like they are paying for traditional routers over and over again, and to some extent they are.

These are the same kinds of fees that were common in the telecom industry historically with companies like Nortel and AT&T / Lucent. Those companies made far more money off of maintenance after the sale than they did from the original sales. But when hungry new competitors came along with a cheaper pricing model, the profits of those two companies collapsed over a few years and brought down the two largest companies in the telecom space.

Fog Computing.

Cisco is fighting back by pushing an idea called fog computing. This means having limited-function routers on the edge of the network to avoid having to ship all data to some remote cloud. The fog computing concept is that most of the data that will be collected by the Internet of Things will not necessarily need to be sent to a central depository for processing.

As an example, a factory might have dozens of industrial robots, and there will be monitors that constantly monitor them to spot troubles before they happen. The local fog computing routers would process a mountain of data over time, but would only communicate with a central hub when they sense some change in operations. With fog computing the local routers would process data for the one very specific purpose of spotting problems, which would save the factory-owner from paying for terabits of data transmission, while still getting the advantage of being connected to a cloud.
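As a rough sketch of that idea, the filtering logic on a fog node might look something like the following. The window size, threshold, and sample values are all invented for illustration; the point is only that routine readings are processed and discarded locally, and just the anomalies cross the network:

```python
def fog_filter(readings, window=5, threshold=2.0):
    """Yield only readings that deviate sharply from the recent average."""
    recent = []
    for r in readings:
        if len(recent) == window:
            avg = sum(recent) / window
            if abs(r - avg) > threshold:
                yield r          # anomaly: worth sending to the central hub
            recent.pop(0)        # slide the window forward
        recent.append(r)

# Steady sensor readings with one spike at 15.7
samples = [10.1, 10.0, 9.9, 10.2, 10.0, 10.1, 15.7, 10.0]
alerts = list(fog_filter(samples))
print(alerts)  # [15.7]
```

Eight readings come in, one message goes out; scale that up to terabits of raw monitoring data and the bandwidth savings the paragraph above describes become clear.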

Fog computing also makes sense for applications that need instantaneous feedback, such as with an electric smart grid. When something starts going wrong in an electric grid, taking action immediately can save cascading failures, and microseconds can make a difference. Fog computing also makes sense for applications where the local device isn’t connected to the cloud 100% of the time, such as with a smart car or a monitor on a locomotive.

Leave it to Cisco to find a whole new application for boxes in a market that is otherwise attacking the boxes it has historically built. Fog computing routers will mostly be smaller and cheaper than the historical Cisco products, but there is going to be a need for a whole lot of them when the IoT becomes pervasive.

If You Think You Have Broadband, You Might be Wrong

The FCC has published the following map that shows which parts of the country they think have 25 Mbps broadband available. That is the new download speed that the FCC recently set as the definition of broadband. On the map, the orange and yellow places have access to the new broadband speed and the blue areas do not. What strikes you immediately is that the vast majority of the country looks blue on the map.

The first thing I did, which is probably the same thing you will do, is to look at my own county. I live in Charlotte County, Florida. The map shows that my town of Punta Gorda has broadband, and we do. I have options up to 110 Mbps with Comcast and I think up to 45 Mbps from CenturyLink (not sure of the exact speed they can actually deliver). I bought a 50 Mbps cable modem from Comcast, and they deliver the speed I purchased.

Like a lot of Florida, most of the people in my county live close to the water. And for the most part the populated areas have access to 25 Mbps. But there are three urban areas in the county that don’t: parts of Charlotte Beach, parts of Harbor View, and an area called Burnt Store.

I find the map of interest because when I moved here a little over a year ago I considered buying in Burnt Store. The area has many nice houses on large lots up to five acres. I never got enough interest in any particular house there to consider buying, but if I had, I would not have bought once I found there was no fast broadband. I don’t think I am unusual in having fast Internet as one of the requirements I want at a new home. One has to think that in today’s world that housing prices will become depressed in areas without adequate Internet, particularly if they are close to an area that has it.

The other thing that is obvious on the map of my county is that the rural areas here do not have adequate broadband, much like most rural areas in the country. By eyeball estimate it looks like perhaps 70% of my county, by area, does not have broadband as defined by the FCC. Some of that area is farms, but there are also a lot of large homes and horse ranches in those areas. The map tells me that in a county with 161,000 people, over 10,000 don’t have broadband. Our percentage of broadband coverage puts us far ahead of most of the rest of the country, although the people without broadband here probably don’t feel too lucky.

I contrast the coasts of Florida by looking at the Midwest. In places like Nebraska it looks like nobody outside of decent sized towns has broadband. There are numerous entire counties in Nebraska where nobody has access to 25 Mbps broadband. And that is true throughout huge swaths of the Midwest and West.

There are pockets of broadband that stick out on the map. For example, there is a large yellow area in rural Washington State. This is due to numerous Public Utility Districts, which are county-wide municipal electric systems, which have built fiber networks. What is extraordinary about their story is that by Washington law they are not allowed to offer retail services, and instead offer wholesale access to their networks to retail ISPs. It’s a hard business plan to make work, and still a significant amount of fiber has been built in the area.

And even though much of the map is blue, one thing to keep in mind is that the map is overly optimistic and overstates the availability of 25 Mbps broadband. That’s because the database supporting this map comes from the National Broadband Map, and the data in that map is pretty unreliable. The speeds shown are self-reported by the carriers who sell broadband, and they frequently overstate where they have coverage at various speeds.

Let’s use the example of rural DSL since the delivered speed of that technology drops rapidly with distance. If a telco offers 25 Mbps DSL in a small rural town, by the time that DSL travels even a mile out of town it is going to be at speeds significantly lower than 25 Mbps. And by 2–3 miles out of town it will crawl at a few Mbps at best or not even work at all. I have helped people map DSL coverage areas by knocking on doors and the actual coverage of DSL speeds around towns looks very different than what is shown on this map.
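To put rough numbers on that drop-off, here is a toy lookup of DSL speed versus loop length. These figures are illustrative guesses in the spirit of the paragraph above, not measurements from any real carrier or technology:

```python
# Illustrative only: ballpark downstream DSL speeds (Mbps) by distance
# from the hub. Real speeds vary by technology and copper-line quality.
speeds_by_miles = [
    (0.5, 20.0),  # near the hub: close to the advertised speed
    (1.0, 12.0),
    (2.0, 5.0),
    (3.0, 1.5),   # edge of the service area: a few Mbps at best
]

def estimated_speed(miles):
    """Return the speed for the first distance tier at or beyond `miles`."""
    for distance, speed in speeds_by_miles:
        if miles <= distance:
            return speed
    return 0.0  # beyond the last tier: effectively no service

print(estimated_speed(0.3))  # 20.0 -- in town, near the hub
print(estimated_speed(2.5))  # 1.5  -- a subscriber well out of town
```

A carrier reporting "20 Mbps" for this whole service area would be telling the truth only for the customers in town, which is exactly the mapping problem described above.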

Many of the telcos claim the advertised speed of their DSL for the whole area where it reaches. They probably can deliver the advertised speeds at the center of the network near to the DSL hub (even though sometimes this also seems to be an exaggeration). But the data supplied to the National Broadband Map might show the same full-speed DSL miles away from the hub, when in fact the people at the end of the DSL service area might be getting DSL speeds that are barely above dial-up.

So if this map was accurate, it would show a greater number of people who don’t have 25 Mbps broadband available. These people live within a few miles of a town, but that means they are usually outside the cable TV network area and a few miles or more away from a DSL hub. There must be many millions of people that can’t get this speed, in contradiction to the map.

But the map has some things right, like when it shows numerous counties in the country where not even one household can get 25 Mbps. That is something I can readily believe.

The FTC and Technology

Last week I wrote about how the Federal Trade Commission was going to start watching the Internet of Things. I will admit that this is maybe only the second or third time in my career that I can recall the FTC being involved in anything related to telecom. So I did some digging and I think we are going to be hearing about them a lot more. The FTC is turning into one of the primary watchdogs of technology.

The FTC was created by President Woodrow Wilson in 1914 to fight against big trusts. In those days large corporations like Standard Oil and American Tobacco held monopoly power in their industries. The Sherman Act was passed as a way to battle the largest monopolies, but Congress wanted a second mechanism to control the worst practices of all corporations. The FTC was created 100 years ago to protect consumers against the practices of large corporations.

The FTC had its powers expanded in 1938 when Congress gave it explicit authority to combat “unfair or deceptive acts or practices”. Since then the agency has become increasingly active in protecting the public against unfair trade practices.

It is not surprising to see the FTC getting involved with technology since it is becoming the primary way that companies interface with people. The FTC has been engaged for years in a few areas that involve the telecom industry. For instance, they have been the watchdog for years for issues like deceptive advertising, poor billing practices, and violations of customer privacy.

As an example, there have been a number of FTC actions over the years involving AT&T. Not that I particularly want to single out AT&T, because the FTC has been engaged with all of the large carriers over the years. Just last year the FTC got AT&T to refund $80 million to wireless customers who had been crammed with fraudulent third-party charges. In 2009, the FTC faulted the company for denying phones to people with poor credit without having explained the policy to the public. And now the FTC is going after AT&T for fraudulent advertising, since its unlimited mobile data plans are not actually unlimited.

One area of FTC focus for the last few years has been the security of customer data. For example, they have fined a number of companies that had security breaches that released customer credit card and other personal information if those companies had not taken reasonable precautions to protect the data.

While companies sometimes fight the FTC, the more normal response is for the agency and a company to come to a mutually acceptable change in behavior through a consent decree. Following are a few cases related to our industry that were not amicably resolved and that instead resulted in suits by the FTC to stop bad corporate behavior:

  • Amazon. Last year the FTC sued Amazon to get them to stop the practice where children could rack up huge bills on cell phones by purchasing add-ons for computer games without parental approval. There were even game apps for pre-school age kids who clearly cannot yet read that allowed a player to buy extra features of the game by hitting a button.
  • Snapchat. Last year the FTC sued Snapchat because they told customers that their data on the network was private and protected, while it wasn’t.
  • Dish Network. In 2012 the FTC sued Dish Network for making telemarketing calls in violation of the Do Not Call rules.
  • Robocalling. In 2009 the FTC sued to stop numerous companies who were using robocalls to sell fraudulent products.
  • Data Brokers. The FTC sued LeapLab of Arizona for selling consumer data that included details like bank account numbers.
  • Spam. The FTC took legal steps to shut down Triple Fiber Networks, which hosted huge quantities of spam emails.
  • Intel. In 2009 the FTC sued Intel for using its monopoly power to artificially inflate the cost of computer chips.

As privacy and data security become even more important, we will probably see the FTC become very active in our industry. Interestingly, most of the FTC’s work is done quietly and without press. The agency contacts companies against which there are multiple public complaints, generally investigates the complaints, and tries to get the companies to change their bad behavior. Most companies agree to make changes. But the FTC has the ability to levy large fines and will do so for companies that repeat bad behavior or violate a prior consent decree.

What Does a Gigabit Get Us?

This is the sort of blog I really like because it talks about the future. Last fall the Pew Research Center asked a number of industry experts what ubiquitous gigabit bandwidth would do for society. Since then there have been numerous articles written about the changes that might come with faster bandwidth. Interestingly, these are not distant Star Trek fantasies; industry experts are expecting these ideas to manifest in a decade or so. Following are some of the more interesting ideas that I’ve seen:

Enabling Hermits Everywhere. A large number of experts believe that one of the first and most practical aspects of gigabit bandwidth will be telepresence, which means the ability to meet with people holographically and feel like you are in the same room. This would largely eliminate business travel because people could meet together at any time as long as they are all connected with gigabit bandwidth.

This same technology also means you could sit for an evening with a remote family member, meet with a doctor, get a piano lesson, or do almost anything that involves meeting with somebody else without needing physical interaction. This will enable even the biggest hermits among us to interact from the safety of our living rooms. (But it will also change the way we dress when we work from home!)

I have read predictions that this is going to mean that we do away with emails, phone calls, and other methods of communications, but I don’t buy that. It’s human nature to not always want to communicate in real time with people and I think telepresence is going to make us very careful about who we let into our lives. I suspect we will become very selective about who we will share our presence with and that we won’t let salespeople and strangers into our telepresence.

Holodecks? Big bandwidth ought to bring about new forms of entertainment. If we can sit holographically in a meeting, we can also holographically attend a concert, take a ride on a gondola in Venice, or sit on a beach in the Caribbean. It also means a huge leap forward in gaming, where we can become characters within a game rather than controlling characters from without. And I am guessing that the sex industry will probably be one of the earliest to monetize these abilities.

The Ever-present Infosphere. Huge bandwidth coupled with the cloud and supercomputers means that we can have a computerized world with us anywhere there is bandwidth. This will eventually do away with computers, smartphones and other devices, since the infosphere will always be there. We will have multiple screens and holographic projectors in the home and some future discreet wearable when away from home. We will each have a useful personal assistant that will help us navigate a gigabit world.

The Internet of Things Becomes Useful. Rather than just having a smart thermostat and a door that we can unlock with our smartphones, we will be surrounded by devices that tailor themselves to our individual needs to create the environment we want. We will be constantly medically monitored and will be far healthier as a result.

Just-in-time Learning. With the infosphere always around us we will be able to access the facts we need when we need them. This will revolutionize education because we will have access to all of the ‘how-to’ manuals in the world and we will have a personal assistant to use them. This makes a lot of traditional education obsolete because everybody will be able to learn at their own pace. There might not be home-schooling, but rather personal assistant schooling. Obviously there will still need to be traditional types of training for specialties and physical skills. But the idea of needing to sit through months-long classes will become obsolete for most topics. This also will make education ubiquitous and a motivated person from anywhere on the planet and from any walk of life can learn whatever they want.

Always Monitored. Privacy will become a major issue when everything we do is being monitored. This can go one of two ways: either we will all adapt to living in a monitored society, or there will be an outcry for a technological solution that guarantees our privacy. How this one issue is resolved will have a huge impact on everything else we do.

Something Unexpected. Many experts predict that ubiquitous bandwidth will probably not bring us only the things we expect, but rather things that we have not yet imagined. Who, just a decade ago, really understood the impact of smartphones, social media, and the other applications that are forefront in our lives today? It’s likely that many of the things listed above will happen, but that the most important future developments aren’t even on that list.

The Digital Divide Becomes Critical. Those without bandwidth are quickly going to be left out of the mainstream of the new society that is going to rely on gigabit tools for daily life. This will probably drive communities to find ways to get fiber at any cost, or else look at being left far behind. But we also might see some people drop out of the gigabit world and have segments of the population who refuse to partake in the bandwidth-driven future. One also has to wonder how we will cope when we lose the infosphere due to hurricanes or other acts that kill our connectivity for an extended period of time. Will we become too dependent upon the infosphere to function well without it?


Non-Human Traffic Dominates the Web

Incapsula did its third annual survey of web traffic to determine how much is human generated versus machine generated. From August through September 2014, it surveyed over 15 billion visits to over 20,000 web sites scattered around the world.

What they found will probably surprise the average person (but not any web administrator). For the third year in a row there was more traffic generated on the web by bots than was generated by people. There are both good and bad bots and they looked at each transaction to determine the nature of the bot. In 2014, 44% of all web traffic was generated by humans, 29% from bad bots and 27% by good bots.

So what are bots exactly? There are many examples of good bots. Probably the best known is the Google web crawler that reads through web sites to build the Google search index. All search engines have similar bots, but Incapsula says the Google bot is unique in that it seems to crawl through everything: big web sites, small web sites, and even dead web sites. This certainly accounts for why you can find things in Google search that don't turn up anywhere else.

Another example of a good bot can be seen when you go to a shopping site. If you’ve ever shopped for electronics you will find a bunch of these sites. They list all of the places on the web that are selling a given component and let you compare prices. These sites are built by bots that crawl through the electronics sellers to constantly grab any updates. These sites do this to earn sales commissions when people choose to buy something through their site.

Another big category of good bots are RSS readers. RSS stands for Really Simple Syndication, and I used the technology for years. It was a way to know if somebody wrote a new blog post or if a news site published an article on a topic of interest to you; the RSS bot would notify you when it found something you were looking for. There was a 10% drop in good bot traffic from 2013 to 2014 due to the phase-out of RSS feeds. Google Reader was the biggest source of such feeds and it has since been discontinued.
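The polling behavior an RSS reader bot performs can be sketched with Python's standard library. The feed content and URLs below are made up for illustration; a real reader would fetch the XML from a site's feed URL on a schedule and remember which items it had already seen.

```python
import xml.etree.ElementTree as ET

# A minimal RSS 2.0 feed, inlined for the sketch; a real bot would
# download this XML from the blog's feed URL on each poll.
FEED_XML = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item><title>New post on fiber</title><link>http://example.com/1</link></item>
    <item><title>Thoughts on bots</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""

def new_items(feed_xml, seen_links):
    """Return (title, link) pairs for items not seen on a previous poll."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title")
        link = item.findtext("link")
        if link not in seen_links:
            items.append((title, link))
    return items

# First poll: everything is new; later polls report only fresh posts.
seen = set()
fresh = new_items(FEED_XML, seen)
seen.update(link for _, link in fresh)
print(fresh)
```

Subsequent polls with the updated `seen` set return only newly published items, which is what made RSS useful for following many sites at once.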

What is scary is the ever-growing volume of bad bots. These are just what you would imagine, and are crawling around the web trying to do damage.

The fastest growing class of bad bots are impersonator bots: malware that tries to look like something else in order to make it onto a web site or computer. These include DDoS (denial of service) bots disguised to look like browser requests, bots disguised as proxy server requests, and bots that mimic search engine crawls. These are really nasty pieces of malware used for things like data theft, site hijacking, and denial of service attacks. These bots go after all types of web sites, hoping to then infect site visitors.

Probably the biggest volume of bad bot traffic comes from scrapers. These are bots that are designed to grab certain kinds of information. The good bot listed above that compares electronics prices is a kind of web scraper. But the malicious web scrapers look to steal things of value such as passwords, email addresses, phone numbers, credit card numbers, or other kinds of data that can then help hackers better attack somebody.

Then there is the familiar category of spamware, which is used for all sorts of malicious purposes like content theft, phishing, and identity theft.

The final category of bad bots are categorized as hacking tools; these are generally aimed at servers rather than computers. Hacking tools are used to crack into servers to steal corporate data, to steal credit card information, or to crash the server.

Incapsula found that bad bots attack web sites of all kinds and that there are proportionately more bad bots trying to crack small web sites than large ones. This is probably because the vast majority of web sites have fewer than 1,000 visitors per day and are often much less protected than larger corporate sites.

What does this all mean for an ISP? ISPs use tools to try to intercept or deflect as much of the bad bot traffic as possible. They try to keep malware off customers' computers since one of the biggest threats to their networks is an attack from within; accumulated malware on customers' computers can play havoc inside the network and inside firewalls.

There are companies like Incapsula that sell tools for ISPs to monitor and block bad bot traffic. But the volume of bot attacks is so large these days that it’s often a losing game. For example, Incapsula says that during a denial of service attack, when large volumes of bots attack the same site simultaneously, as many as 30% of the malware attached to the attacking bots gets through any normal malware protection schemes.

To some degree the bad guys are winning, and if they get far enough ahead it could be a threat to the web. The worst of the bad bots are written by a handful of very talented hackers and the industry is currently stepping up pursuit of these hackers as a strategy to cut off bot attacks at their sources.

Verizon’s Strategy

The news coming out of Verizon lately is really interesting and set me to musing about their long-term strategy.

First, they are selling off $10 billion in landlines in Texas, California, and Florida to Frontier. These properties include 3.7 million voice lines, 2.2 million high-speed data customers, and around 1.6 million FiOS customers. This divests their FiOS service everywhere except the east coast. The 1.6 million customers represent a very significant 24% of the reported 6.6 million FiOS customers at the end of 2014. A few weeks ago Verizon had also announced an end to any further expansion of FiOS.

It’s been clear for years that Verizon has wanted out of the copper business. They first sold off large portions of New England to Fairpoint. Then in 2010 they sold a huge swath of lines in fourteen states to Frontier including the whole state of West Virginia. And now comes this sale. It’s starting to look like Verizon doesn’t want to be in the landline business at all, perhaps not even in the fiber business.

After all, this latest selloff was done to finance another big chunk of wireless spectrum. When Verizon CEO Lowell McAdam announced the landline sale he said that the company would be focusing on its 108 million wireless customers. One can see the emphasis on wireless just by looking at their annual reports: one has to go many pages deep to find a discussion of the landline business, and most of the report talks about the wireless business.

McAdam said that the company was going to put its emphasis on selling data and video to LTE customers. McAdam repeated a past announcement that Verizon would be rolling out an online video package later this summer and he hinted that the service would include a significant number of networks when launched. They plan to sell the new video packages to both their wireless customers and to anybody online.

I find several things about Verizon’s decisions to be very interesting:

  • One has to wonder how Verizon will deliver a lot of video programming through the cellular network. Certainly LTE has enough speed to deliver video, and most urban LTE network tests come in between 10 Mbps and 20 Mbps. But the issue in the cellular network is not speed, but overall capacity from a given cell site. Perhaps some of the new spectrum they are buying will be used strictly for this purpose to beef up capacity. But I find it a bit ironic that Verizon would now be pushing such a data-heavy network application when just a few short years ago they claimed that network congestion was the reason they needed to impose skimpy monthly data caps.
  • You also have to wonder how they are going to reconcile this product with their existing data caps. An hour of video streaming can use a gigabyte of data, so it won't take very much video viewing to hit the existing caps. Verizon stopped selling unlimited data plans in 2012. They throttle the top 5% of unlimited 3G users and threatened last fall to do the same to LTE customers, but backed down after a lot of pushback. The majority of their customers have low caps that are not going to match up well with a cellular video product.
  • Perhaps they were hoping to exempt their own video product from data caps, but that would violate the impending new net neutrality rules which won’t allow favoring your own product over those from other video providers. So perhaps Verizon is going to go back to unlimited data plans or at least raise the caps significantly. But doing that will allow Netflix and others to compete on cellphones.
  • One also has to wonder how they will keep up with the inevitable trend for bigger bandwidth video. Will wireless networks really be able to deliver 4k video and the even bigger 8k bandwidth products that will inevitably follow?
  • In general, one has to be curious about their obvious desire to be only a wireless company. The general trend in the cellular industry is towards lower prices. I know I was able to cut my own cellphone plan price almost in half this past year, and the trend is for prices to keep going lower. Certainly having companies like Google enter the market is going to push prices lower. Also, Cablevision announced a cellphone plan that mostly uses WiFi and that will only dip into the cellular network as a fallback. Comcast and others are considering this and it could produce significant competition for Verizon.
  • This announcement also tells me that they see profits in selling over-the-top video. It’s well known that nobody makes much money selling the huge traditional cable lineups, but Verizon obviously sees better margins in selling smaller packages of programming. But will margins remain good for online video if a lot of companies jump into that business?
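The data-cap arithmetic in the second bullet is easy to make concrete. The figures below are assumptions for the sketch: roughly one gigabyte per hour of standard-definition streaming and a 2 GB monthly cap were typical around this time.

```python
# Rough illustration of how quickly streaming video exhausts a small
# cellular data cap. Both numbers are assumed for the sketch:
# ~1 GB per hour of streaming and a 2 GB monthly plan.
GB_PER_HOUR = 1.0
monthly_cap_gb = 2.0

hours_until_cap = monthly_cap_gb / GB_PER_HOUR
print(f"{hours_until_cap:.0f} hours of streaming hits a {monthly_cap_gb:.0f} GB cap")
```

Two hours of video per month is far below how much people actually watch, which is why the caps and a video product seem hard to reconcile.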

I scratch my head over selling off FiOS. Verizon reports an overall 41% market penetration for its data product on FiOS networks. Data has such a high profit margin that it’s hard to think that FiOS is not extremely profitable for them. The trend has been for the amount of data used by households to double every three years, and one doesn’t have to project that trend forward very far to see that future bandwidth needs are only going to be met by fiber or by significantly upgraded cable networks. Landline networks today deliver virtually all of the bandwidth that people use. There are now more cellular data dips than landline data dips, but people rely on their landline connection for any application that uses significant bandwidth.
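The doubling rule of thumb above can be projected forward with a one-line formula. The 25 Mbps starting point is an assumption (it matches the FCC's broadband definition at the time); only the doubling-every-three-years trend comes from the text.

```python
# Project household bandwidth demand under the rule of thumb that
# usage doubles every three years. The 25 Mbps starting figure is
# an assumption for illustration.
def projected_demand(start_mbps, years, doubling_period=3):
    return start_mbps * 2 ** (years / doubling_period)

for years in (0, 6, 12, 18):
    print(years, "years:", round(projected_demand(25, years)), "Mbps")
```

Under this trend a 25 Mbps household grows to roughly 400 Mbps in twelve years and 1.6 Gbps in eighteen, which is the kind of demand only fiber or heavily upgraded cable networks can meet.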

Verizon was a market leader in getting into the fiber business. FiOS was a bold move at the time, and it's another bold move to essentially walk away from fiber and concentrate on wireless. They obviously think that wireless has a better future than wireline. But since they are already at the top of the pile in cellular, one has to wonder where they see future growth. One has to admit that they have been right a lot in the past, and I guess we'll have to wait a while to see if this is the right move.

What is Quantum Computing?

A week ago one of my blogs mentioned a new way to transmit the results of quantum computing. I've been following quantum computing for a few years and that announcement led me to take a fresh look at the latest in the field.

We all have a basic understanding of how our regular computers work. From the smallest chip in a fitness wearable up to the fastest supercomputer, our computers are Turing machines that convert data into bits represented by either a 1 or a 0 and then process data linearly through algorithms. An algorithm can be something simple like adding a column of numbers on a spreadsheet or something complex like building a model to predict tomorrow’s weather.

Quantum computing takes advantage of a property found in subatomic particles. Physicists have found that some particles exhibit a property called superposition, meaning that they exist in more than one state simultaneously, such as an electron occupying two different energy levels at once. Quantum computing mimics this subatomic world by creating what are called qubits, which can exist as both a 1 and a 0 at the same time. This is significant because a single qubit can perform two calculations at once. More importantly, qubits working together act exponentially: two qubits can perform four calculations at once, three can perform eight, and a thousand qubits can perform . . . a lot of calculations at the same time.
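The exponential scaling described above can be illustrated with a tiny sketch: an n-qubit register is described by 2^n complex amplitudes, one per basis state, and putting every qubit into superposition gives equal weight to all of them. This is a classical simulation for illustration only, not how a real quantum chip works.

```python
import math

def uniform_superposition(n_qubits):
    """Equal-weight state over all 2**n basis states of an n-qubit register."""
    dim = 2 ** n_qubits          # number of simultaneous basis states
    amp = 1 / math.sqrt(dim)     # equal amplitude, normalized
    return [amp] * dim

state = uniform_superposition(3)
print(len(state))                # 8 basis states for 3 qubits
```

Each added qubit doubles the list, which is why simulating even a few dozen qubits classically is hopeless and why a thousand real qubits would be so powerful.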

This is intriguing to computer scientists because there are a number of challenges that need more computing power than can be supplied by even the fastest Turing computers. This would include such things as building a model that will more accurately predict the weather and long-term climate change. Or it might involve building a model that accurately mimics the actions of the human brain in real time.

Quantum computers should also be useful when looking at natural processes that have some quantum mechanical characteristics. This would involve trying to predict complex chemical reactions when designing and testing new drugs, or designing nanoparticle processes that operate at an atomic level.

Quantum computers also should be good at processes that require trying huge numbers of guesses to find a solution when each guess has an equal chance of being correct. An example is cracking a password. A quantum computer can search the possible combinations far faster than a normal computer, which would toil away for hours plugging in one choice after another in a linear fashion.
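The gap between the two approaches can be put in numbers. Grover's quantum search algorithm needs on the order of the square root of the number of possibilities, versus checking them one by one classically; the 8-character lowercase password is an assumed example.

```python
import math

# Classical brute force must, in the worst case, try every candidate,
# while Grover's algorithm needs on the order of sqrt(N) steps.
N = 26 ** 8                      # possible 8-character lowercase passwords
classical_worst = N
grover_steps = math.isqrt(N)     # ~sqrt(N), the Grover query count

print(f"{classical_worst:,} classical guesses vs ~{grover_steps:,} quantum steps")
```

That is roughly 209 billion guesses collapsing to under half a million steps, which is why code-breaking agencies are so interested in the technology.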

Quantum computing is in its infancy with major breakthroughs coming only a few years ago. Scientists at Yale created the first qubit-based quantum computing processor in 2009. Since then there have been a few very basic quantum computers built that demonstrate the potential of the technology. For instance, in 2013 Google launched the Quantum Artificial Intelligence Lab, hosted by NASA’s Ames Research Center, using a 512 qubit computer built by D-Wave.

For the most part, the field is still exploring the basic building blocks needed to build larger quantum computers. There is a lot of research looking at the best materials to use to produce reliable quantum chips and the best techniques for both programming and deciphering the results of quantum computing.

There are numerous universities and companies around the world engaged in this basic research. Recently, Google hired John Martinis and his team from the University of California at Santa Barbara; he is considered one of the foremost experts in the field. Martinis is still associated with UCSB but decided that joining Google gave him the best resources for his research.

The NSA is also working on quantum computers that will be able to crack any codes or encryption. Edward Snowden released documents that show that the agency has two different initiatives going to produce the ultimate code-breaking machine.

And there are others in the field. IBM, Microsoft, and Facebook all are doing computer research that includes quantum computing techniques. It’s possible that quantum computing is a dead end that won’t produce results that can’t be obtained by very fast Turing computers. But the theory and early prototypes show that there is a huge amount of potential for the new technology.

Quantum computers are unlikely to ever make it into common use and will probably be limited to industry, universities, or the government. A quantum computer must be isolated from external influences and operate in a shielded environment. This is due to what is called quantum decoherence, which means that just 'looking' at a quantum component through some external influence can change its state, in the same manner that opening the box determines the state of Schrödinger's cat. Quantum computing brings quantum physics into the macro world, which is both mystifying and wonderful.

The Crazy World of Web Advertising

Lately you might have noticed that while you are browsing the web you experience a big delay in loading some pages. A web page will pop up in your browser, but then it will sit for a while before it lets you navigate the page. The delay can last for many seconds and always feels longer than it is.

These delays are due to web advertisers. Web sites contain two kinds of advertising. There are embedded ads that a web site owner puts onto their site and locks in place; embedded ads cannot be accessed or changed by an outside party and are integrated into the web page. But there are also remnant ads, which fit into blank spaces left for that purpose by the website owner. It is the process of putting ads into these remnant spaces that causes the delay in loading pages.

There are a lot of companies that sell advertising into the remnant ad space, including Google (DoubleClick), Yahoo, Amazon, Facebook, AOL, AppNexus, OpenX, AdRoll, RightMedia, and dECN. The whole process is fascinating and is not much talked about outside of the advertising world.

These companies compete to put their ads into the remnant spaces. Each of these companies sell internet advertising and the remnant ads are where they are able to place most of their ad inventory. I always assumed that the web advertisers made deals with the popular web sites to place ads on those pages. But it doesn’t happen that way at all.

When somebody puts a remnant ad space on a web site it is open real estate and any advertiser that gets access to that web site gets to put advertising into the slots. So all of these advertisers play a game of real estate grabbing to try to be the first company to get to the remnant ad space.

Some advertisers dominate this market because they are associated with a web service that has a lot of eyeballs. For example, if you go to a web page through the Google search engine or through another Google site like YouTube, then DoubleClick is going to grab the opportunity to place the ads. This is why the Google search engine is so lucrative: they charge companies to get a top ranking in searches, and they make more money placing ads on those sites that contain remnant ad space.

But the same is true for other large companies. If you go to a web page from Facebook or Yahoo or AOL, then those companies are first in line to grab the remnant ad spaces. So a given web site will be filled by different ad companies depending upon how the visitor came to be at that site. As an aside, this is why it's big news when a browser switches its default web service, because it means a big shift of dollars between these advertisers.

But web pages don’t run slow because of the primary advertiser on the web page. If Doubleclick gets to a web site and has enough ads in its inventory to fill all of the ad spaces the process is fast and nearly imperceptible to the end-user. The delays happen when the primary advertiser doesn’t have enough ads to fill all of the remnant ads on a given page.

Ad companies will only place ads that they have pre-sold. Since there are billions of requests per day to fill remnant ad spaces, there will be many times when an advertising company doesn't have enough inventory to fill a given page at a given moment. At that point the first advertiser in line will fill the slots it can, and then it will send the remaining ad space to what is called the ad exchange, a consortium of all of the web advertisers.

At the ad exchange the remaining open remnant ad spaces are offered at auction to the highest bidder. Money is exchanged from these auctions through a process called the ad trading desk. So let’s say in this example that OpenX purchases the open slots on the web site in question from the ad auction. That gives them temporary control of the web site and they place ads in the remaining slots.

But if OpenX doesn’t have enough ads to fill all of the remaining slots, it goes back to the ad exchange again and there is another auction. Web pages can get stuck in this process and keep going back for auction, and that is when you will see a really big delay before you can navigate the page. But even when there are only a few advertisers involved the delay can be a few seconds.
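The waterfall described over the last few paragraphs can be modeled in a few lines. This is a toy sketch: the advertiser names and inventory counts are made up, and real exchanges run priced auctions rather than a simple ordered list.

```python
# Toy model of the remnant-ad "waterfall": each advertiser in auction
# order fills what it can from its pre-sold inventory, and any slots
# still empty go back to the exchange for another round.
def fill_remnant_slots(open_slots, advertisers):
    """advertisers: list of (name, ads_available) in auction order."""
    placements = []
    rounds = 0
    for name, inventory in advertisers:
        if open_slots == 0:
            break
        rounds += 1                      # each pass is another auction round
        placed = min(open_slots, inventory)
        placements.append((name, placed))
        open_slots -= placed
    return placements, open_slots, rounds

placements, unfilled, rounds = fill_remnant_slots(
    5, [("DoubleClick", 2), ("OpenX", 1), ("AppNexus", 3)]
)
print(placements, unfilled, rounds)
```

Each extra round is another network hop and auction before the page finishes loading, which is where the multi-second delays come from.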

There is one ugly part to this process: prices in the ad exchange can get really low, as in having four or five zeroes after the decimal point. These low prices have lured the guys who push malware to the ad exchanges. These are the same guys who try to spread malware through spam. They will buy really cheap ads and then attach their malware to the remnant ad space. This is really insidious because an end user doesn't have to click on the ad to get the malware; just loading the web site is enough. At the end of 2014 Cisco identified this ad malware, now called malvertising, as the second biggest source of malware in the web ecosystem.