We’re Drowning in Data

The analyst firm IDC projects that the world will be generating 394 zettabytes of data annually by 2028 (a zettabyte is a trillion gigabytes). The majority of the energy used in data centers today goes to keeping some of this data in an accessible format. We don’t try to make all of it available: about 20% of the data we generate today is considered “hot data” that AI systems might want to draw on quickly. The remaining 80% is “cold data,” which we don’t keep in data center storage but also don’t discard, since it might still be of use in the future.
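A quick back-of-envelope calculation puts those numbers in perspective, using the figures above (the 20/80 hot/cold split is the estimate from this paragraph):

```python
# Scale check: 394 zettabytes per year, where 1 ZB = a trillion gigabytes.
ZB_IN_GB = 10**12

annual_data_zb = 394                     # IDC projection for 2028
annual_data_gb = annual_data_zb * ZB_IN_GB

hot_fraction = 0.20                      # ~20% is "hot" data
hot_data_zb = annual_data_zb * hot_fraction
cold_data_zb = annual_data_zb - hot_data_zb

print(f"{annual_data_gb:.3e} GB generated per year")   # 3.940e+14 GB
print(f"hot: {hot_data_zb:.1f} ZB, cold: {cold_data_zb:.1f} ZB")
```

That is roughly 394 trillion gigabytes a year, of which about 79 zettabytes would need fast, always-on storage.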

Today, hot data is largely stored on hard drives in data centers. Keeping it quickly retrievable uses a lot of electricity to operate the drives, and additional energy goes to cooling the data center to offset the heat generated by the electronics. There is a growing trend of storing cold data on magnetic tapes, which also require energy for heating and cooling, since tapes are best kept between 61 and 77 degrees Fahrenheit. Tapes must also be replaced every 15-20 years by transferring the data, a labor-intensive effort.

The need to keep so much data at our fingertips to support AI means that we are drowning in data, and the problem grows quickly every year. The solution is to find other ways to store massive amounts of data that don’t require much electricity. There are several potential data storage methods on the horizon, and we’re going to need more.

One interesting possibility comes from Peter Kazansky, working at the University of Southampton in the UK. Back in 1999, while working with scientists at Kyoto University, Kazansky encountered a physical phenomenon that might provide the future of long-term data storage. The team at Kyoto found that when they wrote on glass with ultrafast femtosecond lasers (pulses lasting a quadrillionth of a second), the light traveling through the glass scattered in a way they could not explain.

It turns out that the researchers had discovered hidden nanostructures within silica glass, created by micro-explosions from the lasers. The laser pulses had punched tiny voids, 1,000 times smaller than the width of a human hair, throughout the glass. The eureka moment came when the researchers realized they could take advantage of this phenomenon by using lasers to print complex patterns inside the glass. After many years of research, Kazansky found that he could etch patterns that store data in five dimensions: the normal x, y, and z coordinates of each voxel (a 3D pixel), plus two optical properties of the nanostructure at that point, the orientation and strength of its birefringence, which determine how it scatters polarized light.
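To make the “five dimensions” concrete, here is a purely hypothetical sketch of what one voxel record might look like; the field names, units, and number of distinguishable levels are illustrative assumptions, not SPhotonics’ actual format:

```python
import math
from dataclasses import dataclass

# Illustrative only: a "5D" voxel is three spatial coordinates plus two
# optical properties of the nanostructure written at that point.
# (Field names and units here are assumptions for the sketch.)
@dataclass
class Voxel:
    x_um: float             # position within the glass, in micrometers
    y_um: float
    z_um: float             # depth layer
    orientation_deg: float  # orientation of the nanostructure
    retardance_nm: float    # strength of its birefringence

# If, say, 4 distinguishable orientations and 2 retardance levels can be
# written and read reliably, each voxel encodes log2(4 * 2) = 3 bits.
bits_per_voxel = math.log2(4 * 2)
print(bits_per_voxel)  # 3.0
```

The two extra “dimensions” multiply the capacity of each spatial location, which is what lets a single platter hold so much more than a surface-only medium.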

This allows massive amounts of data to be stored on a piece of etched silica glass. A 5-inch glass platter (slightly larger than a music CD) can store up to 360 terabytes of data. Unlike tape or hard disk storage, this technology appears to create forever memory that can be kept indefinitely. While energy is needed to etch the glass and encode the stored data, reading the data back uses light and is not energy-intensive. Kazansky founded SPhotonics in 2024 to commercialize the new storage method. Currently, the data can be retrieved at a speed of 30 MB per second, but he sees a path to reach 500 MB per second, which is faster than retrieving data from tape.
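Those capacity and speed figures imply long read times for a full platter, which is fine for archival storage. A rough calculation from the numbers quoted above (treating 1 TB as a million MB):

```python
# How long to read a full 360 TB platter at the quoted speeds?
MB_PER_TB = 10**6
capacity_mb = 360 * MB_PER_TB

for speed_mb_s in (30, 500):          # today's speed vs. projected speed
    seconds = capacity_mb / speed_mb_s
    days = seconds / 86_400           # seconds per day
    print(f"{speed_mb_s} MB/s -> {days:.0f} days to read a full platter")
```

At 30 MB/s a complete platter would take roughly 139 days to read out; at the projected 500 MB/s, about 8 days, which illustrates why this is a cold-archive medium rather than a hard-drive replacement.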

Of course, storing data on etched glass is not without peril. A disc can break, and a fire or other disaster at a storage facility could destroy massive amounts of data, so most data will have to be stored at multiple sites. But at least the raw materials for silica glass are cheap and readily available. Probably the bigger issue facing the world is deciding how and when to ditch data that is no longer useful. Data scientists are already tackling this question today, but they are generally cautious and side with storing rather than destroying data if there is even a slight chance that it might be useful later.