Trash is the future for big data

The New York Times just published an interesting article about big data.

The article makes two points. First, Big Data companies are attracting stellar valuations. Second, it contains a revealing quote from Mark Grether, Global Chief Operating Officer of Xaxis:

“The bigness of big data can also represent a challenge to extract more meaning more quickly from the massive amounts of anonymous big data being generated on a moment by moment basis.”

Filter the trash

Walmart currently creates about two Petabytes of data an hour to help improve the efficiency of its marketing. This is manageable with current Big Data strategies and technologies. However, digital buying behaviour is growing more complex at an accelerating rate, and the volume of data will grow with it.

That’s why I believe that the most valuable future for Big Data in business lies in having a strategy for filtering out the “trash” from the “gold-dust” as early as you can in the data collection stages. There is compelling support for this viewpoint from the world of academia.
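As a rough illustration of what filtering at collection time could look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `filter_early` function, the `order_value` field, the threshold and the sample events are hypothetical, and what counts as “gold-dust” will differ for every business.

```python
from typing import Iterable, Iterator

# Hypothetical threshold: what counts as "gold-dust" is a business decision.
MIN_ORDER_VALUE = 50.0


def filter_early(events: Iterable[dict], min_value: float = MIN_ORDER_VALUE) -> Iterator[dict]:
    """Yield only the events worth storing; the rest are dropped at collection time."""
    for event in events:
        # Events without an order value are treated as trash in this sketch.
        if event.get("order_value", 0.0) >= min_value:
            yield event


# Made-up sample events, for illustration only.
raw_events = [
    {"customer": "a", "order_value": 120.0},
    {"customer": "b", "order_value": 3.5},
    {"customer": "c"},
    {"customer": "d", "order_value": 75.0},
]

kept = list(filter_early(raw_events))
print(f"Kept {len(kept)} of {len(raw_events)} events")  # Kept 2 of 4 events
```

Because the filter is a generator, discarded events never need to be held in memory or written to storage, which is the point of filtering early.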

Filtering works

CERN’s Large Hadron Collider generates 1 Petabyte of data every second. That’s 1 million gigabytes every second.

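On a daily basis, CERN’s computing grid processes 1 Petabyte of information. At 1 Petabyte per second, the collider generates 86,400 Petabytes a day (one for every second in the day), which means that 86,399 Petabytes of data are discarded every day because it just isn’t practical to keep them.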
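The arithmetic behind that figure can be checked in a few lines of Python. This is only a back-of-the-envelope sketch using the two numbers quoted above; the variable names are my own.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day

generated_pb_per_second = 1  # figure quoted above
processed_pb_per_day = 1     # figure quoted above

generated_pb_per_day = generated_pb_per_second * SECONDS_PER_DAY
discarded_pb_per_day = generated_pb_per_day - processed_pb_per_day
kept_fraction = processed_pb_per_day / generated_pb_per_day

print(f"Generated per day: {generated_pb_per_day:,} PB")  # 86,400 PB
print(f"Discarded per day: {discarded_pb_per_day:,} PB")  # 86,399 PB
print(f"Share kept: {kept_fraction:.4%}")                 # 0.0012%
```

On these figures, only about one part in 86,400 of what the collider generates is kept.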

With a robust data collection and filtering strategy, CERN discovered the Higgs Boson efficiently, answering one of the biggest open questions in science for decades. CERN uses its cloud infrastructure extremely smartly and could be a good benchmark for future Enterprise success with Big Data.

Data filtering to get to information will become a winning capability

As digital consumption grows, the rising volume of data being created, stored and analysed will drive the cost of storage down and the complexity of data capture and filtering up. Strong data sifting is set to become a core business capability for tomorrow’s market leaders.