There is so much data being thrown at us each day that it's overwhelming! Email, instant messages, Web pages, blogs, YouTube, Facebook, RFID networks, texts, embedded systems in cars, and the list goes on and on. All this data is collected and is simply waiting to be used. Let me take a step back and define what I mean by Big Data.
The Wikipedia definition for Big Data is:
"Big data is a term applied to data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time. Big data sizes are a constantly moving target currently ranging from a few dozen terabytes to many petabytes of data in a single data set."
What is happening is that enterprises are collecting exceptionally large volumes of data, in multiple formats and from multiple sources. The volume of data is increasing, as is access to it, in an evolving, interconnected world of social networks, sensor networks, customer chat sessions, etc. In fact, almost everything seems to be available on the Internet somewhere. For example, the United States Library of Congress and Twitter just signed an agreement (Dec. 7) that will see an archive of every public Tweet ever tweeted added to the library's repository (note to self: be careful what I tweet).
The value, however, is NOT in the data itself; it is in the analytics that turn the data into intelligence for making effective business decisions. Prediction 3 for 2012 is that organizations will start mining this information for value to make effective decisions.
Potential value propositions could include early indications of trends, which help move new products quickly to or from the market. For instance, if a new product is working within a certain demographic, you can rapidly target similar demographics. In the past, this information may have taken days or weeks to obtain. With big data aggregation and analytics, it could be available in minutes or hours.
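To make the demographic example concrete, here is a minimal sketch of the kind of aggregation I have in mind. Everything in it is an assumption for illustration: the sales events are invented, and the names `units_by_segment` and `strong_segments` and the threshold of 100 units are hypothetical, not taken from any real product or tool.

```python
from collections import defaultdict

# Hypothetical sales events: (demographic segment, units sold).
# This data is invented purely for illustration.
events = [
    ("18-24", 40), ("18-24", 55), ("25-34", 12),
    ("35-44", 8), ("25-34", 15), ("18-24", 60),
]

def units_by_segment(events):
    """Sum units sold per demographic segment."""
    totals = defaultdict(int)
    for segment, units in events:
        totals[segment] += units
    return dict(totals)

def strong_segments(totals, threshold):
    """Return segments whose total units meet an assumed threshold."""
    return sorted(s for s, u in totals.items() if u >= threshold)

totals = units_by_segment(events)
print(strong_segments(totals, threshold=100))
```

At real big data volumes the same grouping would run on a distributed platform rather than in a single script, but the idea is the same: aggregate quickly, spot where the product is working, and then go after similar demographics.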
More than ever before, analytics of the information will be critical. But as with all processes that use data, garbage in will probably lead to garbage out, unless you can identify the garbage!
This blog post also appears on the CA Technologies Service Management blog.