
Potential Powerful Pitfalls – Big Data, Big Trouble

Information has always been a great source of power. Kingdoms were often won and lost through scout reports, misinformation and treachery. Even now we fear those who hold too much of it, and we often cry foul in the name of privacy. But if you ask me, as a generation we have willingly forsaken our privacy, so we should all just calm down about it.

Traditionally, we collected only relevant data and stored it according to some sorting scheme so that we could make sense of it. For example, if we wanted to know how many babies are born in Manhattan in a day, we would only need to collect data from hospitals and clinics within Manhattan. Data was collected this way because of the limited capacity of our data systems. Big Data aims to change all of that. It is not about collecting specific data, like the number of births in Manhattan. Rather, it wants to collect from every hospital in every state and across the whole world, and not just about births. This was supposed to give us any information we need and allow us to predict future trends more accurately, or so we thought.

But Big Data is prone to some very powerful pitfalls. One is called the ‘lottery paradox’: we tend to put emphasis on something that is very unlikely to happen to us simply because of the payoff, just as in a lottery. There will always be a winner, and though the odds are 175,000,000 to 1 against us, we still hold on to the chance that we get to be that “1”. How this applies to big data is simple: the larger the data set, the smaller the chance of finding the piece of important information that gives the biggest payoff. These small gems within the vast sea of data increase in number but do not increase in frequency; we simply find more of them the more data we have. So we invest more in finding these gems, these “outrageous events”, and when we fail to find what we expect, we are left wondering whether it was all worth the effort and money.
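To make the "more gems, same frequency" point concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that valuable records turn up about once in every 175,000,000 records (a figure borrowed loosely from the lottery odds above, not a measured rate). The function name `expected_gems` and the constant `ONE_IN` are hypothetical.

```python
# Illustrative only: assumes one valuable record ("gem") per ONE_IN records.
# ONE_IN is a made-up rate borrowed from the lottery odds in the text.
ONE_IN = 175_000_000


def expected_gems(n_records: int, one_in: int = ONE_IN) -> float:
    """Expected number of gems in a data set of n_records."""
    return n_records / one_in


if __name__ == "__main__":
    for n_records in (175_000_000, 1_750_000_000, 17_500_000_000):
        gems = expected_gems(n_records)
        # Records that must be sifted, on average, to turn up one gem.
        records_per_gem = n_records / gems
        print(f"{n_records:>14,} records: "
              f"{gems:,.0f} expected gems, "
              f"{records_per_gem:,.0f} records per gem")
```

Under that assumption, the expected number of gems grows with the size of the data set, but the number of records to sift per gem stays the same. Collecting more data gives more to find without making any single find easier.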

Another pitfall, according to ZapThink, is the ‘more is better’ paradox: the assumption that if a certain quantity of data is good, then more of it is better. This is not necessarily true, as we might simply be encouraging the collection of more irrelevant and redundant data. We then rely on tools like Hadoop to make sense of all that chaff. But the truth is that no process or software is capable of making sense of everything. Big Data is not about being selective in the collection process, which is a very big downside if you ask me.

By Abdul Salam

About CloudBuzz

Daily tech news snapshots and insights from around the world...