Passing Big Data Through A Drinking Straw

Big Data has corporate leaders buzzing with excitement, since it promises to uncover golden nuggets of insight from an ocean of mundane and redundant data. But here’s the problem sticking everybody in the side: Big Data is big, as in it can reach the levels of “we-can’t-come-up-with-enough-names” bytes big. And with current upload speeds nowhere near as fast as download speeds, all the fancy analytics software and techniques aren’t going to do us any good if we can’t get our data where we need it.

It is called the Skinny Straw, or Drinking Straw, problem, and it is the biggest and most obvious problem facing Big Data. The analogy is simple: imagine passing an elephant through a drinking straw. Sure, you can grind the elephant into very tiny bits so it fits through the straw, but how long is that going to take? I admit that was a little gory; the original analogy was filling a swimming pool through a drinking straw, but you get the picture. The straw represents bandwidth, and how small it is compared to the amount of data that needs to get to the other side of that straw.
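To put rough numbers on the straw, here is a back-of-the-envelope calculation. The dataset size and uplink speed below are illustrative assumptions, not figures from any particular provider:

```python
# Back-of-the-envelope: how long does it take to upload a big dataset?
# All figures below are illustrative assumptions, not measurements.

def upload_days(data_terabytes, uplink_megabits_per_sec):
    """Days needed to push `data_terabytes` through an uplink of
    `uplink_megabits_per_sec`, assuming the link is fully utilized."""
    bits = data_terabytes * 1e12 * 8          # terabytes -> bits
    seconds = bits / (uplink_megabits_per_sec * 1e6)
    return seconds / 86400                    # seconds -> days

# 100 TB over a fully utilized 100 Mbps uplink:
print(round(upload_days(100, 100), 1))        # ~92.6 days
```

Three months for 100 terabytes, and that assumes the link never dips below full speed. That is the straw in action.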

The only real solution that comes to mind right off the bat is to get a bigger straw, but that usually requires major infrastructure upgrades on the part of the ISP or backbone provider, and we are talking about extreme amounts of cash (or credit, if that’s how you roll). There are also obvious technology limitations: we can upgrade to the best there is, and it still might not be enough. Some Big Data providers have tried their own proprietary ideas to get around this issue, or at least lessen it to some degree.

Here are some ways and techniques that are being used in the industry right now:

  1. We have data compression and de-duplication techniques to make transfers faster. That’s the “grind the elephant and push it through the straw as fast as possible” solution.
  2. There is the “tinker with current protocols” direction, which combines the reliability of TCP connections with the speed and bandwidth of UDP transfers into something called FASP. This keeps communication fast and secure while doing away with much of the handshaking that TCP requires.
  3. We can also work with various other protocol optimizations to get around the problem. But one method really worth mentioning is the tried-and-tested old SneakerNet approach. Providers that use this method let customers mail their hard drives to the company address, transfer the data, and mail the drives back. Even taking delivery time into consideration, this is often the fastest way to move extremely large amounts of data.
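As a rough illustration of the first technique, here is a minimal sketch of de-duplicating and compressing data before it ever hits the straw. It uses only the Python standard library; the fixed 4 KB chunk size and the choice of gzip are arbitrary assumptions for the example, not how any particular product works:

```python
import gzip
import hashlib

CHUNK_SIZE = 4096  # bytes per chunk; an arbitrary choice for this sketch

def dedup_and_compress(data: bytes):
    """Split data into fixed-size chunks, keep only one copy of each
    distinct chunk, and gzip the surviving chunks for transfer."""
    store = {}      # chunk hash -> compressed chunk (sent over the wire once)
    manifest = []   # ordered list of hashes (tells the receiver how to rebuild)
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:
            store[digest] = gzip.compress(chunk)
        manifest.append(digest)
    return manifest, store

# Highly redundant input: 1 MB of the same repeated log line.
data = b"the same log line over and over\n" * 32768
manifest, store = dedup_and_compress(data)
sent = sum(len(c) for c in store.values())
print(len(data), "->", sent, "bytes on the wire")
```

The receiver rebuilds the original by walking the manifest and decompressing each referenced chunk. Real de-duplicating transfer tools work on the same principle, though with variable-size chunking and far more engineering.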
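The second technique is harder to show, since FASP itself is proprietary, but the core idea can be sketched: ship data in UDP datagrams (no per-packet handshake) and tag each one with a sequence number so the receiver can detect loss and restore order, i.e. reliability layered on top of UDP rather than inherited from TCP. This is a teaching toy on the loopback interface, not the actual protocol:

```python
# Toy sketch of the idea behind FASP-style transfer: raw UDP datagrams
# plus application-level sequence numbers. Not the real (proprietary) FASP.
import socket
import struct

recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv.bind(("127.0.0.1", 0))       # OS picks a free port
recv.settimeout(5)                # don't hang forever if a packet is lost
addr = recv.getsockname()

send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
chunks = [b"alpha", b"bravo", b"charlie"]
for seq, payload in enumerate(chunks):
    # 4-byte big-endian sequence number prepended to each datagram
    send.sendto(struct.pack("!I", seq) + payload, addr)

received = {}
for _ in chunks:
    packet, _src = recv.recvfrom(2048)
    (seq,) = struct.unpack("!I", packet[:4])
    received[seq] = packet[4:]    # reassemble by sequence number

data = b"".join(received[s] for s in sorted(received))
print(data)
```

A real implementation would also acknowledge received sequence numbers and retransmit the gaps, which is where the engineering effort (and the speed advantage over TCP's congestion behavior on fat, high-latency links) actually lives.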

By Abdul Salam

Abdul Salam is an IT professional and an accomplished technical writer with CloudTweaks. He earned his undergraduate degree in Information Technology, followed by a postgraduate degree in Business Informatics. Abdul possesses over 3 years’ experience in technical and business writing, with deep knowledge of Cloud Computing, VMware, Oracle, Oracle ERP, Cloud ERP, Microsoft Technologies, and Network Communications (Cisco, Juniper). Visit his LinkedIn profile at: http://linkd.in/TtFu7X