Cloudera Not Cutting It With Big Data Security

Cloudera is, for the moment, a dominating presence in the open source Hadoop landscape; but does it have staying power? While Cloudera’s Big Data platform is the darling of the Hadoop space, they and their open source distribution competitors have so far failed to adequately address the elephant in the room: enterprise data security.

Cloudera’s Chief Architect and creator of Hadoop, Doug Cutting, recently discussed the growing value of Big Data in a CNBC Squawk Box segment, but nervously glossed over the subject of data security when it was raised. Benzinga reported Cutting as saying that, “…the value of Cloudera outweighs most security concerns,” thereby demonstrating a level of hubris and naivety that should put every IT security professional on high alert.  Their dismissive approach to Big Data security should really come as no surprise. Hadoop was not written with security in mind, and to date, the open source Hadoop community, including Cloudera, has not focused on addressing this critical gap.  For enterprise organizations with data at risk, especially those companies that must adhere to regulatory compliance mandates, this should be cause for concern.

Hadoop was a spin-off sub-project of the Apache Lucene and Nutch projects, built around a MapReduce framework and a distributed file system. Its initial application, web indexing, did not require any integrated security: Hadoop is an open-source implementation of Google’s MapReduce framework, and the data being stored (public URLs) was not subject to privacy regulation. The open source Hadoop community supports some security features through the current implementation of Kerberos, the use of firewalls, and basic HDFS permissions. However, Kerberos is difficult to install, configure, and integrate with Active Directory (AD) and Lightweight Directory Access Protocol (LDAP) services. Even with special network configuration, a firewall has limited effectiveness: it can only restrict access on an IP/port basis, and it knows nothing of the Hadoop Distributed File System (HDFS) or Hadoop itself.
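To make the firewall limitation concrete, here is a minimal, hypothetical Python sketch. It is not Hadoop or firewall code, and every class, user, path, and address in it is invented for illustration. It contrasts a rule that can only match a source network and port with a POSIX-style owner/group/other read check of the kind HDFS permissions are modeled on. Port 8020 is assumed here as a commonly used NameNode RPC port.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class FirewallRule:
    """Hypothetical allow rule: it sees only a source network and a destination port."""
    source_net: str
    dest_port: int

    def allows(self, source_ip: str, dest_port: int) -> bool:
        return (ip_address(source_ip) in ip_network(self.source_net)
                and dest_port == self.dest_port)


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical file metadata carrying POSIX-style mode bits."""
    owner: str
    group: str
    mode: int  # for example 0o640


def can_read(entry: FileEntry, user: str, groups: set) -> bool:
    """Check the read bit for owner, then group, then other."""
    if user == entry.owner:
        return bool(entry.mode & 0o400)
    if entry.group in groups:
        return bool(entry.mode & 0o040)
    return bool(entry.mode & 0o004)


# Assumed setup: a trusted internal network and one sensitive file.
rule = FirewallRule(source_net="10.0.0.0/8", dest_port=8020)
records = FileEntry(owner="etl", group="finance", mode=0o640)

# Both users sit on the trusted network, so the firewall treats them identically.
firewall_alice = rule.allows("10.1.2.3", 8020)
firewall_bob = rule.allows("10.1.2.4", 8020)

# Only the file-level check can tell the two users apart.
alice_reads = can_read(records, "alice", {"finance"})
bob_reads = can_read(records, "bob", {"marketing"})
```

In this sketch the firewall admits both users because it has no notion of who they are or which file they want, while the permission check grants the read to the `finance` group member only. Neither layer addresses encryption or policy enforcement, which is the gap the article goes on to describe.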

Enterprises want the same security capabilities for Big Data as they have now for “non-Big Data” information systems, including solutions that address user authentication, access control, policy enforcement, and encryption. Many organizations require these Big Data safeguards in order to maintain regulatory compliance with HIPAA, HITECH, SOX, PCI DSS, and other security and privacy mandates. But they won’t find those safeguards in open source Hadoop distributions today. Community initiatives now underway, such as Knox and Rhino, are intended to improve Hadoop’s security posture, but tangible results will take time and will certainly lag behind more aggressive commercial efforts.

Cloudera and other distribution vendors are essentially branding open source Hadoop, along with its inherent security limitations.  While Cloudera is perceived as a software company, in reality the vast majority of its revenue is derived from professional services, training, and support.  It’s unlikely that Cloudera will suddenly invert its business model and come to the rescue with an integrated software solution for data security.  Does this mean that Cloudera and other open source Hadoop solutions are dangerous to deploy?  Only if IT organizations ignore the inherent security gaps and risks involved, and do not take adequate precautions to secure the data store.

The recent $45 million cybercrime heist involving ATMs in New York and around the world is a perfect example of how unauthorized access to a compromised data store can result in tremendous financial loss to the victimized financial institution. And, by the way, ATM transaction records are exactly the kind of unstructured Big Data that ends up being stored in a Hadoop environment.

For organizations needing robust Big Data security now, Orchestrator, a commercial software solution from Zettaset, provides enterprise-class security that is embedded in the Big Data cluster itself, moving security as close as possible to the data, and providing protection that perimeter security devices such as firewalls simply cannot deliver. Zettaset’s Orchestrator software automates cluster management and security, and works in conjunction with most Hadoop distributions, including Cloudera’s, to address open source vulnerabilities in datacenter environments where security and compliance are business imperatives.

While open source Hadoop solutions such as Cloudera’s do indeed have value, make no mistake: The security demands of today’s at-risk enterprises clearly represent a much higher priority for IT professionals and the organizations they serve.

By Jim Vogt /  Zettaset CEO

With more than 25 years of leadership experience in both start-up and established corporations, Jim Vogt brings a wealth of business and technology expertise to his role as president and CEO of Zettaset. Most recently, Jim served as senior vice president and general manager of the Cloud Services business unit at Blue Coat Systems. Prior to Blue Coat, he served as president and CEO at Trapeze Networks, which was acquired by Belden, Inc. He was also president and CEO at data encryption start-up Ingrian Networks (acquired in April, 2008 by SafeNet).

4 Responses to Cloudera Not Cutting It With Big Data Security

  1. I completely agree that statements like “…the value of Cloudera outweighs most security concerns,” should be cause for alarm. However, security belongs right up front in any deployment or integration project as a key part of the design phase. Adding security after deployment often leads to compromises or overlooked gaps. Having the security hooks baked in from the beginning is often a great approach, and I look forward to seeing how Zettaset evolves in relation to big data security.

  2. Nice post, Jim. However, I disagree with the fundamental premise that Cloudera is not doing enough to address Hadoop security. In fact, I think they’re taking the right approach by tackling security through their partner ecosystem vs. bolting on heavy-duty security to Hadoop themselves. This allows Cloudera to focus more on their core business of providing customers with better, faster, stronger, more enterprise-ready big data systems, while leaving the all-important job of data security to the experts in their respective fields. I wrote a blog on this topic, which you can find here: http://www.gazzang.com/blog.

  3. From a quick online study, Cloudera appears to have as much security as other Hadoop platform providers, such as Hortonworks. Cloudera’s Security Guide is a healthy 119 pages. That is not that bad, given the current state of security. How much security is there in the mobile world? In the Cloud? I hate to say this but: “Technology is no place for wimps and big enterprises don’t think of security as optional.”

     Not sure it makes sense to pick on the open source folks, either. Cloudera’s business model relies on partners who already have appropriate enterprise-grade specialties. Hence, Gazzang has a legitimate opportunity to supply valuable services. Furthermore, since Hadoop is an open platform, if (and when) there is a real, true need for more security, public input into the development process can propose new features. It WILL happen in some form some day.

     But for the time being, if we dig deeper, it seems that part of the issue is not in Hadoop or even HDFS. The Hadoop Distributed File System has such a large block size that trying to secure that large a chunk could be inefficient from the beginning. Perhaps a better security approach is Attribute-Based Access Control. This approach relies on metadata. Hadoop and HDFS are relatively weak in that department. To me, that is a more fundamental issue than security. That is one reason that drives many Big Data implementations to add a Big Data database such as MongoDB, HBase, Cassandra, DynamoDB, etc. It is more likely that the database will be a level at which solid security may be implemented.

     If an implementation has gone down this Big Data path and cannot secure their goods due to performance challenges, then I would suggest checking out the engineered solutions for security at http://www.velocidata.com/. When slow performance is keeping you from selecting secure implementations, http://velocidata.com/products/security-appliances/ can help overcome roadblocks. Their purpose-built systems are benchmarked to perform full AES encryption at more than 2 GBytes/s *sustained* throughput for bulk/block encryption and NIST-proposed format preserving mode at more than 10 million fields/s for column-level encryption. Extremely fast key changes are possible with their offerings, too. That is more than enough horsepower to effectively secure Big Data.

  4. I’m pleased to see that this blog has generated some good discussion. I want to point out that my original criticism of Cloudera arose from their executive management’s surprising assertion that the value of their Hadoop distribution outweighs the overall need for data security in the enterprise. I think everyone who understands enterprise data security, especially IT professionals who are responsible for risk management, will agree with me. Despite what Cloudera management says, the need for security in the Big Data environment is paramount. The awareness of that need is apparent to the enterprises and partners that we work with, who clearly view Hadoop’s security limitations as a barrier to broader adoption, and are seeking a strong foundational approach to Hadoop cluster security. The “build today, secure tomorrow” approach won’t work with organizations in regulated industries. Enterprises that deal with data-of-consequence, such as individual financial, health, or retail transaction records, understand the serious nature of a datastore breach and its potential impact on the business, and are looking for security solutions that extend beyond the open source capabilities. In order for Hadoop adoption to accelerate, it is important that open source Hadoop distribution vendors openly acknowledge the security gap to the market, deal with the issue transparently, and not dismiss security as an afterthought. Although there are multiple projects addressing this in the open source community, that will take some time. Zettaset has published a white paper (http://www.zettaset.com/info-center/datasheets/zettaset_wp_security_0413.pdf) that addresses this issue in more detail, and we are accelerating the adoption of Hadoop for enterprises with a secure solution which complements the open source today.
