Cloudera Not Cutting It With Big Data Security

Cloudera Not Cutting It With Big Data Security 

Cloudera is, for the moment, a dominating presence in the open source Hadoop landscape; but does it have staying power? While Cloudera’s Big Data platform is the darling of the Hadoop space, they and their open source distribution competitors have so far failed to adequately address the elephant in the room: enterprise data security.

Cloudera’s Chief Architect and creator of Hadoop, Doug Cutting, recently discussed the growing value of Big Data in a CNBC Squawk Box segment, but nervously glossed over the subject of data security when it was raised. Benzinga reported Cutting as saying that, “…the value of Cloudera outweighs most security concerns,” thereby demonstrating a level of hubris and naivety that should put every IT security professional on high alert.  Their dismissive approach to Big Data security should really come as no surprise. Hadoop was not written with security in mind, and to date, the open source Hadoop community, including Cloudera, has not focused on addressing this critical gap.  For enterprise organizations with data at risk, especially those companies that must adhere to regulatory compliance mandates, this should be cause for concern.

Hadoop was a spin-off sub-project of Apache Lucene and Nutch projects, which are based on a MapReduce framework and a distributed file system. That initial application, web indexing, did not require any integrated security.  Hadoop is also the open-source version of the Google MapReduce framework, and the data being stored (public URLs) was not subject to privacy regulation. The open source Hadoop community supports some security features through the current implementation of Kerberos, the use of firewalls, and basic HDFS permissions.  However, Kerberos is difficult to install, configure, and integrate with Active Directory (AD) and Lightweight Directory Access Protocol, (LDAP) services.  Even with special network configuration, a firewall has limited effectiveness, can only restrict access on an IP/port basis, and knows nothing of the Hadoop File System or Hadoop itself.

Enterprises want the same security capabilities for Big Data as they have now for “non-Big Data” information systems, including solutions that address user authentication, access control, policy enforcement, and encryption.  Many organizations require these Big Data safeguards in order to maintain regulatory compliance with HIPAA, HITECH, SOX, PCI/DSS, and other security and privacy mandates.  But they won’t find those safeguards in open source Hadoop distributions today.  Community initiatives underway such as Knox and Rhino are intended to improve Hadoop’s security posture, but tangible results will take time and will certainly lag behind more aggressive commercial efforts.

Cloudera and other distribution vendors are essentially branding open source Hadoop, along with its inherent security limitations.  While Cloudera is perceived as a software company, in reality the vast majority of its revenue is derived from professional services, training, and support.  It’s unlikely that Cloudera will suddenly invert its business model and come to the rescue with an integrated software solution for data security.  Does this mean that Cloudera and other open source Hadoop solutions are dangerous to deploy?  Only if IT organizations ignore the inherent security gaps and risks involved, and do not take adequate precautions to secure the data store.

The recent $45 million cybercrime heist involving ATM machines in New York and around the world is a perfect example of how unauthorized access to a compromised data store can result in tremendous financial loss to the victimized financial institution.  And, by the way, ATM transaction records are exactly the kind of unstructured Big Data that ends up being stored in a Hadoop environment.

For organizations needing robust Big Data security now, Orchestrator, a commercial software solution from Zettaset, provides enterprise-class security that is embedded in the Big Data cluster itself, moving security as close as possible to the data, and providing protection that perimeter security devices such as firewalls simply cannot deliver.   Zettaset’s Orchestrator software automates cluster management and security, and works in conjunction with most Hadoop distributions, including Cloudera’s, to address open source vulnerabilities in datacenter environments where security and compliance is a business imperative.

While open source Hadoop solutions such as Cloudera’s do indeed have value, make no mistake: The security demands of today’s at-risk enterprises clearly represent a much higher priority for IT professionals and the organizations they serve.

By Jim Vogt /  Zettaset CEO

With more than 25 years of leadership experience in both start-up and established corporations, Jim Vogt brings a wealth of business and technology expertise to his role as president and CEO of Zettaset. Most recently, Jim served as senior vice president and general manager of the Cloud Services business unit at Blue Coat Systems. Prior to Blue Coat, he served as president and CEO at Trapeze Networks, which was acquired by Belden, Inc. He was also president and CEO at data encryption start-up Ingrian Networks (acquired in April, 2008 by SafeNet).

About CloudTweaks

Established in 2009, CloudTweaks is recognized as one of the leading authorities in connected technology information and services.

We embrace and instill thought leadership insights, relevant and timely news related stories, unbiased benchmark reporting as well as offer green/cleantech learning and consultive services around the world.

Our vision is to create awareness and to help find innovative ways to connect our planet in a positive eco-friendly manner.

In the meantime, you may connect with CloudTweaks by following and sharing our resources.

View All Articles

3 Responses to Cloudera Not Cutting It With Big Data Security

  1. I completely agree that statements like  “…the value of Cloudera outweighs most security concerns,” should be cause for alarm. However, security belongs right-up front in any deployment or integration project as a key part of the design phase. Adding security after deployment often leads to compromises or overlooked gaps. Having the security hooks baked in from the beginning is often a great approach, and I look forward to seeing how Zettaset evolves in relation to big data security.

  2. Nice post, Jim. However, I disagree with the fundamental premise that Cloudera is not doing enough to address Hadoop security. In fact, I think they’re taking the right approach by tackling security through their partner ecosystem vs. bolting on heavy-duty security to Hadoop themselves. This allows Cloudera to focus more on their core business of providing customers with a better, faster, stronger, more enterprise-ready big data systems, while leaving the all-important job of data security to the experts in their respective fields. I wrote a blog on this topic, which you can find here: http://www.gazzang.com/blog.

  3. I’m pleased to see that this blog has generated some good discussion.   I want to point out that my original criticism of Cloudera arose from their executive management’s surprising assertion that the value of their Hadoop distribution outweighs the overall need for data security in the enterprise.  I think everyone who understands enterprise data security, especially IT professionals who are responsible for risk management, will agree with me.  Despite what Cloudera management says, the need for security in the Big Data environment is paramount.  The awareness of that need is apparent to the enterprises and partners that we work with, who clearly view Hadoop’s security limitations as a barrier to broader adoption, and are seeking a strong foundational approach to Hadoop cluster security.   The “build today, secure tomorrow” approach won’t work with organizations in regulated industries.  Enterprises that deal with data-of-consequence such as individual financial, health, or retail transaction records, understand the serious nature of a datastore breach and its potential impact on the business, and are looking for security solutions that extend beyond the open source capabilities.  In order for Hadoop adoption to accelerate, it is important that open source Hadoop distribution vendors openly acknowledge the security gap to the market, deal with the issue transparently, and not dismiss security as an after-thought.  Although there are multiple projects addressing this in the open source community, that will take some time.  Zettaset has published a http://www.zettaset.com/info-center/datasheets/zettaset_wp_security_0413.pdf that addresses this issue in more detail, and we are accelerating  the adoption of Hadoop for enterprises with a secure solution which complements the open source today.

Comic
In The Fast Lane: Connected Car Hacking A Big Risk

In The Fast Lane: Connected Car Hacking A Big Risk

Connected Car Hacking Researchers and cybersecurity experts working hard to keep hackers out of the driver’s seat. Modern transportation has come a million miles, and most all of today’s vehicles are controlled entirely by digital technology. Millions of drivers are not aware that of the many devices in their digital arsenal, the most complex of…

Having Your Cybersecurity And Eating It Too

Having Your Cybersecurity And Eating It Too

The Catch 22 The very same year Marc Andreessen famously said that software was eating the world, the Chief Information Officer of the United States was announcing a major Cloud First goal. That was 2011. Five years later, as both the private and public sectors continue to adopt cloud-based software services, we’re interested in this…

Building a Data Security Strategy – More Important Than Ever

Building a Data Security Strategy – More Important Than Ever

Data Security Strategy Article sponsored by SAS Software and Big Data Forum Security and privacy have been an integral concern of the IT industry since its very inception, but as it expands through web-based, mobile, and cloud-based applications, access to data is magnified as are the threats of illicit penetration. As enterprises manage vast quantities…

Pitney Bowes Selects Aria Systems for Billing on the New Commerce Cloud

Pitney Bowes Selects Aria Systems for Billing on the New Commerce Cloud

Top-Ranked Cloud Billing Company Enables Greater Speed and Frictionless Billing for Unparalleled Customer Experience San Francisco, CA – August 23, 2016 – Aria Systems, which helps enterprises grow subscription and usage-based revenue, today announced that Pitney Bowes has selected Aria’s cloud-based monetization platform as the key billing and monetization component of their new Commerce Cloud…

The Golden Age of Wearable Technology

The Golden Age of Wearable Technology

The Golden Age One of the biggest fads in the technology sector right now is wearable tech. From Smartwatches that let you check your emails, chat with friends and search the web, to fitness accessories that monitor your heart rate and your sleep patterns, this is truly the Golden Age of wearable technology. But some…

Rounding Out Your Security Strategy With The Enterprise Cloud

Rounding Out Your Security Strategy With The Enterprise Cloud

Enterprise Cloud Security Strategy No company wants to be the one to have to announce one of the world’s biggest data breaches. From managing networks and datacenters to protecting hundreds of applications, today’s enterprises face enormous challenges. With all the surface area that needs to be tracked and protected including APIs on the front-end, customer integrations…

Multi-Cloud Integration Has Arrived

Multi-Cloud Integration Has Arrived

Multi-Cloud Integration Speed, flexibility, and innovation require multiple cloud services As businesses seek new paths to innovation, racing to market with new features and products, cloud services continue to grow in popularity. According to Gartner, 88% of total compute will be cloud-based by 2020, leaving just 12% on premise. Flexibility remains a key consideration, and…

The Cloud Is Not Enough! Why Businesses Need Hybrid Solutions

The Cloud Is Not Enough! Why Businesses Need Hybrid Solutions

Why Businesses Need Hybrid Solutions Running a cloud server is no longer the novel trend it once was. Now, the cloud is a necessary data tier that allows employees to access vital company data and maintain productivity from anywhere in the world. But it isn’t a perfect system — security and performance issues can quickly…

5 Ways To Ensure Your Cloud Solution Is Always Operational

5 Ways To Ensure Your Cloud Solution Is Always Operational

Ensure Your Cloud Is Always Operational We have become so accustomed to being online that we take for granted the technological advances that enable us to have instant access to everything and anything on the internet, wherever we are. In fact, it would likely be a little disconcerting if we really mapped out all that…

Using Cloud Technology In The Education Industry

Using Cloud Technology In The Education Industry

Education Tech and the Cloud Arguably one of society’s most important functions, teaching can still seem antiquated at times. Many schools still function similarly to how they did five or 10 years ago, which is surprising considering the amount of technical innovation we’ve seen in the past decade. Education is an industry ripe for innovation…

Cloud Computing – The Real Story Is About Business Strategy, Not Technology

Cloud Computing – The Real Story Is About Business Strategy, Not Technology

Enabling Business Strategies The cloud is not really the final destination: It’s mid-2015, and it’s clear that the cloud paradigm is here to stay. Its services are growing exponentially and, at this time, it’s a fluid model with no steady state on the horizon. As such, adopting cloud computing has been surprisingly slow and seen more…

How Data Science And Machine Learning Is Enabling Cloud Threat Protection

How Data Science And Machine Learning Is Enabling Cloud Threat Protection

Data Science and Machine Learning Security breaches have been consistently rising in the past few years. Just In 2015, companies detected 38 percent more security breaches than in the previous year, according to PwC’s Global State of Information Security Survey 2016. Those breaches are a major expense — an average of $3.79 million per company,…

Why Small Businesses Need A Business Intelligence Dashboard

Why Small Businesses Need A Business Intelligence Dashboard

The Business Intelligence Dashboard As a small business owner you would certainly know the importance of collecting and analyzing data pertaining to your business and transactions. Business Intelligence dashboards allow not only experts but you also to access information generated by analysis of data through a convenient display. Anyone in the company can have access…

5 Surprising Ways Cloud Computing Is Changing Education

5 Surprising Ways Cloud Computing Is Changing Education

Cloud Computing Education The benefits of cloud computing are being recognized in businesses and institutions across the board, with almost 90 percent of organizations currently using some kind of cloud-based application. The immediate benefits of cloud computing are obvious: cloud-based applications reduce infrastructure and IT costs, increase accessibility, enable collaboration, and allow organizations more flexibility…

SaaS And The Cloud Are Still Going Strong

SaaS And The Cloud Are Still Going Strong

SaaS And The Cloud With the results of Cisco Global Could Index: 2013-2018 and Hosting and Cloud Study 2014, predictions for the future of cloud computing are notable. Forbes reported that spending on infrastructure-related services has increased as public cloud computing uptake spreads, and reflected on Gartner’s Public Cloud Services Forecast. The public cloud service…

The Internet of Things – Redefining The Digital World As We Know It

The Internet of Things – Redefining The Digital World As We Know It

Redefining The Digital World According to Internet World Stats (June 30th, 2015), no fewer than 3.2 billion people across the world now use the internet in one way or another. This means an incredible amount of data sharing through the utilization of API’s, Cloud platforms and inevitably the world of connected Things. The Internet of Things is a…