Adopting Big Data Analytics
- Big data adoption reached 53% in 2017 for all companies interviewed, up from 17% in 2015, with telecom and financial services leading early adopters.
- Reporting, dashboards, advanced visualization, end-user “self-service,” and data warehousing are the top five technologies and initiatives strategic to business intelligence.
- Data warehouse optimization remains the top use case for big data, followed by customer/social analysis and predictive maintenance.
- Among big data distributions, Cloudera is the most popular, followed by Hortonworks, MapR, and Amazon EMR.
These and many other insights are from Dresner Advisory Services’ 2017 Big Data Analytics Market Study (94 pp., PDF, client access required), which is part of their Wisdom of Crowds® series of research. This third annual report examines end-user trends and intentions surrounding big data analytics, defined as systems that enable end-user access to and analysis of data contained and managed within the Hadoop ecosystem. The 2017 Big Data Analytics Market Study represents a cross-section of data that spans geographies, functions, organization sizes, and vertical industries. Please see page 10 of the study for additional details regarding the methodology.
“Across the three years of our comprehensive study of big data analytics, we see a significant increase in uptake in usage and a large drop of those with no plans to adopt,” said Howard Dresner, founder and chief research officer at Dresner Advisory Services. “In 2017, IT has emerged as the most typical adopter of big data, although all departments – including finance – are considering future use. This is an indication that big data is becoming less an experimental endeavor and more of a practical pursuit within organizations.”
Key takeaways include the following:
- Reporting, dashboards, advanced visualization, end-user “self-service,” and data warehousing are the top five technologies and initiatives strategic to business intelligence. Big data ranks 20th among the 33 key technologies Dresner Advisory Services currently tracks. Big data analytics is of greater strategic importance than the Internet of Things (IoT), natural language analytics, cognitive business intelligence (BI), and location intelligence.
- 53% of companies are using big data analytics today, up from 17% in 2015. Telecom and Financial Services are the most active early adopters and are fueling the fastest adoption, with Technology and Healthcare the third and fourth most active industries. Education has the lowest adoption as 2017 comes to a close, with the majority of institutions in that vertical saying they are evaluating big data analytics for the future. North America (55%) narrowly leads EMEA (53%) in current levels of big data analytics adoption. Asia-Pacific respondents report 44% current adoption and are most likely to say they “may use big data in the future.”
- Data warehouse optimization is considered the most important big data analytics use case in 2017, followed by customer/social analysis and predictive maintenance. Data warehouse optimization is considered critical or very important by 70% of all respondents. Ironically, the Internet of Things (IoT) is among the lowest-priority use cases for big data analytics today.
- Big data analytics use cases vary significantly by industry. Data warehouse optimization dominates Financial Services and Healthcare, while customer/social analysis is the leading use case in Technology-based companies. Fraud detection use cases also dominate Financial Services and Telecommunications. Using big data for clickstream analytics is most popular in Financial Services.
- Spark, MapReduce, and YARN are the three most popular software frameworks today. Over 30% of respondents consider Spark critical to their big data analytics strategies. MapReduce and YARN are “critical” to more than 20% of respondents.
- The big data access methods most preferred by respondents include Spark SQL, Hive, HDFS, and Amazon S3. 73% of respondents consider Spark SQL critical to their analytics strategies. Over 30% of respondents consider Hive and HDFS critical as well. Amazon S3 is critical to one in five respondents for managing big data access. The following graphic shows the distribution of big data access methods.
- Machine learning continues to gain industry support and investment, with adoption of the Spark Machine Learning Library (MLlib) projected to grow by 60% in the next 12 months. In the next 24 months, MLlib will dominate machine learning, according to the survey results. MLlib is accessible from the sparklyr R package and many others, which continues to fuel its growth. The following graphic compares projected two-year adoption rates by machine learning libraries and frameworks.
By Louis Columbus