Modernize and Future-Proof Your Data Analytics Environment

More than ever, companies are using data to make business decisions in real time. This demand for instant, always-on access to data makes it imperative for organizations to move beyond legacy architectures that can’t handle modern workloads.

Ronald van Loon is an HPE partner and recently spoke with Matt Maccaux, global field CTO of the Ezmeral Enterprise Software BU at Hewlett Packard Enterprise. Matt provided meaningful insights on the challenges of moving to a cloud-native analytics environment, the steps companies can take to make this transition, and some key technology trends.

“It’s not trivial, it is not a simple process because these data-intensive applications don’t tend to work in those cloud-native environments,” Matt says of companies moving their advanced analytics infrastructure to the cloud. The increased need for instant access to data, the high velocity of new information, and a low tolerance for latency have forced companies of all sizes to reevaluate how they build their IT infrastructure.

The Challenges of Supporting Real-Time Analytics

Data volumes have increased exponentially, with more than 90% of the data in the world today having been created in the past two years alone. In 2020, 64.2 zettabytes of data was generated or replicated, and much of this growth is attributed to the number of people learning, training, interacting, working, and entertaining themselves from home. Most companies do not store all of their raw data indefinitely – so how can they analyze it to deliver business insights? Analyzing high-velocity big data streams with traditional data warehousing and analytics tools has proven to be challenging.

To analyze data at the speed of business, companies need real-time analytics solutions that can ingest large volumes of data in motion as it is constantly generated by devices, sensors, applications and machines. In addition to processing data in real time (also known as “streaming”), the solution must be able to capture and store that data for later analysis as “batch” data.
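As a minimal sketch of this dual path, the PySpark Structured Streaming snippet below ingests a stream and persists it for later batch analysis. The Kafka broker, topic and storage paths are hypothetical placeholders, and the Spark Kafka connector package is assumed to be available.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dual-path-analytics").getOrCreate()

# Streaming path: ingest events as they are generated by devices and sensors.
# Broker address and topic name are placeholders for illustration.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker.example.com:9092")
    .option("subscribe", "sensor-events")
    .load()
)

# Persist the raw stream so the same data is also available at rest.
query = (
    events.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
    .writeStream
    .format("parquet")
    .option("path", "s3a://analytics-lake/raw/sensor-events")
    .option("checkpointLocation", "s3a://analytics-lake/checkpoints/sensor-events")
    .start()
)

# Batch path: later, analyze the stored data with the same engine.
history = spark.read.parquet("s3a://analytics-lake/raw/sensor-events")
history.createOrReplaceTempView("sensor_history")
spark.sql("SELECT COUNT(*) FROM sensor_history").show()
```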

This presents a significant challenge because most existing data warehousing and business intelligence tools were designed primarily for analysis of historical, stored data, and are typically not optimized for low-latency access to streaming data.

Transitioning to a Cloud-Native Environment

The reason it’s particularly challenging for companies to shift from an on-premises environment to a cloud-native environment is scale. The vast majority of companies have invested heavily in on-premises hardware, software and skills over the years, but they must now overhaul their IT infrastructure to deal with workloads that simply could not be handled when those investments were made.

In addition, although today’s data volumes are massive, they will be dwarfed by the data created when the Internet of Things (IoT), 5G and other major technology shifts take hold.

Making Big Changes with Small Steps

As a result, it makes sense to incrementally build an architecture that will support your workloads, whether or not they currently run in the cloud, rather than ripping everything out and starting from scratch. This is where small steps come into play: start with a data warehouse in the cloud, and then add real-time analytics capabilities on top of it.

Many companies are already making this transition, but they are moving at an agonizingly slow pace because of the massive challenge such a change presents.

Separating Compute and Storage

Separating compute and storage in a cloud environment yields a cloud-native data analytics platform that can perform real-time and near-real-time analysis on both streaming and stored data, while also giving different teams access to their own raw data at any time. The compute, storage, security and networking functions of the on-premises environment are encapsulated by an elastic container running in the cloud, while an intelligent gateway with built-in algorithms ingests each dataset into the cloud and exposes it to users for analysis.
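To make the separation concrete, here is a minimal sketch assuming an S3-compatible object store. The endpoint, credentials and bucket names are placeholders; the point is that the compute cluster can be resized or discarded without touching the data.

```python
from pyspark.sql import SparkSession

# Compute is a disposable Spark cluster; storage is a durable,
# S3-compatible object store addressed over the network.
spark = (
    SparkSession.builder
    .appName("decoupled-analytics")
    .config("spark.hadoop.fs.s3a.endpoint", "https://objectstore.example.com")  # placeholder
    .config("spark.hadoop.fs.s3a.access.key", "ANALYTICS_KEY")                  # placeholder
    .config("spark.hadoop.fs.s3a.secret.key", "ANALYTICS_SECRET")               # placeholder
    .getOrCreate()
)

# Any team can read the same raw data at any time, with its own compute.
raw = spark.read.parquet("s3a://shared-raw-data/clickstream/")
raw.groupBy("country").count().show()

# When the job is done, the compute goes away; the data does not.
spark.stop()
```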

The combination of a modern data warehouse architecture (either in the cloud or on-premises) and real-time analytics enables low-latency access to your data from nearly any device or location. It also allows you to start analyzing your data in near real-time and store it for future analysis, be it batch or offline analytics.

Cloud-native compute containers

Containers are a key part of cloud-native architectures because they enable the rapid deployment of applications without requiring installation, configuration and ongoing maintenance of an operating system.

Deploying containers in production

Once a data analytics workload has been migrated to the cloud, you can start deploying containers for that workload. The container should be tied to your data and placed in such a way that the compute resources are elastic (meaning additional resources can be added or removed) and easily configurable.
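As a minimal sketch, the deployment below uses the official Kubernetes Python client; the image name, namespace, replica count and resource sizes are illustrative assumptions, not a prescription.

```python
from kubernetes import client, config

# Load credentials from the local kubeconfig (illustrative; inside a
# cluster you would use config.load_incluster_config() instead).
config.load_kube_config()

# A containerized analytics worker with explicit, easily adjustable
# resource requests and limits. Image and sizes are placeholders.
container = client.V1Container(
    name="analytics-worker",
    image="registry.example.com/analytics-worker:1.0",
    resources=client.V1ResourceRequirements(
        requests={"cpu": "2", "memory": "4Gi"},
        limits={"cpu": "4", "memory": "8Gi"},
    ),
)

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="analytics-worker"),
    spec=client.V1DeploymentSpec(
        replicas=3,  # elastic: scale this up or down as demand changes
        selector=client.V1LabelSelector(match_labels={"app": "analytics-worker"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "analytics-worker"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

# Deploying into a dedicated namespace keeps the workload isolated
# from other tenants and manageable as an independent service.
client.AppsV1Api().create_namespaced_deployment(namespace="analytics", body=deployment)
```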

In addition, it is recommended to run the compute resources in private containers so that they are isolated from other workloads and can be managed as independent services.

Managing containers

If you deploy your analytics workloads inside containers, you need to manage them. You can use the same container management tools that are used for traditional applications to manage cloud-native assets, but doing so requires a different way of thinking about how those assets are deployed and operated.

A major advantage of using containers is that they are run in isolation, but this advantage is only fully realized if you ensure that the containers are managed with granular resource and service-level policies. This requires tighter integration between container management tools and cloud orchestration tools to enable dynamic scaling of compute resources for each workload based on demand.
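As one concrete example of such a policy, a Kubernetes HorizontalPodAutoscaler can scale a deployment with demand. The sketch below uses the official Kubernetes Python client; the deployment name, replica bounds and CPU target are illustrative assumptions.

```python
from kubernetes import client, config

config.load_kube_config()

# Scale the analytics-worker deployment between 2 and 20 replicas,
# targeting 70% average CPU utilization (illustrative values).
hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="analytics-worker"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="analytics-worker"
        ),
        min_replicas=2,
        max_replicas=20,
        target_cpu_utilization_percentage=70,
    ),
)
client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="analytics", body=hpa
)
```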

The ability to reallocate resources from one workload to another as needed is particularly important in a multi-tenant environment, where you want to prevent colocated workloads from contending for the same resources.
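In Kubernetes terms, one common way to enforce such boundaries is a per-tenant ResourceQuota on each namespace; in the sketch below, the namespace name and limits are illustrative assumptions.

```python
from kubernetes import client, config

config.load_kube_config()

# Cap the total compute a single tenant's namespace can claim, so one
# team's workloads cannot starve another's (illustrative limits).
quota = client.V1ResourceQuota(
    metadata=client.V1ObjectMeta(name="tenant-a-quota"),
    spec=client.V1ResourceQuotaSpec(
        hard={
            "requests.cpu": "16",
            "requests.memory": "64Gi",
            "limits.cpu": "32",
            "limits.memory": "128Gi",
        }
    ),
)
client.CoreV1Api().create_namespaced_resource_quota(namespace="tenant-a", body=quota)
```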

Key Technology Trends In Modernizing Data Analytics Environments

To handle data-intensive workloads, companies are turning to open-source runtimes such as Kubernetes and Apache Spark. They are also increasingly using container technologies such as Docker and Kubernetes to remove the friction of packaging applications for deployment. With recent advances in hybrid cloud, object storage, elastic compute and serverless architectures, customers can now take advantage of these state-of-the-art technologies to modernize their data analytics environments.

  • Deploying cloud native data warehouses

New tools built to move a company’s on-premises data warehouse to the cloud have accelerated the design, build and deployment of cloud data warehouses.

  • Data analytics on an open platform

For the first time, modernized data analytics architectures can be easily extended and managed in a cloud-native environment. This means organizations no longer need to choose between legacy proprietary hardware and software and building their own in-house infrastructure. Providers are also using these technologies to deploy big data solutions that are cloud native by design, so they can run on-premises or as a service on public clouds with high security and reliability.

  • Hybrid cloud and multi-cloud infrastructure

With the rise of hybrid cloud, companies are deploying workloads both on premises and in public clouds. For example, workloads with higher security requirements, or performance-sensitive workloads that need a customized environment with more processing power, can run in a private cloud. Cloud-native technologies like Kubernetes, Docker and Apache Spark help move these workloads between environments, as the sketch after this list illustrates.
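As a minimal illustration of that portability, a Spark session can be pointed directly at a Kubernetes cluster, so the same job runs against an on-premises or a public cloud endpoint. The API server address, container image and namespace below are hypothetical.

```python
from pyspark.sql import SparkSession

# Spark in client mode against a Kubernetes cluster: executors run as pods.
# Only the master URL changes between a private, on-premises cluster and a
# managed cloud one; the image and addresses are placeholders.
spark = (
    SparkSession.builder
    .master("k8s://https://k8s-api.example.com:6443")
    .appName("hybrid-analytics-job")
    .config("spark.kubernetes.container.image", "registry.example.com/spark:3.5.0")
    .config("spark.kubernetes.namespace", "analytics")
    .config("spark.executor.instances", "4")
    .getOrCreate()
)

spark.range(1_000_000).selectExpr("sum(id)").show()
spark.stop()
```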

Creating a Future-Proof Advanced Analytics Environment

A modern data analytics environment leverages an elastic container running in a private cloud to encapsulate compute, storage, networking and security functions of a data warehouse architecture. This results in more agile development and testing cycles as well as faster time-to-production when compared with traditional approaches.

By Ronald van Loon
