
Mitigating the Downtime Risks of Virtualization


Nearly every IT professional dreads unplanned downtime. Depending on which systems are hit, it can mean angry communications from employees and the C-suite and often a Twitterstorm of customer ire. See the recent Samsung SmartThings dustup for an example of how much trust can be lost in just one day.

Gartner pegs the financial cost of downtime at $5,600 per minute, or over $300,000 per hour. And a survey by IHS found enterprises experience an average of five downtime events each year, resulting in losses of $1 million for a midsize company to $60 million or more for a large corporation. In addition, the time spent recovering can leave businesses with an “innovation gap,” an inability to redirect resources from maintenance tasks to strategic projects.

The quest for downtime-minimizing technologies remains hot, especially as demand for high-availability IT has grown. Where “four nines” (99.99%) uptime might once have sufficed, five nines or six nines is now expected.
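
To put those targets in perspective, here is a quick back-of-the-envelope calculation (a minimal Python sketch, purely illustrative) of how much downtime each “nines” level actually allows in a year:

```python
# Downtime budgets implied by common availability targets.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

targets = {
    "four nines (99.99%)": 0.9999,
    "five nines (99.999%)": 0.99999,
    "six nines (99.9999%)": 0.999999,
}

for label, availability in targets.items():
    allowed_minutes = MINUTES_PER_YEAR * (1 - availability)
    print(f"{label}: ~{allowed_minutes:.1f} minutes of downtime per year")

# four nines  -> ~52.6 minutes per year
# five nines  -> ~5.3 minutes per year
# six nines   -> ~0.5 minutes per year (roughly 32 seconds)
```

Moving from four nines to six nines shrinks the annual downtime budget from about an hour to about half a minute, which is why tolerance for any single outage has all but disappeared.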

Enter server virtualization, the technology that enables administrators to partition servers, increase utilization rates, and spread workloads across multiple physical devices. It’s powerful and increasingly popular, but it can be a mixed blessing when it comes to downtime.


Virtualization Minimizes Some Causes, Exacerbates Some Impacts of Downtime

Virtualization is no panacea, but that’s not a call to reconsider industry enthusiasm for it. Doing so would be unproductive anyway. The data center virtualization market, already worth $3.75 billion in 2017, is expected to grow to $8.06 billion by 2022. For good reason. Virtualization has many advantages, some of them downtime-related. For example, it’s easier to employ continuous server mirroring for more seamless backup and recovery.

These benefits are well documented by virtualization technology vendors like VMware and in the IT literature generally. Less frequently discussed are the compromises enterprises make with virtualization, which often boil down to an “all eggs in one basket” problem.

What used to be discrete workloads running on multiple, separate physical servers can, in a virtualized environment, be consolidated onto a single server. The combination of server and hypervisor then becomes a single point of failure, which can have an outsized impact on operations for many reasons.

Increased utilization

First of all, today’s virtualized servers are doing more work. According to a McKinsey & Company report, utilization rates in non-virtualized equipment were mired at 6% to 12%, and Gartner research had similar findings. Virtualization can drive that figure up to 30% or 50% and sometimes higher. Even back-of-the-napkin math shows any server outage has several times the impact of yesteryear, simply because there is more compute happening within any given box.
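
As a rough illustration of that napkin math (the utilization figures below are assumptions drawn from the ranges cited above, not data from either report):

```python
# Illustrative only: the work disrupted by a single server outage scales
# roughly with how busy that server is.
non_virtualized_utilization = 0.10  # within the 6-12% range cited above
virtualized_utilization = 0.40      # within the 30-50%+ range cited above

impact_multiplier = virtualized_utilization / non_virtualized_utilization
print(f"One outage now disrupts roughly {impact_multiplier:.0f}x as much work.")
# -> One outage now disrupts roughly 4x as much work.
```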

Diverse customer consequences

Prior to virtualization, co-location customers, among others, demanded dedicated servers to handle their workloads. Although some still do, the cloud has increased comfort with sharing physical resources by using virtual machines (VMs). Now a single server with virtual partitions could be a resource for dozens of clients, vastly expanding the business impact of downtime. Instead of talking to one irate individual demanding a refund, customer service representatives could be getting emails, tweets, and calls from every corner.

This holds true for on-premises equipment as well. The loss of a single server could as easily affect the accounting systems the finance department relies on, the CRM system the sales team needs, and resources various customer-facing applications demand, all at the same time. It’s a recipe for help desk meltdown.

Added complexity

According to CIO Magazine, many virtualization projects “have shifted rather than eliminated complexity in the data center.” In fact, of the 16 outages per year their survey respondents reported, 11 were caused by system failure resulting from complexity. And the more complex the environment, the more difficult the troubleshooting process can be, which can lead to longer, more harmful downtime experiences.

Thin client

Thin clients are not a direct result of virtualization, but they reflect yet another swing of the centralization-versus-decentralization pendulum. After years of powerful PCs loaded with local applications, we have entered an age of mobile, browser-based, and other very thin client solutions. In many cases, the client does little but collect bits of data for processing elsewhere. Not much can happen at the device level if the cloud-based or other computing resources are unavailable. The slightest problem can result in mounting user frustration as apps crash and error messages are returned.

In summary, the data center of 2018 houses servers that are doing more, for more internal and external customers. At the same time, growing complexity is adding downtime risk, with problems that can be more difficult to solve and outages that can therefore run longer. Although effective failover, backup, and recovery processes can help mitigate the combined effects, these tactics alone are not enough.

Additional Solutions for Minimizing Server Downtime

It may sound old school, but data center managers need to stay focused on IT equipment. Equipment failures account for 40% of all reported downtime. Compare that figure with the 25% caused by human error, whether by internal staff or service providers, and the 10% caused by cyberattacks. To have the greatest positive effect on uptime, hardware should obviously be the first target.

There are several recommendations data center managers should implement, if they haven’t already done so:

  • Perform routine maintenance regularly. It should go without saying but often doesn’t. Install recommended patches, check for physical issues like airflow blockages, and heed all alerts and warnings. Maintenance may be routine work, but it is no less essential. That means training employees, scheduling tasks, and tracking completion. If maintenance can’t happen on time, all the time, seek outside assistance to get it done so available internal resources can focus on strategic projects and those unavoidable fire drills without leaving systems in jeopardy.
  • Monitor your resources. The first you hear of an outage should never be from a customer. Full-time, 24/7 systems monitoring is a must for any enterprise (a minimal health-check sketch follows this list). Fortunately, there are new, AI-driven technologies combining monitoring with advanced predictive maintenance capabilities for immediate fault detection and integrated, quick-turnaround response. Access is less expensive than you might think.
  • Upgrade your break/fix plan. A disorganized parts closet or an eBay strategy won’t work. Rapid access to spares is vital in getting systems back online without delay. Especially for mission critical systems, station repair kits on site or work with a vendor who can do so and/or deliver spares within hours.
  • Invest in expertise. Parts are only part of the equation. There is significant skill involved in troubleshooting systems in these increasingly complex data center environments. The current IT skills gap may necessitate looking outside the enterprise to complement existing engineering capabilities with those of a third-party provider.
  • Test everything. Data centers evolve, but conducting proof-of-principle testing on each workload before any changes are made will cut down on virtualization problems before they happen. By the same token, systems recovery and DR scenarios are unknowns unless they are real-world verified. Try pulling a power cord and see what happens. Does that idea give you pause? It might be time for some enhancements.
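
For the monitoring recommendation above, the core idea can be boiled down to a scheduled health check that raises an alert the moment a service stops responding. The sketch below is a hypothetical, minimal example (the endpoint URL, polling interval, and alert hook are placeholders; a production setup would rely on a dedicated monitoring or AIOps platform rather than a script like this):

```python
import time
import urllib.error
import urllib.request

CHECK_URL = "https://example.internal/health"  # placeholder endpoint
CHECK_INTERVAL_SECONDS = 60

def service_is_up(url: str, timeout: int = 5) -> bool:
    """Return True if the endpoint answers HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False

def raise_alert(message: str) -> None:
    """Stand-in for paging, ticketing, or chat-ops integration."""
    print(f"ALERT: {message}")

if __name__ == "__main__":
    while True:
        if not service_is_up(CHECK_URL):
            raise_alert(f"{CHECK_URL} failed its health check")
        time.sleep(CHECK_INTERVAL_SECONDS)
```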

There is good news for IT organizations already overwhelmed by demands to maintain more complex environments, execute the digital transformation, and achieve it all with fewer resources and less money, in a tight labor market to boot. Alternatives exist.

Third-party maintenance providers can take on a substantial portion of the equipment-related upkeep, troubleshooting, and support tasks in any data center. With a premium provider on board, it’s possible to radically reduce downtime and reach the availability and reliability goals you’d hoped to achieve when you took the virtualization path in the first place.

By Paul Mercina

Paul Mercina

Director of Product Management

Paul Mercina brings over 20 years of experience in IT center project management to Park Place Technologies, where he has been a catalyst for shaping the evolutionary strategies of Park Place’s offering, tapping key industry insights and identifying customer pain points to help deliver the best possible service. A true visionary, Paul is currently implementing systems that will allow Park Place to grow and diversify their product offering based on customer needs for years to come.

His work is informed by more than a decade at Diebold Nixdorf, where he worked closely with software development teams to introduce new service design, supporting implementation of direct operations in a number of countries across the Americas, Asia and Europe that led to millions of dollars in cost savings for the company.

Mercina shares his technology and business expertise as an adjunct professor at Walsh University’s DeVille School of Business, where he instructs courses on business negotiations, business and project management, and marketing.

