Web Availability: Are You Feeling Lucky?

Cloud Availability

I’m a firm believer in having control over anything that can get me fired. So, while the cloud is wonderful for solving all sorts of IT issues, only the bold, the brave, or the career-suicidal place business-critical applications so completely outside their own control.

My company began pushing applications to the cloud around 2004. Today the majority of our applications are cloud-based. Our most important applications, however, stay in-house and run on fault-tolerant servers. I know everything about them … where they are, what platform they are running on, when and how they are maintained, where data are stored, what the current rev levels are for everything that touches them. More importantly, I know what is being done and by whom if the server goes down, which hasn’t happened in years. Thanks to how my platform is architected, I can be reasonably sure when applications will be back up and running. And a problem’s root cause will not be lost to the ether. This is how I sleep well at night.

On the other hand, having a critical application go offline in the cloud is a CIO’s nightmare. The vendor is as vague about the problem as it is about estimating recovery time, saying (or posting to Twitter) only that they are looking into it. Of the thousands or millions of clients the vendor has (think Go Daddy), whose applications come back first and whose come back last? No matter how cleverly you phrase your response when the executive office calls for a status update, the answer still comes across as, “I have no idea what’s going on.”

No worries: you have a failover plan to switch to another location or backup provider. But this being the first time you are doing it for real, critical dependencies or configuration errors surface that were missed in testing. All of this also adds cost and complexity to a solution that was supposed to yield the opposite result.

Why This Is Important

Getting sacked notwithstanding, losing critical applications to downtime is extremely costly, whether they reside in the cloud or in an internal data center. Many may think this is stating the obvious, yet in our experience, corroborated by ample industry research, more than half of all companies make no effort to measure downtime costs. Those that do usually underestimate by a wide margin.

Cost-of-downtime estimates provided by a number of reputable research firms exceed $100,000 per hour for the average company. The biggest cost culprits, of course, are the applications your company relies on most and would want up and running first after an outage. The thought of ceding responsibility for keeping these applications available 24/7 to a third party … whose operations you have no control over, whose key success metric is the lowest possible cost per compute cycle, whose SLAs leave mission-critical applications hanging over the precipice … is anathema.

This is not an indictment of cloud service providers. It is simply the current reality, which will improve with time. Today’s reality is completely acceptable for more enterprise applications than not, as it is at my company. Regrettably, some companies find it acceptable even for critical workloads.

At a recent CIO conference, my conversation with a peer from a very recognizable telecom and electronics company turned to application availability. I was confounded to hear him declare how thrilled he’d be with 99.9% uptime for critical applications, which I believe is the level most cloud providers aspire to and ordinary servers are capable of. If analysts’ downtime cost estimates are anywhere close to reality, 99.9% uptime still allows nearly nine hours of downtime a year, which translates into about $875,000 in cost per year for the average company. This was a Fortune 500 firm.
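To make that arithmetic explicit, here is a minimal back-of-the-envelope sketch. The $100,000-per-hour figure is the analyst estimate cited above; the uptime levels are illustrative assumptions, not any particular provider’s SLA.

    # Back-of-the-envelope annual downtime cost for a given uptime percentage.
    # Assumes a flat cost per hour of downtime; real incidents vary widely.
    HOURS_PER_YEAR = 24 * 365  # 8,760 hours

    def annual_downtime_cost(uptime_pct, cost_per_hour=100_000):
        """Estimated downtime hours and yearly cost at a given uptime percentage."""
        downtime_hours = HOURS_PER_YEAR * (1 - uptime_pct / 100)
        return downtime_hours, downtime_hours * cost_per_hour

    for uptime in (99.9, 99.99, 99.999):
        hours, cost = annual_downtime_cost(uptime)
        print(f"{uptime}% uptime: ~{hours:.1f} hours down/year, ~${cost:,.0f}/year")

At 99.9% that works out to roughly 8.8 hours of downtime and about $876,000 a year, which is the order of magnitude behind the figure above; each extra “nine” cuts the exposure by a factor of ten.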

Determining the total of hard and soft downtime costs is not easy, which is why it’s often not done well, if at all. For example, downtime impact can ripple out to departments and business functions beyond the core area. There may be contractual penalties. News headlines may be written.
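One simplistic but concrete way to approach that tally is to enumerate hard and soft cost categories per incident and sum them. The categories and amounts below are purely illustrative placeholders, not figures from any study or from my own accounting.

    # Illustrative tally of hard and soft downtime costs for a single incident.
    # Every figure is a placeholder; substitute your own per-category estimates.
    hard_costs = {
        "lost_revenue": 250_000,
        "recovery_labor": 40_000,
        "contractual_penalties": 75_000,
    }
    soft_costs = {
        "productivity_loss_other_departments": 60_000,  # ripple effects beyond the core area
        "reputation_and_churn_estimate": 120_000,       # e.g. fallout from negative headlines
    }

    total = sum(hard_costs.values()) + sum(soft_costs.values())
    print(f"Estimated total cost of this outage: ${total:,.0f}")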

Making technology choices without knowing your complete downtime costs is a crapshoot. Making informed ROI decisions is impossible. You may even find that the savings from moving not-so-critical applications to the cloud are inconsequential, as I did with our company’s email system. That will stay in-house. And I will continue to sleep soundly.

By Joe Graves – CIO of Stratus Technologies

Joe was named CIO of Stratus Technologies in 2002. During his tenure, Joe has recreated the Stratus IT environment using innovative approaches such as virtualization and Software-as-a-Service (SaaS). Prior to becoming CIO, he was responsible for managing IS operations and, later, IT application development. Prior to Stratus, Joe held various software engineering positions with Sequoia Systems and Data General.

