AWS Outage – Ground-Hog Day Meets Murphy’s Law; You Guys Should Get A Room!

AWS Outage – Ground-Hog Day Meets Murphy’s Law; You Guys Should Get A Room!

AWS Outage – Ground-Hog Day Meets Murphy’s Law; You Guys Should Get A Room!

So, here we go again – I’ve said it once, so I’ll say it again. It gives me no pleasure to write another blog post about AWS suffering another outage in their West Virginia Zone. This because of a couple of reasons: First, the publicity – industry analysts and commentators are divided into two camps, some taking the view that AWS is slightly unfit for the purpose (Barb Darrow from Gigaom: “Cloud outage raises more questions about Amazon Cloud” ) and others taking the more pragmatic view that Instagram, Netflix, etc. (AWS customers) could have been more proactive in protecting themselves against their host going offline (Ingrid Lunden from TechCrunch: “Could Instagram and other sites avoid going down with Amazons Ship”)

The twitter feed kicked in on Saturday morning after the outage on Friday night, and when I saw the first few tweets coming in, I thought it was just people catching up and re-posting regarding the previous AWS issue only two weeks before. But no; it was groundhog day all over again – storm hits; power cut; generators didn’t work; elastic cloud falls over; sites go down!

Checking the hash tags for #Netflix, #Instagram, #AWS and #AWSoutage, I saw all the expected reactions – AWS customers posting stuff like this: Instagram

Nearly 28,000 re-tweets, and similar for NetFlix, Pinterest, and Heroku.

The publicity for all of these companies is clearly not good – consumers don’t care or even know what a “host” is. Unless you work in IT, why would you care or want to know? To a consumer, the service they either pay for (Netflix) or use on an hourly basis (Instagram) just doesn’t work, and that type of damage is difficult to undo.

The second reason is that it just gives more fuel to the “I told you so” cloud naysayers. You can just hear old-school CIOs whispering to fearful CEOs all over the world, “the cloud is not ready for us, and we’re certainly not ready for it!”

But I feel I’m repeating myself a bit from my last post, so let’s move on and take an alternate view, one which I subscribe to, and one that infrastructure teams at Netflix et al. would do well to explore.

Putting all your eggs in one basket is clearly a strategy that is both good and bad; good, because you get to be a big customer of a provider; you get economies of scale, better pricing, and someone should pick the phone up when you call, etc., etc., but bad, because you give away some control. When AWS went down, it is clear that many infrastructure teams at customer sites who may have engineered their application to be redundant inside their host didn’t take into account the unthinkable – what happens if the host goes down?

As Michael Lee from ZDNet pointed out in a post on 2 July, quoting Intelligent Business Research Services advisor Jorn Bettin, the blame for the outage may have lain with providers failing to utilise cloud services as they should.

He said that the real issue wasn’t that such a huge cloud-services giant such as Amazon had stumbled over a storm, but that the affected customers – Instagram, Pinterest, Pocket and Netflix (which all suffered from Amazon’s recent outage on the weekend) – hadn’t used the ability of the cloud to create geographically redundant links.

“They could operate at a higher level of redundancy, so that these sort of outages would only have a minimal impact on them. It’s a matter of cost,” Bettin said.

This is the most sensible article I’ve read about the AWS outage issue thus far. Having one provider manage your entire infrastructure without a DR/Back-up strategy with another cloud provider is just commercial madness.

Now, I understand there is a cost element here – the cost of replicating some or all of your infrastructure to spin up when a disaster happens is expensive, isn’t it?

Well, yes and no.

Yes, it’s going to add some level of cost, but what you gain from that is control. You, the System Admin from Pintflixogram, get control to the extent that if your primary host goes down, you get to fire up another, secondary host and maintain your service. Let’s remember AWS is not the only hosting company on the planet. Although they may be perceived as such by many, but in fact there are plenty of regional outfits in the market that are not as cheap as AWS. But guess what – they don’t go down.

On the other hand, if you balance the reputational risk, the customer support calls you have to field, the tickets raised, the PR damage limitation exercise and, finally, the churn as your customer base leaves for your competitor, then no, it’s not expensive.

Companies seem to forget that the quality of hosting service you use is the public perception of your company. You can have the coolest website, the best marketing machine, an awesome product or service, but it all counts for nothing when your customer see’s this:

I can only imagine the frustration and sense of helplessness that the PR folks and the system admins felt, as there is literally nothing they can do to get their service up online until their host tells them they are back up online.

But if they had explored a strategy whereby the client had the control instead of the host, then it could have been service as normal.

AWS are getting hammered, which is understandable from a certain perspective – clients frustrated that their site has gone down; everyone in the space commenting that this shouldn’t happen – but really, the larger clients of AWS who could and should have explored “Redundancy across Regions” (RaR) strategies only have themselves to blame. There is not an industry on the planet that does not have some kind of back-up plan to maintain their core business in the event of a natural disaster, be it as simple as work from home, or a complete replication of their business environment somewhere else.

It’s clear here that some companies had no such plan and just blamed their host, when in fact, if you look at the big picture, it was their own fault. Murphy’s Law exists for a reason, and there are lessons to be learned here.

It’s very simple: You pay for what you get; you pay for greater control and security so that in the event of something bad happening, you don’t have a ground-hog day, and you also beat Murphy at his own game.

Guest Post By Jason Currill

Jason Currill is a seasoned executive with over 20 yearsʼ international sales and sales leadership experience in investment banking and information technology. In 2011 he founded Ospero, a global Infrastructure as a Service (IaaS) company. Prior to founding Ospero, he held leadership positions at Cisco Systems, Business Objects (an SAP company) and NetSuite, running both EMEA and NA theaters. In addition,  Jason spent 10 years as a leading Futures Trader in the London International Financial Futures Exchange (LIFFE) for SG Warburg, Nomura and ING Bank.

Follow Us!

CloudTweaks

Established in 2009, CloudTweaks.com is recognized as one of the leading authorities in cloud computing information. Most of the excellent CloudTweaks articles are provided by our own paid writers, with a small percentage provided by guest authors from around the globe, including CEOs, CIOs, Technology bloggers and Cloud enthusiasts. Our goal is to continue to build a growing community offering the best in-depth articles, interviews, event listings, whitepapers, infographics and much more...
Follow Us!

Sorry, comments are closed for this post.

Join Our Newsletter

Receive updates each week on news, tips, events, comics and much more...

Can I Contribute To CloudTweaks?

Yes, much of our focus in 2015 will be on working with other influencers in a collaborative manner. If you're a technology influencer looking to collaborate long term with CloudTweaks – a globally recognized leader in cloud computing information – drop us an email with “tech influencer” in the subject line.

Please review the guidelines before applying.

Contributors

Cloud Infographic – Wearable Tech And Preventative Healthcare

Cloud Infographic – Wearable Tech And Preventative Healthcare

Wearable Tech And Preventative Healthcare There are so many exciting new opportunities available to utilize wearable technology in the future.  Areas such as nanotechnology disease monitoring, crowdfunding to wearable accessories are some excellent examples of the potential. Estimates vary, but appear to suggest that the market will produce between $14-50 Billion over the next few years. Included below

Ten Tips For Successful Business Intelligence Implementation

Ten Tips For Successful Business Intelligence Implementation

Ten Tips for Successful Business Intelligence Implementation The cost of Business Intelligence (BI) software goes far beyond the purchase price. Time spent researching, implementing, and maintaining your BI investment can snowball quickly and mistakes are often expensive. Your time is valuable – save it by learning from other businesses’ experiences. We’ve compiled the top ten

Knots And Cloud Service Providers

Knots And Cloud Service Providers

How Do These Two Compare? In Boy Scouts, I learned how to tie knots. The quickest knot you can tie is the slipknot. It’s very effective for connecting one thing to another via the rope you have. It was used in setting up tents, mooring boats to docks temporarily and lifting your food up into

Aggregated News

Popular News Sources

Storage Considerations for SharePoint Backups

Storage Considerations for SharePoint Backups

Storage Considerations for SharePoint Backups Wednesday, October 29, 2014 @ 9:00 am/12:00pm ET. Backup and Restore of a SharePoint environment can be a complex endeavor as the product consists of multiple components running at various tiers, each with their own backup and restore requirements. In addition, SharePoint documents are stored as Binary Large Objects (BLOBs) in

OpenDNS Deployment Leads to Twenty-Fold Decrease in Malware Infections at Hamamatsu

OpenDNS Deployment Leads to Twenty-Fold Decrease in Malware Infections at Hamamatsu

Decreases in Malware Infections at Hamamatsu OpenDNS, a leading provider of cloud-delivered security, today announced that it has enabled Hamamatsu, a Japanese manufacturer of optical sensor technologies, to virtually eliminate malware infections across its U.S. Read the source article at Finance News About Latest Posts Follow Us!CloudTweaksEstablished in 2009, CloudTweaks.com is recognized as one of the

IBM and Microsoft – What Are They Doing With The Hybrid Cloud?

IBM and Microsoft – What Are They Doing With The Hybrid Cloud?

What Are They Doing With The Hybrid Cloud? “Microsoft is committed to helping enterprise customers realize the tremendous benefits of cloud computing across their own systems, partner clouds and Microsoft Azure,” said Scott Guthrie, executive vice president,Cloud and Enterprise, Microsoft. “With this … Read the source article at CNNMoney About Latest Posts Follow Us!CloudTweaksEstablished in 2009, CloudTweaks.com is recognized