Business Continuity: Let Us Plan For A Cloud Failure

Cloud proponents keep saying that the cloud is the only safe place left for our businesses and other IT systems, while the naysayers argue the exact opposite, citing concerns about privacy, security, and even data integrity. But no matter which side you are on, nothing is foolproof, and everything will eventually fail in one way or another. Even with business continuity plans and geographically scattered backup systems in place, there could always be one big problem in the future that takes all of them down simultaneously.

But that scenario is quite unlikely unless we are talking about a disaster on a global scale. Yes, cloud computing may fail, but not all of it, and not all at once. This is the beauty of cloud computing: its disjointedness, its seemingly random choice of server locations, and of course the sheer number of servers when you combine those of different service providers. Hence, the first step toward a real business continuity plan is to plan for cloud failures, not just local failures.

The key is to design your IT infrastructure around the idea that one of the servers hosting your applications WILL go down at some point, and then to find a solution for that. A simple one is to scatter those servers around the globe, using either several providers or a single provider that lets you control which servers your services run on. Make sure that codependent systems and subsystems can act independently to some extent. For example, if a certain function is down, another function should take its place and act as a backup with reduced functionality, rather than letting the whole system go down. We can now launch servers anywhere in the world from a laptop or even a smartphone, and have them run for a few cents an hour. There are so many options out there that business continuity has reached a level of affordability that was unimaginable just a few years ago.
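The idea of a backup function taking over with reduced functionality can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `ServiceUnavailable`, `call_with_fallback`, `primary_region` and `secondary_region` are hypothetical, and the two "regions" are plain functions standing in for servers hosted in different locations or with different providers.

```python
from typing import Callable, Sequence


class ServiceUnavailable(Exception):
    """Raised by a backend that cannot serve the request (hypothetical)."""


def call_with_fallback(
    backends: Sequence[Callable[[str], str]],
    request: str,
    degraded_response: str,
) -> str:
    """Try each backend in order and return the first successful answer.

    If every backend is down, return a degraded response instead of
    letting the failure take the whole system down.
    """
    for backend in backends:
        try:
            return backend(request)
        except ServiceUnavailable:
            continue  # this backend is down, try the next one
    return degraded_response


def primary_region(request: str) -> str:
    # Stand-in for a server that has gone down.
    raise ServiceUnavailable("primary region is down")


def secondary_region(request: str) -> str:
    # Stand-in for a healthy server in another location.
    return f"secondary handled {request}"


if __name__ == "__main__":
    print(call_with_fallback([primary_region, secondary_region],
                             "order-1", "cached copy"))
```

In this sketch the order of the backends decides which server is preferred, and the degraded response (a cached copy, for instance) is the last resort when no server answers at all.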

One very good way to test the overall resilience of a system is to randomly shut off a part of it and see whether the whole still works. This also gives developers a way to test the interconnectivity of the components and the integrity of the business continuity solution. Netflix, an avid user of Amazon Web Services (AWS), calls its version of this “random kill” system the “Chaos Monkey” because of the unpredictability of which service will be shut off next. It is a very good test case, since a movie and video streaming service is a high-load system that is in constant demand.
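A "random kill" test can be tried out on a toy model before it is pointed at real servers. The sketch below is not Netflix's Chaos Monkey and makes no claim about how that tool works; the `Cluster` class and the `random_kill_test` function are hypothetical names, and "killing" an instance here only flips a flag in memory.

```python
import random


class Cluster:
    """Toy model of a service running on several named instances."""

    def __init__(self, names):
        self.healthy = {name: True for name in names}

    def kill(self, name):
        self.healthy[name] = False

    def restore(self, name):
        self.healthy[name] = True

    def serve(self):
        """The service answers as long as one instance is still healthy."""
        return any(self.healthy.values())


def random_kill_test(cluster, rounds, rng):
    """Kill one randomly chosen instance per round and check the service.

    Returns True if the service kept answering in every round.
    """
    for _ in range(rounds):
        victim = rng.choice(sorted(cluster.healthy))
        cluster.kill(victim)
        survived = cluster.serve()
        cluster.restore(victim)  # bring the instance back for the next round
        if not survived:
            return False
    return True


if __name__ == "__main__":
    cluster = Cluster(["us-east", "eu-west", "ap-south"])
    print(random_kill_test(cluster, rounds=20, rng=random.Random()))
```

A cluster with a single instance fails this test on the first round, which is exactly the kind of single point of failure the exercise is meant to expose.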

By Abdul Salam

Abdul Salam is an IT professional and an accomplished technical writer with CloudTweaks. He earned his undergraduate degree in Information Technology, followed by a postgraduate degree in Business Informatics. Abdul has over 3 years’ experience in technical and business writing, with deep knowledge of Cloud Computing, VMware, Oracle, Oracle ERP, Cloud ERP, Microsoft Technologies and Network Communications (Cisco, Juniper). Visit his LinkedIn profile at: http://linkd.in/TtFu7X
