Auto Scaling Strategies For Windows Azure, Amazon’s EC2 And Other Cloud Platforms

The strategies discussed in this article can be applied to any cloud platform that has an ability to dynamically provision compute resources, even though I rely on examples from AzureWatch auto scaling and monitoring service for Windows Azure

The topic of auto scaling is an extremely important one when it comes to architecting cloud-based systems.  The major premise of cloud computing is its utility based approach to on-demand provisioning and de-provisioning of resources while paying only for what has been consumed.  It only makes sense to give the matter of dynamic provisioning and auto scaling a great deal of thought when designing your system to live in the cloud.  Implementing a cloud-based application without auto scaling is like installing an air-conditioner without a thermostat: one either needs to constantly monitor and manually adjust the needed temperature or pray that outside temperature never changes.

Many cloud platforms such as Amazon’s EC2 or Windows Azure do not automatically adjust compute power dedicated to applications running on their platforms.  Instead, they rely upon various tools and services to provide dynamic auto scaling.  For applications running in Amazon cloud, auto-scaling is offered by Amazon itself via a service CloudWatch as well as third party vendors such as RightScale.  Windows Azure does not have its own auto scaling engine but third party vendors such as AzureWatch can provide auto scaling and monitoring.

Before deciding on when to scale up and down, it is important to understand when and why changes in demand occur.  In general, demand on your application can vary due to planned or unplanned events.  Thus it is important to initially divide your scaling strategies into these two broad categories: Predictable and Unpredictable demand.

The goal of this article is to describe scaling strategies that gracefully handle unplanned and planned spikes in demand.  I’ll use AzureWatch to demonstrate specific examples of how these strategies can be implemented in Windows Azure environment.  Important note: even though this article will mostly talk about scale up techniques, do not forget to think about matching scale down techniques. In some cases, it may help to think about building an auto scaling strategy in a way similar to building a thermostat.

Unpredictable demand

Conceptualizing use-cases of rarely occurring unplanned spikes in demand is rather straight forward.  Demand on your app may suddenly increase due to a number of various causes, such as:

  • an article about your website was published on a popular website
  • CEO of your company just ordered a number of complex reports before a big meeting with shareholders
  • your marketing department just ran a successful ad campaign and forgot to tell you about the possible influx of new users
  • a large overseas customer signed up overnight and started consuming a lot of resources

Whatever the case may be, having an insurance policy that deals with such unplanned spikes in demand is not just smart.  It may help save your reputation and reputation of your company.  However, gracefully handling unplanned spikes in demand can be difficult.  This is because you are reacting to events that have already happened.  There are two recommended ways of handling unplanned spikes:

Strategy 1: React to unpredictable demand

When utilization metrics are indicating high load, simply react by scaling up.  Such utilization metrics can usually include CPU utilization, amount of requests per second, number of concurrent users, amounts of bytes transferred, or amount of memory used by your application.  In AzureWatch you can configure scaling rules that aggregate such metrics over some amount of time and across all servers in the application role and then issue a scale up command when some set of averaged metrics is above a certain threshold.  In cases when multiple metrics indicate change in demand, it may also be a good idea to find a “common scaling unit”, that would unify all relevant metrics together into one number.

Strategy 2: React to rate of change in unpredictable demand

Since scale-up and scale-down events take some time to execute, it may be better to interrogate the rate of increase or decrease of demand and start scaling ahead of time: when moving averages indicate acceleration or deceleration of demand.  As an example, in AzureWatch’s rule-based scaling engine, such event can be represented by a rule that interrogates Average CPU utilization over a short period of time in contrast to CPU utilization over a longer period of time

(Fig: scale up when Average CPU utilization for the last 20 minutes is 20% higher than Average CPU utilization over the last hour and Average CPU utilization is already significant by being over 50%)

Also, it is important to keep in mind that scaling events with this approach may trigger at times when it is not really needed: high rate of increase will not always manifest itself in the actual demand that justifies scaling up.  However, in many instances it may be worth it to be on the safe side rather than on the cheap side.

Predictable demand

While reacting to changes in demand may be a decent insurance policy for websites with potential for unpredictable bursts in traffic, actually knowing when demand is going to be needed before it is really needed is the best way to handle auto scaling.  There are two very different ways to predict an increase or decrease in load on your application.  One way follows a pattern of demand based on historical performance and is usually schedule-based, while another is based on some sort of a “processing queue”.

Strategy 3: Predictable demand based on time of day

There are frequently situations when load on the application is known ahead of time.  Perhaps it is between 7am and 7pm when a line-of-business (LOB) application is accessed by employees of a company, or perhaps it is during lunch and dinner times for an application that processes restaurant orders.  Whichever it may be, the more you know at what times during the day the demand will spike, the better off your scaling strategy will be.  AzureWatch handles this by allowing to specify scheduling aspects into execution of scaling rules.

Strategy 4: Predictable demand based on amount of work left to do

While schedule-based demand predictions are great if they exist, not all applications have consistent times of day when demand changes.  If your application utilizes some sort of a job-scheduling approach where the load on the application can be determined by the amount of jobs waiting to be processed, setting up scaling rules based on such metric may work best.  Benefits of asynchronous or batch job execution where heavy-duty processing is off-loaded to back-end servers can not only provide responsiveness and scalability to your application but also the amount of waiting-to-be-processed job can serve as an important leading metric in the ability to scale with better precision.  In Windows Azure, the preferred supported job-scheduling mechanism is via Queues based on Azure Storage.  AzureWatch provides an ability to create scaling rules based on the amount of messages waiting to be processed in such a queue.  For those not using Azure Queues, AzureWatch can also read custom metrics through a special XML-based interface.

Combining strategies

In the real world, implementing a combination of more than one of the above scaling strategies may be prudent.  Application administrators likely have some known patterns for their applications behavior that would define predictable bursting scenarios, but having an insurance policy that would handle unplanned bursts of demand may be important as well.  Understanding your demand and aligning scaling rules to work together is key to successful auto scaling implementation.

By Igor Papirov /Paraleap Technologies

About CloudTweaks

Established in 2009, CloudTweaks is recognized as one of the leading authorities in connected technology information and services.

We embrace and instill thought leadership insights, relevant and timely news related stories, unbiased benchmark reporting as well as offer green/cleantech learning and consultive services around the world.

Our vision is to create awareness and to help find innovative ways to connect our planet in a positive eco-friendly manner.

In the meantime, you may connect with CloudTweaks by following and sharing our resources.

View All Articles

Sorry, comments are closed for this post.

Comic
When Sci-Fi Predictions Come To Fruition

When Sci-Fi Predictions Come To Fruition

Evolution of Technologies To paraphrase science fiction author Arthur C. Clark, those who make predictions about the future are either “considered conservative now and mocked later, or mocked now and proved right when they are no longer around to enjoy the acclaim.” The one thing we can be sure about, Clark ventured, is that “[the…

Facebook Hopes To Extend Internet Connectivity With Solar-Powered Drones

Facebook Hopes To Extend Internet Connectivity With Solar-Powered Drones

Facebook Inc (FB.O) said on Thursday it had completed a successful test flight of a solar-powered drone that it hopes will help it extend internet connectivity to every corner of the planet. Aquila, Facebook’s lightweight, high-altitude aircraft, flew at a few thousand feet for 96 minutes in Yuma, Arizona, Chief Executive Mark Zuckerberg wrote in…

When Will Women In Tech Become The Norm?

When Will Women In Tech Become The Norm?

Tech Diversity It is well known that the technology industry has been dominated by men, but it is also clear that the industry is working to change that. Diversity in the tech industry, especially where it applies to women in tech, has been a topic of discussion for years. Recently the Washington Technology Industry Association…

Four Keys For Telecoms Competing In A Digital World

Four Keys For Telecoms Competing In A Digital World

Competing in a Digital World Telecoms, otherwise largely known as Communications Service Providers (CSPs), have traditionally made the lion’s share of their revenue from providing pipes and infrastructure. Now CSPs face increased competition, not so much from each other, but with digital service providers (DSPs) like Netflix, Google, Amazon, Facebook, and Apple, all of whom…

Edtech and Virtual Reality – Exciting Learning Environment

Edtech and Virtual Reality – Exciting Learning Environment

Customizing Edutech Customized edtech learning solutions are becoming more commonplace as the education industry recognises their potential and begins transforming the traditional structures so as to incorporate innovative developments. From textbooks to tablets, chalkboards to virtual reality, edtech promises not only dynamic and exciting learning environments but better learning strategies and solutions. Virtual Reality and…

Four Recurring Revenue Imperatives

Four Recurring Revenue Imperatives

Revenue Imperatives “Follow the money” is always a good piece of advice, but in today’s recurring revenue-driven market, “follow the customer” may be more powerful. Two recurring revenue imperatives highlight the importance of responding to, and cherishing customer interactions. Technology and competitive advantage influence the final two. If you’re part of the movement towards recurring…

Data Breaches: Incident Response Planning – Part 1

Data Breaches: Incident Response Planning – Part 1

Incident Response Planning – Part 1 The topic of cybersecurity has become part of the boardroom agendas in the last couple of years, and not surprisingly — these days, it’s almost impossible to read news headlines without noticing yet another story about a data breach. As cybersecurity shifts from being a strictly IT issue to…

Adopting A Cohesive GRC Mindset For Cloud Security

Adopting A Cohesive GRC Mindset For Cloud Security

Cloud Security Mindset Businesses are becoming wise to the compelling benefits of cloud computing. When adopting cloud, they need a high level of confidence in how it will be risk-managed and controlled, to preserve the security of their information and integrity of their operations. Cloud implementation is sometimes built up over time in a business,…

5 Ways To Ensure Your Cloud Solution Is Always Operational

5 Ways To Ensure Your Cloud Solution Is Always Operational

Ensure Your Cloud Is Always Operational We have become so accustomed to being online that we take for granted the technological advances that enable us to have instant access to everything and anything on the internet, wherever we are. In fact, it would likely be a little disconcerting if we really mapped out all that…

Your Biggest Data Security Threat Could Be….

Your Biggest Data Security Threat Could Be….

Paying Attention To Data Security Your biggest data security threat could be sitting next to you… Data security is a big concern for businesses. The repercussions of a data security breach ranges from embarrassment, to costly lawsuits and clean-up jobs – particularly when confidential client information is involved. But although more and more businesses are…

Disaster Recovery And The Cloud

Disaster Recovery And The Cloud

Disaster Recovery And The Cloud One of the least considered benefits of cloud computing in the average small or mid-sized business manager’s mind is the aspect of disaster recovery. Part of the reason for this is that so few small and mid-size businesses have ever contemplated the impact of a major disaster on their IT…

Utilizing Digital Marketing Techniques Via The Cloud

Utilizing Digital Marketing Techniques Via The Cloud

Digital Marketing Trends In the past, trends in the exceptionally fast-paced digital marketing arena have been quickly adopted or abandoned, keeping marketers and consumers on their toes. 2016 promises a similarly expeditious temperament, with a few new digital marketing offerings taking center stage. According to Gartner’s recent research into Digital Marketing Hubs, brands plan to…

Cloud Infographic – The Data Scientist

Cloud Infographic – The Data Scientist

Data Scientist Report The amount of data in our world has been exploding in recent years. Managing big data has become an integral part of many businesses, generating billions of dollars of competitive innovations, productivity and job growth. Forecasting where the big data industry is going has become vital to corporate strategy. Enter the Data…

Infographic Introduction – Benefits of Cloud Computing

Infographic Introduction – Benefits of Cloud Computing

Benefits of Cloud Computing Based on Aberdeen Group’s Computer Intelligence Dataset, there are more than 1.6 billion permutations to choose from when it comes to cloud computing solutions. So what, on the face of it, appears to be pretty simple is actually both complex and dynamic regardless of whether you’re in the market for networking,…

Disaster Recovery – A Thing Of The Past!

Disaster Recovery – A Thing Of The Past!

Disaster Recovery  Ok, ok – I understand most of you are saying disaster recovery (DR) is still a critical aspect of running any type of operations. After all – we need to secure our future operations in case of disaster. Sure – that is still the case but things are changing – fast. There are…

Using Big Data To Make Cities Smarter

Using Big Data To Make Cities Smarter

Using Big Data To Make Cities Smarter The city of the future is impeccably documented. Sensors are used to measure air quality, traffic patterns, and crowd movement. Emerging neighborhoods are quickly recognized, public safety threats are found via social networks, and emergencies are dealt with quicklier. Crowdsourcing reduces commuting times, provides people with better transportation…