The Hypothermic Cloud Infrastructure: Maintaining the Blood Flow to Tier 1 & 2 Apps

A particular sore spot since the rise of cloud infrastructures, even since the advent of utility computing, is this: how do you forecast, pay for, and justify the cost of resources for problems you don’t have yet? After all, the entire premise behind funding the acquisition of compute resources is that you are solving an already identified need (problem), to which you then attach a cost and an ROI in order to convince management to cut a check. That model doesn’t quite work in a cloud environment because the funding request is usually forward-looking (solving tomorrow’s need), so ROI is difficult if not impossible to predict. Here at GreenPages, the more we talk to customers about cloud and cloud technologies, and as those technologies evolve to ever greater levels of sophistication and capability, the more we’re finding interesting and innovative ways to help solve exactly that.

One example/issue that I’ve been thinking about lately is what I’m calling the “Hypothermic Cloud Infrastructure.”  Hypothermia, as you may know, is a condition in which a human’s core body temperature drops below that required for normal metabolism and body functions, generally defined as 35.0 °C (95.0 °F).  When a human body is exposed to frigid temperatures, say from falling into freezing water, the body’s temperature starts to drop in proportion to the length of exposure.  Below a certain internal temperature, in an attempt to preserve life, the autonomic functions of the brain start to restrict the flow of blood to the extremities in order to maintain the core temperature.  As the core gets colder and colder, less and less blood is allowed to flow; in short, the body is prepared, and quite willing, to lose arms and legs to frostbite in order to save the person’s life by keeping the core temperature above a certain threshold.

In the Hypothermic Cloud Infrastructure, we equate the datacenter infrastructure to a human body in the sense that there are “core” applications and there are “extremity” applications that together make up the whole human, er, um, infrastructure. We define core as those applications (i.e. Tier 1 or 2) that are extremely business critical and have the highest level of attention; if they were to become unavailable, truly awful things would happen, the least of which is that a great amount of money might be lost, the most being that several jobs are toast. Conversely, extremity applications might be called Tier 5 or 6 or even Tier 10 and, while they are important to the business (otherwise why have them at all), they are used infrequently by only a few people or for some reason are deprecated within the tiered infrastructure.  These applications can disappear for months at a time and effectively no one notices…until those applications are needed, of course.

Blood Flow

To create the Hypothermic Cloud Infrastructure, there has to be a way to organize, manage and control the “blood flow” (the underlying resources supplied to all of the applications within the infrastructure: compute, memory, storage and networking) and to maintain each application’s status, i.e. whether it is a “core” or an “extremity” application.  The facility that does this organization, management and control is a highly performant policy engine (the autonomic brain function) integrated directly with the automation and orchestration engine within the Hypothermic Cloud Infrastructure. Among the many things this policy engine does, it deeply understands every relevant application’s priority and makes policy decisions to produce the best possible outcome: the most important applications never suffer a degradation or an outage, while the least important are minimally taken care of.  So, what’s a policy decision?  To answer that, let me give an example of the Hypothermic Cloud Infrastructure in action, which I think will adequately describe, at least at a high level, what policy decisions are and why they are so important.
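To make the core/extremity distinction concrete, here is a minimal sketch of how a policy engine might model it. All names here (`App`, `is_core`, `CORE_TIER_CUTOFF`) are hypothetical illustrations, not any vendor's actual API:

```python
from dataclasses import dataclass

# Illustrative cutoff: the article treats Tier 1-2 as "core" and
# everything below as an "extremity" application.
CORE_TIER_CUTOFF = 2

@dataclass
class App:
    name: str
    tier: int        # 1 = most business critical
    cpu: int         # allocated vCPUs
    memory_gb: int   # allocated memory

def is_core(app: App) -> bool:
    """The simplest possible policy decision: core apps must never degrade;
    extremity apps are candidates for archiving when resources get tight."""
    return app.tier <= CORE_TIER_CUTOFF

apps = [App("billing", 1, 16, 64), App("archive-search", 6, 4, 8)]
core = [a.name for a in apps if is_core(a)]
```

A real engine would of course carry far richer state per application (SLA targets, affinity rules, seasonal schedules), but the tier-to-policy mapping is the heart of it.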

In the Hypothermic Cloud Infrastructure there are several thousand applications running, and they all have different levels of priority; sometimes that priority changes very little and other times it changes several times per week or day.  The priority might change in reaction to business events or seasonal adjustments, or because you made that explicit decision…for whatever reason…but the point is that priorities do change: today it might be a core application and tomorrow it’s an extremity application.

One fine summer day, one of the Tier 1 applications begins to consume more and more resources, above and beyond what has been deemed normal.  While the IT staff starts their investigation into “why,” the Hypothermic Cloud Infrastructure knows that this application, based on its SLA, cannot fall below a certain performance level, no matter what.  So the policy engine kicks off a process that seeks out all of the lowest tier applications, selects a few that match some relevant criteria (i.e. the right sort of required resources), archives those applications and shifts the freed resources over to the Tier 1 application in order to maintain its performance.  There is no scrambling like a madman trying to find additional resources…it just happens.
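The reclamation step described above can be sketched as a simple greedy pass: walk the candidates from the least critical tier upward, archiving extremity apps until enough capacity is freed, while never touching the core tiers. This is an assumed, simplified model (a real engine would also match memory, storage, network affinity and SLA constraints, not just CPU):

```python
from dataclasses import dataclass

@dataclass
class App:
    name: str
    tier: int            # 1 = most critical; Tier 1-2 are untouchable "core"
    cpu: int             # vCPUs this app currently holds
    archived: bool = False

def reclaim(apps: list[App], needed_cpu: int) -> int:
    """Archive the lowest-tier apps until enough CPU is freed for the
    starving Tier 1 app. Returns the total CPU actually freed."""
    freed = 0
    # Consider the least critical applications first (highest tier number).
    for app in sorted(apps, key=lambda a: -a.tier):
        if app.tier <= 2 or freed >= needed_cpu:
            continue                 # never archive core apps; stop when done
        app.archived = True          # snapshot and park the extremity app
        freed += app.cpu             # its resources now back the Tier 1 app
    return freed

apps = [App("billing", 1, 16), App("wiki", 6, 4), App("reports", 5, 8)]
freed = reclaim(apps, needed_cpu=10)
```

After the call, `wiki` and `reports` are archived, `billing` is untouched, and 12 vCPUs have been shifted over.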

The New Normal

While the engineering staff continues the investigation into why the application suddenly went batshi…er, crazy, the policy engine examines the characteristics of the running application and might determine that this is the “new normal,” permanently assigning those resources to the Tier 1 application. This configuration is then automatically recognized, recorded and committed to the CMDB, and those resources will never again be available for the lower tier applications.  Because the lower tier applications will eventually need to be brought out of archive status some other fine day, additional processes (automated and manual) are kicked off to procure the new resources (i.e. hardware) required to support them. The difference now is that it is a well-regulated and orderly process, with plenty of time to order standard equipment, and there are no fire drills based on an artificial emergency that would drive exorbitant prices! If, however, the policy engine determines that this is not the new normal, it will wait until the application settles down and the need for additional resources is no longer imperative, then release them back into the resource pools, pull the lower tier applications out of the archive, re-assign resources and spin the applications up as if nothing happened.  No additional resources are required, and the Tier 1 application maintained its SLA throughout the process.
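One plausible way to sketch the “new normal” decision is as a sustained-demand test: if utilization stays well above the old baseline for most of an observation window, commit the extra resources permanently (and record it in the CMDB); if the spike was transient, release them and un-archive the lower-tier apps. The thresholds and function names below are illustrative assumptions, not a prescribed algorithm:

```python
def is_new_normal(samples: list[float], baseline: float,
                  threshold: float = 1.5, sustain_ratio: float = 0.9) -> bool:
    """True if demand stayed elevated (above threshold x baseline) for at
    least sustain_ratio of the recent samples, i.e. this looks permanent."""
    elevated = [s for s in samples if s > baseline * threshold]
    return len(elevated) / len(samples) >= sustain_ratio

def settle(samples: list[float], baseline: float) -> str:
    if is_new_normal(samples, baseline):
        return "commit"    # record in CMDB; resources stay with the Tier 1 app
    return "release"       # return resources to the pool; restore archived apps

spike = [95, 97, 96, 98, 94]   # sustained high load against a baseline of 50
blip  = [95, 52, 51, 50, 49]   # transient spike that settled back down
```

With a baseline of 50, `settle(spike, 50)` commits the resources, while `settle(blip, 50)` releases them; a production engine would additionally weigh calendar context (seasonality, known business events) before committing.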

Today’s Technology

So, as you can see, the Hypothermic Cloud Infrastructure does solve the problem of reacting to abnormal system conditions and effectively predicting resource requirements, but no one is saying that such a system, with such a policy engine, is sitting on your favorite ISV’s shelf waiting to be downloaded.  What we are saying is that the technology that enables all of this capability does exist; at this point it may exist in a few different places, available from a few different vendors, under a whole bunch of different names…but it does exist.

I’m kinda hoping we’re the first to find it…but…no one’s stopping you.

By Trevor Williamson

About CloudTweaks

Established in 2009, CloudTweaks is recognized as one of the leading authorities in cloud computing information. Most of the CloudTweaks articles are provided by our own paid writers, with a small percentage provided by guest authors from around the globe, including CEOs, CIOs, technology bloggers and cloud enthusiasts. Our goal is to continue to build a growing community offering the best in-depth articles, interviews, event listings, whitepapers, infographics and much more.


2 Responses to The Hypothermic Cloud Infrastructure: Maintaining the Blood Flow to Tier 1 & 2 Apps

  1. Interesting article; the comparison to the human body reads nicely. Content-wise, you are basically describing commonly understood processes and techniques in IT that are also needed for cloud environments: governance of your entire IT environment, system management across your cloud provider images, and cloud brokerage.
    On the tiers, let’s not make things more difficult than they are: I would propose 4 tiers. Each tier is a quadrant of mission critical yes/no and strategic yes/no:
    Tier 1. Key – most critical
    Tier 2. Mission critical
    Tier 3. Strategic but not mission critical
    Tier 4. Not strategic or mission critical
    You first look at the requirements of your application and then decide which cloud solution fits those requirements best; price is not the only thing to look at. Also take security, resiliency and availability requirements into account.
    On the whole mechanism described, that’s the elastic scaling that cloud providers are providing or should provide. Elastic scaling is scaling up/down horizontally based on the actual load on the servers. This removes the whole ‘shuffle of resources’ you mention, which is not even realistic when you actually spread your tiered applications across multiple cloud providers.
    And lastly, applications should be monitored on end-to-end response times to determine the user experience, and not purely the load on the servers hosting the application.

  2.  @schoutene
     I agree on all of your points as to tiering, but must add that for the system to be truly effective, you must unleash the policy engine from the derived model and let it substantially adjust the system based on the best interests of the system…all within the constraints you impose through the local and global rules you write.  An example: after you do all of the analysis of a composite application (multi-component: DB server, app server, web server), which should be a holistic, end-to-end analysis (its Quality of Experience, or QoE), and initially assign it a tier or priority, you then let the system use the application’s behavior as a modifying force from there on.  You might write a local rule for that application that sets a minimum tier, but otherwise the system sets the tier and takes action based on that and the rest of the behavior of the system. While, again, I agree that this can be termed elasticity, it goes far beyond just taking into account load on servers; it addresses the non-technology aspects of applications such as political investment, compliance regime, time (clock and calendar) as well as qualitative characteristics assigned by the system owners.
