A Hitchhiker's Guide to the Cloud
The cloud has turned traditional data management on its head. People still think they will be able to command endless resources while deploying, running and consuming a distributed data solution anywhere, anytime. But putting your applications in the cloud raises key database challenges that are unique to the cloud and can’t be ignored.
1. High Availability
In the cloud, high availability is no longer just about hardware resiliency. Because customers are removed from the actual hardware, you can no longer plug in an extra power supply or network card, or swap hard drives, when something goes wrong or when you need additional resources. The cloud is not predictable and needs to be closely monitored and managed. Besides the large-scale cloud outages we all know about (such as the recent Amazon EC2 US-East meltdown), we have to face the facts: server crashes, hardware malfunctions and other slipups are all part of the territory in the cloud. With that in mind, cloud users need to be prepared. There are hardly any SLAs for databases in the cloud. To run effectively in the dynamic cloud environment, every database, regardless of its size, must run in a replicated setup, which is typically more complex and expensive. Maintaining high availability depends on the availability of “more of the same” resources and the ability to provision them on the fly once a failure is identified.
Often, when speaking about high availability of the database, the most obvious component is the availability of the data itself, via replication. Yet, if we remember that high availability means that the application continues to communicate with the database as usual, then the plot thickens.
Any solution that ensures availability of the data storage layer must also address the availability of the front-end layer through which the data is accessed, so that the connection between the application and any operational DB replicas is not compromised.
On the data storage level, you’ll need to replicate the data and keep it consistent and synchronized across replicas. Then you’ll need to set up an auto-failover mechanism that monitors the cloud for failures, identifies when a replica is unresponsive and continues service from the surviving replicas.
This then creates the issue of how to manage high availability of the DB connection. Procedures to ensure high availability, as well as numerous scaling considerations (scale-out in particular), result in several copies of the database, each accessed via a different address. If a replica crashes, you need to detect which hostname is no longer available and then stop directing traffic from the application to that database.
How do you manage high availability of your database connection string in a dynamic environment that is prone to failures and also prone to scaling out to additional servers (and additional connection addresses) to accommodate bursts in demand?
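The monitoring half of such an auto-failover mechanism can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ReplicaMonitor` class, the `probe` callable and the replica names are hypothetical, and a real setup would probe over the network on a timer and also deal with data synchronization, which this sketch ignores.

```python
from typing import Callable, Dict, List


class ReplicaMonitor:
    """Tracks which replicas are healthy based on consecutive failed probes."""

    def __init__(self, replicas: List[str], probe: Callable[[str], bool],
                 max_failures: int = 3) -> None:
        self.probe = probe                # hypothetical health check: True if the replica answers
        self.max_failures = max_failures  # consecutive failed probes tolerated before failover
        self.failures: Dict[str, int] = {r: 0 for r in replicas}

    def check_once(self) -> None:
        """Probe every replica once and update its consecutive-failure count."""
        for replica in self.failures:
            if self.probe(replica):
                self.failures[replica] = 0   # a recovered replica rejoins the pool
            else:
                self.failures[replica] += 1

    def healthy(self) -> List[str]:
        """Replicas that should keep receiving traffic."""
        return [r for r, n in self.failures.items() if n < self.max_failures]


# Usage with a fake probe in which "db-2" never answers.
monitor = ReplicaMonitor(["db-1", "db-2", "db-3"], probe=lambda r: r != "db-2")
for _ in range(3):
    monitor.check_once()
print(monitor.healthy())  # prints ['db-1', 'db-3']
```

Requiring several consecutive failures before dropping a replica is a design choice made here to avoid failing over on a single lost probe; the right threshold depends on the environment.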
One way to ensure high availability of the database connection is to provide users with the multiple front-end addresses and ports and let them handle it at the application level. Users then have to manage connection failover, load balancing and so on themselves. Needless to say, this is a major headache: the developer needs to manage the DB connections 24/7, connecting the app to an available database front-end, much like the telephone operators of the early days.
Another way is to embed high availability into a driver. The driver is supplied with the initial front-end details and is updated automatically with any additional front-ends. It balances the connections between front-ends, handles failover automatically, and moves connections seamlessly when scaling in or out, without closing the connection on the user side.
The third alternative is a balancer component that is installed on the servers alongside each database instance. This component takes care of balancing and availability between all the front-end nodes and seamlessly moves connections when scaling in or out.
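To give a sense of what handling this at the application level involves, here is a minimal Python sketch of connection failover across a list of front-end addresses. The function name, the `connect` callable and the host names are assumptions made for illustration; a real application would pass in its database driver's connect call and would still need load balancing and reconnection logic on top.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def connect_with_failover(addresses: Sequence[str],
                          connect: Callable[[str], T]) -> T:
    """Try each front-end address in order and return the first connection that succeeds."""
    errors: List[str] = []
    for address in addresses:
        try:
            return connect(address)
        except ConnectionError as exc:
            errors.append(f"{address}: {exc}")   # remember why this front-end was skipped
    raise ConnectionError("all front-ends failed: " + "; ".join(errors))


# Usage with a stand-in connect function (hypothetical hosts); the first front-end is down.
def fake_connect(address: str) -> str:
    if address == "db-1.example.internal:3306":
        raise ConnectionError("connection refused")
    return f"connected to {address}"


print(connect_with_failover(
    ["db-1.example.internal:3306", "db-2.example.internal:3306"], fake_connect))
```

Every application that talks to the database has to carry and maintain logic like this, which is the headache described above.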
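Whether it lives in a driver or in a server-side component, the core of such a balancer is a front-end list that can change at runtime plus a rule for picking the next front-end. The Python sketch below shows one possible shape, using simple round-robin; the class and method names are hypothetical, and it does not attempt the harder part described above, moving live connections without closing them.

```python
from typing import List


class FrontEndBalancer:
    """Round-robin selection over a front-end list that can change at runtime."""

    def __init__(self, front_ends: List[str]) -> None:
        self._front_ends = list(front_ends)
        self._next = 0   # counter used to rotate through the list

    def add(self, address: str) -> None:
        """Register a front-end added by a scale-out."""
        if address not in self._front_ends:
            self._front_ends.append(address)

    def remove(self, address: str) -> None:
        """Drop a front-end after a failure or a scale-in."""
        if address in self._front_ends:
            self._front_ends.remove(address)

    def pick(self) -> str:
        """Return the front-end that should receive the next connection."""
        if not self._front_ends:
            raise RuntimeError("no front-ends available")
        address = self._front_ends[self._next % len(self._front_ends)]
        self._next += 1
        return address


balancer = FrontEndBalancer(["fe-1", "fe-2"])
print(balancer.pick(), balancer.pick())  # prints fe-1 fe-2
balancer.remove("fe-1")                  # fe-1 fails or is scaled in
print(balancer.pick())                   # prints fe-2
```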
Database management systems are not only complex, they are also key components in the operation of most software stacks. Given their criticality and complexity, operating a database can be a daunting task that requires significant expertise and considerable resources, which are not readily available to everyone.
Maintaining high availability requires continuously monitoring the cloud environment for any failures, configuring auto-failover mechanisms and keeping multiple copies of the database tier always synchronized and ready to spring into action. Ensuring elasticity means you need to monitor, re-configure and deploy your servers (and sometimes change your app) to add resources or to remove them when they are underutilized.
Developers flock to the cloud, and with good reason. The flip side is that once the application gains momentum, it requires a skill set not readily available to most developers. To allow developers to focus on their code rather than on IT, the cloud ecosystem provides a myriad of off-the-shelf development platforms and cloud services to integrate with, streamlining development and shortening time-to-production.
Scalability and elasticity are the trendiest words in the database arena these days – everybody scales, and everybody claims that only they scale the right way.
Scaling an application (by adding servers and load balancers) is pretty much a no-brainer, and many cloud providers offer it. Some, like Amazon EC2, even offer to add servers automatically once CPU usage is high.
While scaling an application is pretty straightforward, scaling the database tier is more difficult, particularly when scaling out by adding nodes. Scaling a database in general is no trivial task because of its “stateful” nature (unlike the cloud’s stateless environment), and in the cloud, it is even more difficult.
Cloud applications are often characterized by fluctuating demand (which can spike at any moment). Databases need to be able to scale instantly and automatically, in both throughput and size, to accommodate increasing demand from the application.
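A CPU-based scaling rule of this kind boils down to a threshold check. The Python sketch below is an illustration only: the function name, thresholds and server limits are made-up values, and it does not represent the API or default policy of Amazon EC2 or any other provider.

```python
def desired_servers(current: int, cpu_utilization: float,
                    high: float = 0.75, low: float = 0.25,
                    minimum: int = 1, maximum: int = 10) -> int:
    """Return the server count a simple threshold rule would ask for.

    cpu_utilization is the average CPU load across servers, from 0.0 to 1.0.
    All thresholds and limits are illustrative assumptions.
    """
    if cpu_utilization > high:
        return min(current + 1, maximum)   # scale out, but never past the cap
    if cpu_utilization < low:
        return max(current - 1, minimum)   # scale in, but keep at least the minimum
    return current                         # load is within the comfortable band


print(desired_servers(2, 0.90))  # prints 3
print(desired_servers(2, 0.10))  # prints 1
print(desired_servers(3, 0.50))  # prints 3
```

A rule like this works for stateless application servers because any new server can take traffic immediately.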
When evaluating a database solution, ask yourself how it scales and see if it scales in a way that would be optimal for the needs of your application.
The cloud is all about flexibility in resources: it allows you to add or remove resources to match your needs, with no need for over-provisioning or over-paying to prepare for future peaks. Elasticity isn’t just about increasing resources when you need to by scaling up or out, but also about shrinking them back down when your database is underutilized, to save on costs. Elasticity must also support very granular increases in resources, so that gaining +0.X more power doesn’t mean you need to commit to and pay for a much larger (+XXX) machine.
Understanding these key challenges can help you successfully deploy, run and consume a database in the cloud. Developers who are aware of the dynamic nature of clouds will take extra care in protecting their assets and will be able to maintain a successful application that is usable, satisfies users and produces revenue.
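The effect of granularity is easy to see with a little arithmetic. The Python sketch below compares how much capacity you end up paying for when resources can only be added in large steps versus small ones; the unit sizes and the demand figure are invented for illustration and do not reflect any provider's actual instance sizes or pricing.

```python
def provisioned_units(demand_units: int, step_units: int) -> int:
    """Smallest multiple of step_units that covers demand_units (integer capacity units)."""
    steps = -(-demand_units // step_units)   # ceiling division without floating point
    return steps * step_units


demand = 130   # hypothetical demand: 1.3 "machines" worth of capacity, in 1/100ths

coarse = provisioned_units(demand, step_units=100)   # can only add whole machines
granular = provisioned_units(demand, step_units=10)  # can add a tenth of a machine

print(coarse, coarse - demand)      # prints 200 70
print(granular, granular - demand)  # prints 130 0
```

In this made-up case the coarse option pays for 70 unused units, while the granular option pays only for what the database needs.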
By Razi Sharir
Razi Sharir, CEO of Xeround, has more than 20 years of management experience in product/solution development. Prior to Xeround, Razi led the strategic transition from traditional data centers to cloud computing at BMC Software and the Incubator/Innovation Lab business unit.