Management by AI
Behind any cloud, hosted environment or enterprise computing environment are data centers with tens of thousands of servers, racks upon racks of networking equipment, and supporting critical infrastructure, from power distribution to thermal management.
The scale, complexity and required optimization of these modern data centers necessitate “Management by AI,” as they increasingly cannot be planned and managed with traditional rules and heuristics. AI brings many direct, and a few unexpected, benefits. The massive amount and variety of available data, from environmental to critical infrastructure to IT systems and applications, can be synthesized and analyzed by an AI system to drive ever-increasing availability and optimization, helping operators meet SLAs and minimize operating expenses.
Numerous factors are contributing to the need for AI in data centers:
- Efficiency and environmental impact: According to a U.S. Department of Energy report, a data center uses up to 50 times more energy per square foot than a typical commercial building, and as an industry, data centers consume more than 2% of all electricity in the U.S. The industry has faced undeniable scrutiny over its energy footprint; coupled with the costs of consumption, operators are addressing efficiency in ever more creative and complex ways.
- Data center consolidation: Data centers absolutely benefit from economies of scale, and whether corporate data centers are consolidated or moved to colocation facilities, the result is ever larger facilities, with density and power usage to match.
- Growth of colocation providers: Colocation providers, such as Equinix and Digital Realty, for whom availability, efficiency and cost reduction are paramount, are growing five times faster than the overall market, according to a recent 451 Group report. These providers, given the scale of their facilities and their efficiency-driven business models, stand to benefit disproportionately from AI and are thus driving its adoption.
- Edge computing: The rise of Edge data centers (smaller data centers, often geographically dispersed) allows computing and data to be optimally placed. Rather than being stand-alone entities, these Edge nodes combine with central data centers or cloud computing to form a larger, cooperative computing fabric. This rich topology provides numerous inputs and controls for optimization and availability, which again are best managed by AI.
There are several areas where AI is being researched and applied in data centers today:
- Optimizing availability by accurately predicting future application behavior down to the rack and server; workloads are pre-emptively moved within or across data centers based on future power, thermal or IT equipment behavior.
- Optimizing energy usage by managing the numerous types of cooling, across room, row and rack, with great precision. It is not uncommon for different cooling systems to conflict with each other; with its continual feedback and optimization algorithms, AI provides an ideal mechanism for managing this complexity. Some of the best and most intriguing examples use weather algorithms to predict and address hot spots in the data center.
- Multivariate preventive maintenance, delving into the component level within equipment to predict failure.
- Optimizing IT equipment placement by forecasting future states of the data center rather than simply the current configuration.
- Intelligently managing alarms and alerts by filtering and prioritizing significant events. A common problem in data centers is dealing with chained alerts, which make it difficult to address the root cause. AI, when coupled with rate-of-change, deviation or similar algorithms, provides an ideal mechanism to identify critical alerts.
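To make the rate-of-change and deviation idea concrete, here is a minimal Python sketch of how a stream of sensor readings might be screened so that only significant events surface. It is an illustration under stated assumptions, not a description of any particular product: the function name `flag_alerts`, the thresholds `max_rate` and `max_z`, the baseline window size and the `min_sigma` floor are all hypothetical values chosen for the example.

```python
from statistics import mean, stdev


def flag_alerts(readings, interval_s=60.0, max_rate=0.05, max_z=3.0,
                baseline_n=5, min_sigma=0.05):
    """Return (index, reason) pairs for readings that look significant.

    readings   -- equally spaced samples, e.g. rack inlet temperature in C
    interval_s -- seconds between samples
    max_rate   -- hypothetical limit on change per second
    max_z      -- hypothetical limit on deviation from the trailing
                  baseline, measured in standard deviations
    baseline_n -- number of preceding samples used as the baseline
    min_sigma  -- floor on the standard deviation, so that a perfectly
                  flat baseline does not cause a division by zero
    """
    alerts = []
    for i in range(baseline_n, len(readings)):
        window = readings[i - baseline_n:i]

        # Rate of change: how fast the value moved since the last sample.
        rate = (readings[i] - readings[i - 1]) / interval_s

        # Deviation: how far the value sits from the recent baseline.
        sigma = max(stdev(window), min_sigma)
        z = (readings[i] - mean(window)) / sigma

        if abs(rate) > max_rate:
            alerts.append((i, "rate_of_change"))
        elif abs(z) > max_z:
            alerts.append((i, "deviation"))
    return alerts


if __name__ == "__main__":
    temps = [22.0, 22.1, 22.0, 22.1, 22.0, 22.1, 27.0, 27.1]
    print(flag_alerts(temps))
```

In this sketch only the abrupt jump is reported. The reading that follows it stays quiet because its own rate of change is small and the jump has already widened the baseline, which hints at how a chain of follow-on alerts can be reduced to the event most likely to be the root cause. A production system would tune the thresholds per sensor type and would likely learn them from data rather than hard-code them.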
Although AI has numerous benefits and is a clear trend in data centers, two points are critical for a successful implementation:
- AI thrives on rich and large data streams; the right systems must be in place to collect and aggregate this data across the key elements in the data center, from critical infrastructure to IT systems to applications.
- Expectations need to be set for the outcomes of AI, especially regarding autonomous control. One of the largest benefits of AI is real-time analysis of large, rich data streams; delaying action can negate many of the benefits an AI system provides. This is not an issue of relinquishing control but rather of putting the appropriate management systems in place to achieve the full benefit from AI while still setting boundaries and limits.
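One way to picture "setting boundaries and limits" on autonomous control is a guardrail that sits between the AI system's recommendation and the equipment. The Python sketch below is only an assumption about how such a guardrail could look: the `Guardrail` class, the `apply_guardrail` function and the numeric limits are hypothetical and stand in for whatever policy an operator would actually define.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guardrail:
    """Operator-defined limits for one control point (hypothetical)."""
    low: float       # lowest setpoint the AI may command
    high: float      # highest setpoint the AI may command
    max_step: float  # largest change allowed in a single action


def apply_guardrail(current, recommended, rail):
    """Return the setpoint to apply, given the AI's recommendation.

    The change is first limited to rail.max_step in either direction,
    then the result is clamped to the [rail.low, rail.high] range.
    """
    step = max(-rail.max_step, min(rail.max_step, recommended - current))
    return max(rail.low, min(rail.high, current + step))


if __name__ == "__main__":
    # Example: supply-air temperature setpoint in degrees Celsius.
    rail = Guardrail(low=18.0, high=27.0, max_step=1.0)
    print(apply_guardrail(24.0, 30.0, rail))  # large request, limited to one step
    print(apply_guardrail(26.5, 30.0, rail))  # one step would exceed the upper bound
```

The design choice here is that the AI system still acts immediately, so the benefit of real-time analysis is kept, while the operator retains control over how far and how fast any single action can move the facility.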
Data centers present an ideal use case for AI: they are complex, energy-intensive and critical, with a very large set of inputs and control points that can only be properly managed through an automated system. With ever-evolving innovations in the data center, from Application Performance Management linked with physical infrastructure to closely linked multi-data center topologies, the need for and benefit of AI will only increase in the coming years.
By Enzo Greco