“Management by AI”: Analytics in the Data Center

Behind any cloud, hosted environment or enterprise computing environment are data centers with tens of thousands of servers, racks upon racks of networking equipment, and supporting critical infrastructure, from power distribution to thermal management.

The scale, complexity and required optimization of these modern data centers necessitate “Management by AI,” as they increasingly cannot be planned and managed with traditional rules and heuristics. AI brings many direct benefits and a few unexpected ones. When an AI system synthesizes and analyzes the massive amount and variety of available data, from environmental to critical infrastructure to IT systems and applications, it can improve availability and optimization, helping operators meet SLAs and minimize operating expenses.

Numerous factors are contributing to the need for AI in data centers:

  • Efficiency and environmental impact: According to a U.S. Department of Energy report, a data center uses up to 50 times more energy per square foot than a typical commercial building, and as an industry, data centers consume more than 2% of all electricity in the U.S. The industry has faced undeniable scrutiny over its energy footprint; coupled with the costs of consumption, operators are addressing efficiency in ever more creative and complex ways.
  • Data center consolidation: Data centers absolutely benefit from economies of scale, and whether corporate data centers are consolidated or moved to colocation facilities, the result is ever larger facilities, with density and power usage to match.
  • Growth of colocation providers: Colocation providers, such as Equinix and Digital Realty, for whom availability, efficiency and reducing costs are paramount, are growing five times faster than the overall market, according to a recent 451 Group report. These providers, with the necessary scale of their facilities and their efficiency-driven business models, stand to disproportionately benefit from, and are thus driving, AI.
  • Edge computing: The rise of Edge data centers, smaller data centers that are often geographically dispersed, allows computing and data to be optimally placed. Rather than being stand-alone entities, these Edge nodes combine with central data centers or cloud computing to form a larger, cooperative computing fabric. This rich topology provides numerous inputs and controls for optimization and availability, which again are best managed by AI.

There are several areas where AI is being researched and applied in data centers today:

  • Optimizing availability by accurately predicting future application behavior down to the rack and server; workloads are pre-emptively moved within or across data centers based on future power, thermal or IT equipment behavior.
  • Optimizing energy usage by managing the numerous types of cooling, across room, row and rack, with great precision. It is not uncommon for different cooling systems to conflict with each other; with its continual feedback and optimization algorithms, AI provides an ideal mechanism for managing this complexity. Some of the best and most intriguing examples use weather algorithms to predict and address hot spots in the data center.
  • Multivariate preventive maintenance, delving into the component level within equipment to predict failure.
  • Optimizing IT equipment placement by forecasting future states of the data center rather than simply the current configuration.
  • Intelligently managing alarms and alerts by filtering and prioritizing significant events. A common problem in data centers is dealing with chained alerts, which make it difficult to address the root cause. AI, when coupled with rate-of-change, deviation or similar algorithms, provides an ideal mechanism to identify critical alerts.
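
To make the last point more concrete, here is a minimal sketch of how rate-of-change and deviation checks might be combined to separate significant sensor events from routine noise. It is an illustration only, not a description of any specific product: the function name `flag_readings`, its parameters, and the threshold defaults are assumptions made for this example.

```python
from statistics import mean, stdev


def flag_readings(readings, window=5, z_limit=3.0, rate_limit=0.05):
    """Flag readings that deviate from the recent baseline or change too fast.

    readings: list of (timestamp_seconds, value) tuples, oldest first.
    window, z_limit and rate_limit are hypothetical tuning values.
    Returns (timestamp, value, reasons) for flagged readings only.
    """
    flagged = []
    for i in range(window, len(readings)):
        t, value = readings[i]
        prev_t, prev_value = readings[i - 1]

        # Rolling baseline built from the preceding `window` readings.
        baseline = [v for _, v in readings[i - window:i]]
        mu = mean(baseline)
        sigma = stdev(baseline)

        reasons = []
        # Deviation check: distance from the rolling mean, in standard deviations.
        if sigma > 0 and abs(value - mu) / sigma > z_limit:
            reasons.append("deviation")
        # Rate-of-change check: units per second since the previous reading.
        dt = t - prev_t
        if dt > 0 and abs(value - prev_value) / dt > rate_limit:
            reasons.append("rate_of_change")

        if reasons:
            flagged.append((t, value, reasons))
    return flagged


# Example: rack inlet temperature in degrees C, sampled once a minute.
samples = [(0, 22.0), (60, 22.1), (120, 21.9), (180, 22.0),
           (240, 22.1), (300, 22.0), (360, 30.0)]
alerts = flag_readings(samples)
```

In this example only the final reading is flagged, and it trips both checks. A production system would likely go further, for instance by grouping chained alerts that share an upstream cause, which this sketch does not attempt.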

Although AI has numerous benefits and is a clear trend in data centers, two points are critical for a successful implementation:

  • AI thrives on rich and large data streams; the right systems must be in place to collect and aggregate this data across the key elements in the data center, from critical infrastructure to IT systems to applications.
  • Expectations need to be set for the outcomes of AI, especially regarding autonomous control. One of the largest benefits of AI is real-time analysis on rich and huge data streams; delaying action can negate many of the benefits an AI system provides. This is not an issue of relinquishing control but rather putting the appropriate management systems in place to achieve the full benefit from AI while still setting boundaries and limits.
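
The boundaries and limits mentioned in the last point can be pictured as a guardrail that sits between an AI system's recommendation and the equipment it controls. The following is a minimal sketch under stated assumptions: `Guardrail`, `apply_guardrail`, and the temperature values are hypothetical names and numbers chosen for illustration, not part of any real control system.

```python
from dataclasses import dataclass
from math import copysign


@dataclass(frozen=True)
class Guardrail:
    low: float       # lowest setpoint operators allow
    high: float      # highest setpoint operators allow
    max_step: float  # largest change allowed in one control cycle


def apply_guardrail(current, recommended, rail):
    """Return (setpoint_to_apply, was_altered) for an AI recommendation."""
    delta = recommended - current
    if abs(delta) > rail.max_step:
        # Limit how far a single automated action may move the setpoint.
        target = current + copysign(rail.max_step, delta)
    else:
        target = recommended
    # Keep the result inside the absolute operating band.
    target = max(rail.low, min(rail.high, target))
    return target, target != recommended


# Example: a cooling supply-air setpoint in degrees C.
rail = Guardrail(low=18.0, high=27.0, max_step=1.0)
applied, altered = apply_guardrail(current=22.0, recommended=25.0, rail=rail)
```

Here the recommendation of 25.0 is applied as 23.0 because a single cycle may move the setpoint by at most 1.0. The AI system still acts immediately, but only within limits the operators have set, and the `altered` flag gives them a record of every time a recommendation was constrained.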

Data centers present an ideal use case for AI: they are complex, energy-intensive and critical, with a very large set of inputs and control points that can only be properly managed through an automated system. With ever-evolving innovations in the data center, from Application Performance Management linked with physical infrastructure to closely linked multi-data center topologies, the need for and benefit of AI will only increase in the coming years.

By Enzo Greco
