“Management by AI”: Analytics in the Data Center

Behind any cloud, hosted environment or enterprise computing environment are data centers with tens of thousands of servers, racks upon racks of networking equipment, and supporting critical infrastructure, from power distribution to thermal management.

The scale, complexity and required optimization of these modern data centers necessitate “Management by AI,” as they increasingly cannot be planned and managed with traditional rules and heuristics. AI brings many direct benefits and a few unexpected ones. The massive amount and variety of available data, from environmental readings to critical infrastructure to IT systems and applications, can be synthesized and analyzed by an AI system to drive ever-increasing availability and optimization, helping operators meet SLAs and minimize operating expenses.

Numerous factors are contributing to the need for AI in data centers:

  • Efficiency and environmental impact: According to a U.S. Department of Energy report, a data center uses up to 50 times more energy per square foot than a typical commercial building, and as an industry, data centers consume more than 2% of all electricity in the U.S. The industry has faced scrutiny over its energy footprint, and with the cost of that consumption on top, operators are addressing efficiency in ever more creative and complex ways.
  • Data center consolidation: Data centers absolutely benefit from economies of scale, and whether corporate data centers are consolidated or moved to colocation facilities, the result is ever larger facilities, with density and power usage to match.
  • Growth of colocation providers: Colocation providers, such as Equinix and Digital Realty, for whom availability, efficiency and reducing costs are paramount, are growing five times faster than the overall market, according to a recent 451 Group report. These providers, with the necessary scale of their facilities and their efficiency-driven business models, stand to disproportionately benefit from, and are thus driving, AI.
  • Edge computing: The rise of Edge data centers (smaller data centers, often geographically dispersed) allows computing and data to be optimally placed. Rather than being stand-alone entities, these Edge nodes combine with central data centers or cloud computing to form a larger, cooperative computing fabric. This rich topology provides numerous inputs and controls for optimization and availability, which again are best managed by AI.

There are several areas where AI is being researched and applied in data centers today:

  • Optimizing availability by accurately predicting future application behavior down to the rack and server; workloads are pre-emptively moved within or across data centers based on future power, thermal or IT equipment behavior.
  • Optimizing energy usage by managing the numerous types of cooling, across room, row and rack, with great precision. It is not uncommon for different cooling systems to conflict with each other; with its continual feedback and optimization algorithms, AI provides an ideal mechanism for managing this complexity. Some of the best and most intriguing examples use weather algorithms to predict and address hot spots in the data center.
  • Multivariate preventive maintenance, delving down to the component level within equipment to predict failure.
  • Optimizing IT equipment placement by forecasting future states of the data center rather than simply the current configuration.
  • Intelligently managing alarms and alerts by filtering and prioritizing significant events. A common problem in data centers is dealing with chained alerts, which make it difficult to address the root cause. AI, when coupled with rate-of-change, deviation or similar algorithms, provides an ideal mechanism to identify critical alerts.
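
As a rough illustration of the rate-of-change and deviation checks mentioned in the last point, the sketch below scores a new sensor reading against its recent history. It is a minimal example rather than a description of any particular product: the function names, the thresholds (`max_rate`, `max_z`) and the 60-second sampling interval are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def score_reading(history, value, interval_s=60.0):
    """Return (rate_of_change, z_score) for a new reading.

    history    -- recent readings, oldest first (at least two values)
    value      -- the newest reading
    interval_s -- assumed seconds between samples (hypothetical default)
    """
    # Rate of change: how fast the value moved since the last sample.
    rate = (value - history[-1]) / interval_s
    # Deviation: distance from the recent mean, in standard deviations.
    sigma = stdev(history)
    z = (value - mean(history)) / sigma if sigma > 0 else 0.0
    return rate, z


def is_critical(history, value, max_rate=0.05, max_z=3.0, interval_s=60.0):
    """Flag a reading that changes too fast or strays too far from normal.

    The thresholds are illustrative assumptions, not recommended values.
    """
    rate, z = score_reading(history, value, interval_s)
    return abs(rate) > max_rate or abs(z) > max_z


# Example: rack inlet temperatures in degrees Celsius, one sample per minute.
inlet_temps = [22.0, 22.1, 21.9, 22.0, 22.1]
print(is_critical(inlet_temps, 22.2))  # small drift within normal variation
print(is_critical(inlet_temps, 27.0))  # sudden jump
```

In practice, a filter like this would be only one input among many; suppressing the downstream alerts in a chain still requires knowing how the equipment is connected.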

Although AI has numerous benefits and is a certain trend in data centers, two points are critical for a successful implementation:

  • AI thrives on rich and large data streams; the right systems must be in place to collect and aggregate this data across the key elements in the data center, from Critical Infrastructure to IT Systems to Applications.
  • Expectations need to be set for the outcomes of AI, especially regarding autonomous control. One of the largest benefits of AI is real-time analysis of large, rich data streams; delaying action can negate many of the benefits an AI system provides. This is not an issue of relinquishing control but rather of putting the appropriate management systems in place to achieve the full benefit from AI while still setting boundaries and limits.
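
One simple way to picture "boundaries and limits" on autonomous control is a guard that lets the AI act immediately but only within an operator-defined range and step size. The sketch below is a hypothetical example: the function name, the parameters and the cooling setpoint values are assumptions for illustration, not part of any specific management system.

```python
def bounded_setpoint(current, recommended, low, high, max_step):
    """Apply an AI-recommended setpoint within operator-defined limits.

    current     -- the setpoint in effect now
    recommended -- the value the AI system proposes
    low, high   -- hard limits set by the operator
    max_step    -- largest change allowed in a single adjustment
    """
    # Limit how far a single adjustment can move the setpoint.
    step = max(-max_step, min(max_step, recommended - current))
    # Never leave the operator's allowed range.
    return max(low, min(high, current + step))


# Example: a cooling setpoint in degrees Celsius, limited to 18-27 and
# to changes of at most 1 degree per adjustment.
print(bounded_setpoint(24.0, 30.0, 18.0, 27.0, 1.0))
```

With a guard of this kind the system can still respond in real time, while the operator decides in advance how far any single action may go.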

Data centers present an ideal use case for AI: complex, energy-intensive and critical, with a very large set of inputs and control points that can only be properly managed through an automated system. With ever-evolving innovations in the data center, from Application Performance Management linked with physical infrastructure to closely linked multi-data center topologies, the need for and benefit of AI will only increase in the coming years.

By Enzo Greco
