Data Center Lifecycle Principles: Part 2 of 2 – Continuous Improvement


As examined in my last post, a key principle in the lifecycle approach to data centers is to design facilities with change in mind. However, once a data center is in operation, you don’t wait until problems occur to initiate change; you follow another key principle: continuous improvement.

Continuous improvement is a broad term that spans concepts such as Six Sigma, but let’s leave discussion of formal methodologies aside. Let’s concentrate on the tools and tactics one can use to continually improve the efficiency and reliability of a data center.

First of all, you need some form of measurement and analysis. In a data center environment, the best set of tools for that is data center infrastructure management (DCIM) software. With DCIM, an end-user organization or a service provider can model the functioning of a data center, test would-be improvements, and proactively monitor conditions.
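
To make the measurement side concrete, here is a minimal sketch of the most basic efficiency metric a DCIM platform tracks: power usage effectiveness (PUE), the ratio of total facility power to IT power. The readings below are hypothetical stand-ins for live sensor feeds, not output from any particular DCIM product.

```python
# Minimal PUE calculation sketch. PUE = total facility power / IT power;
# 1.0 is the theoretical ideal, and typical facilities run roughly 1.2-2.0.

def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Return power usage effectiveness for one set of readings."""
    if it_load_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_load_kw

# Hypothetical readings standing in for a live DCIM sensor feed.
readings = {"total_facility_kw": 1450.0, "it_load_kw": 1000.0}
print(f"PUE: {pue(**readings):.2f}")  # -> PUE: 1.45
```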

DCIM gives us the means to measure, but you still need some sort of process—some sort of framework—for assessing improvements. Again, leaving formal methodologies aside, the simplest framework for continuous improvement is an audit and upgrade strategy.

There should be regularity to audit or assessment events, though the exact timetable might vary by facility. Basically, you are establishing a timetable for periodically assessing efficiency and reliability issues, followed by changes, with ongoing monitoring of progress.

For some facilities, an audit might be done every six months or even quarterly, while others might use events such as a major installation of new IT assets or a new virtualization project as triggers. At Schneider Electric, audit services and energy assessment services have become an important component of our offerings to help companies succeed with data center lifecycle initiatives.

One key point to remember about audits: they don’t necessarily result in a call for new equipment. An audit should be equipment and vendor neutral, looking as much to changes in operational practices as to new gear as a means to improvement.

For example, in one client engagement for Schneider Electric, we were brought in to improve the efficiency of a relatively new data center in India that had been operating for several months but had already begun to have efficiency problems. An audit revealed that the main issue was the arrangement of the IT assets in relation to the data center physical infrastructure. By reconfiguring the IT assets to better match the infrastructure, efficiency improved significantly.

There are many other types of procedural changes that an audit might reveal. One common weakness is that data centers are kept colder than the IT equipment actually requires, or at too constant a temperature at all hours, even when the IT load doesn’t require it, wasting cooling energy in both cases. Air flow issues are another area that benefits from assessment, even if the resulting “upgrade” is more about reconfiguration than capital expenditure.
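
As an illustration of the kind of operational fix such an assessment can suggest, the sketch below varies the cooling setpoint with IT load rather than holding it constant around the clock. The scaling, thresholds, and function names are illustrative assumptions, not a standard or a real control loop; only the ASHRAE recommended inlet range (roughly 18–27 °C) comes from published guidance.

```python
# Hypothetical load-aware setpoint check an audit might propose.
ASHRAE_MAX_C = 27.0  # upper end of ASHRAE's recommended inlet range

def recommended_setpoint(it_load_fraction: float) -> float:
    """Allow a warmer room when IT load (0.0-1.0) is light."""
    # Assumed policy: scale from 22 C at full load toward 26 C at idle,
    # capped at the ASHRAE recommended maximum.
    return min(22.0 + 4.0 * (1.0 - it_load_fraction), ASHRAE_MAX_C)

for hour, load in [(3, 0.35), (14, 0.90)]:
    print(f"{hour:02d}:00  load={load:.0%}  "
          f"setpoint={recommended_setpoint(load):.1f} C")
```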

Audits need to be run by experienced services people with deep data center infrastructure expertise, but the measurement, modeling, and analysis tools are also essential. Recommendations need to be based on solid numbers, not gut feel. That’s why DCIM tools are important for tasks like pinpointing how best to eliminate a hot spot, or determining how much power protection is needed given a likely jump in IT load.
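
As a simplified example of that last point, the sketch below checks whether an existing UPS would cover a projected jump in IT load while preserving headroom. All figures are made-up assumptions for illustration; a real assessment would pull them from DCIM measurements.

```python
# Illustrative UPS sizing check against a projected IT load jump.
ups_capacity_kw = 500.0   # installed UPS capacity (assumed)
current_it_kw = 320.0     # measured IT load (assumed)
projected_growth = 0.40   # e.g., a planned virtualization rollout (assumed)
headroom_target = 0.20    # keep 20% headroom for redundancy and spikes

projected_kw = current_it_kw * (1 + projected_growth)
usable_kw = ups_capacity_kw * (1 - headroom_target)

print(f"Projected load: {projected_kw:.0f} kW vs "
      f"usable capacity: {usable_kw:.0f} kW")
if projected_kw > usable_kw:
    print("Upgrade power protection or redistribute load before the rollout.")
```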

The bottom line is that to give a data center a long, efficient, and reliable life, you have to establish a foundation for continuous improvement. Don’t wait until problems flare up to find fixes—implement continuous improvement tools and tactics instead.
