For years we’ve been hearing that human error is the number one cause of data center downtime, or pretty much any IT failure for that matter. Yet many companies put precious little time, thought or effort into ensuring they have a sound data center Operations and Maintenance (O&M) program – one that effectively minimizes the effects of human error. Perhaps a program was developed at some point, such as when the data center was first built, but it was never revisited to measure or improve its effectiveness.
If any of that rings true for your organization, this is likely a good time to examine your data center O&M program to ensure it conforms to company expectations for data center uptime and your risk profile.
Schneider Electric has put considerable thought into what constitutes efficient, effective data center operations and, in the process, has come up with a white paper listing a number of mistakes to avoid. For the most part, the mistakes fall into two categories: people and process. In this post, I’ll cover some of the biggest people-related mistakes, and I’ll follow up soon with another post covering process and procedure mistakes to avoid.
The first mistake is not including the O&M team in the design of your data center. In Schneider Electric’s experience, when the operations and maintenance professionals are excluded from the data center design process, avoidable modifications and repairs are often needed after the facility is built. Real-world examples include electrical distribution systems that turn out not to be concurrently maintainable, and HVAC equipment installed in such a way that maintenance is difficult or unnecessarily expensive. Such mistakes can easily be avoided by including the subject matter experts who will operate the data center in the design process.
Having an inadequate number of staff on hand to operate the data center is another common mistake. A data center is unlike a generic office building. Given the importance of what goes on in the data center to the company as a whole, you need a workforce that can properly maintain it and, most importantly, respond in an emergency. You also need to choose these folks carefully, as they need a mix of skills, including the ability to manage vendors and communicate effectively and, of course, sound technical chops.
Finding such people is only the start, however. You’ve also got to continually support them, including providing ongoing training and career development opportunities. That will go a long way toward creating a positive work environment, which leads to improved employee retention – an important consideration in a mission-critical data center. While training may be one of the first areas cut when budgets get tight, such cuts are shortsighted. The cost of a typical training and development program is more than offset by increased data center uptime, lower maintenance costs and decreased employee turnover.
For more tips on developing an effective operations program, check out the Schneider Electric white paper, “Top 10 Mistakes in Data Center Operations: Operating Efficient and Effective Data Centers.”