A sound disaster avoidance plan is crucial to ensuring the continuous, around-the-clock operation of a mission-critical data center. The plan needs to cover numerous areas (as outlined in the Schneider Electric white paper, A Practical Guide to Disaster Avoidance in Mission-Critical Facilities). For this post, we want to focus on some of the aspects of electrical design and maintenance that go into ensuring continuous data center operation.
Keep in mind that disasters aren’t just of the natural variety: floods, earthquakes and the like. Many disasters are man-made, often the result of human error and avoidable problems that cascade into a full-fledged disaster. That is certainly true of the types of failures that can affect data center power systems.
Take your generators, for example. Proper care and maintenance is crucial to ensuring a generator is available during a power outage. While starting the generator weekly may give you assurance that it is functioning, it’s not enough. In fact, such weekly tests with no load on the generator can result in deposits of unburned fuel and carbon building up in the exhaust system and preventing the generator from developing full power under load, a condition known as “wet stacking.” To prevent wet stacking, generators should periodically be run for 2 to 4 hours under full load, which allows the deposits to be blown out. The frequency of such load tests depends on the specific generator, so check with the manufacturer or a specialist for recommendations.
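To make the distinction between a weekly no-load start and a real load test concrete, here is a minimal sketch in Python that checks a generator run log for an overdue load test. The GeneratorRun record, the function names and the 90-day interval in the example are all hypothetical; the right interval comes from your manufacturer or specialist, not from this code.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class GeneratorRun:
    day: date
    hours: float          # how long the generator ran
    load_fraction: float  # 0.0 = no load, 1.0 = full rated load


def is_load_test(run, min_hours=2.0, min_load=1.0):
    """True if the run was long enough and loaded enough to count as a load test."""
    return run.hours >= min_hours and run.load_fraction >= min_load


def load_test_overdue(runs, today, max_interval_days):
    """True if no qualifying load test falls within the last max_interval_days."""
    qualifying = [r.day for r in runs if is_load_test(r)]
    if not qualifying:
        return True  # weekly no-load starts alone never count
    return today - max(qualifying) > timedelta(days=max_interval_days)


# Example log: two short no-load starts and one 3-hour full-load run.
runs = [
    GeneratorRun(date(2023, 10, 2), 3.0, 1.0),
    GeneratorRun(date(2024, 1, 8), 0.5, 0.0),
    GeneratorRun(date(2024, 1, 15), 0.5, 0.0),
]

# 90 days is a placeholder interval, not a recommendation.
print(load_test_overdue(runs, date(2024, 1, 20), max_interval_days=90))  # prints True
```

The recent weekly starts do not reset the clock here, because only runs of at least 2 hours at full load are treated as load tests.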
Other generator considerations include ensuring oil is replenished regularly, since generators consume oil during extended runs. Coolant is another issue, as most generators have low-coolant alarms that prevent them from starting if the coolant level is too low. A good rule of thumb is to have enough oil and coolant on hand for at least one week of constant duty.
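The one-week rule of thumb is simple arithmetic: a week of constant duty is 168 hours, so the reserve is the hourly consumption rate times 168. The sketch below works that out in Python. The function names and the 0.25 liters-per-hour oil figure are hypothetical, chosen only for illustration; use the rates from your generator's documentation or your own run logs.

```python
HOURS_PER_WEEK = 7 * 24  # one week of constant duty = 168 hours


def reserve_needed(consumption_per_hour, hours=HOURS_PER_WEEK):
    """Quantity to keep on hand to cover `hours` of continuous running."""
    if consumption_per_hour < 0:
        raise ValueError("consumption rate cannot be negative")
    return consumption_per_hour * hours


def shortfall(on_hand, consumption_per_hour, hours=HOURS_PER_WEEK):
    """How much more must be stocked to meet the reserve (0.0 if already covered)."""
    return max(0.0, reserve_needed(consumption_per_hour, hours) - on_hand)


# Hypothetical figure: the generator burns 0.25 liters of oil per hour.
oil_rate = 0.25
print(reserve_needed(oil_rate))   # prints 42.0 (liters for one week)
print(shortfall(30.0, oil_rate))  # prints 12.0 (liters still to stock)
```

The same calculation applies to coolant, using whatever top-up rate the generator actually exhibits during extended runs.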
The automatic transfer switches that sense when utility power fails and start the generator also require ongoing maintenance. These switches contain parts and connections that can and will fail, so they must be inspected and serviced regularly.
Uninterruptible power supplies (UPSs) are likewise essential at mission-critical facilities, and they all share a component that can fail: batteries. Batteries have a finite life span and must be tested regularly. Battery monitors are a good idea, as they help identify problems before they become critical, thus increasing reliability.
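As a rough illustration of how a battery monitor can catch problems early, the sketch below compares each battery's latest reading against its own baseline and flags units that have drifted too far. Everything here is an assumption for the sake of the example: the function name, the choice of internal resistance as the tracked measurement, the sample values and the 25% tolerance. Real monitoring products and thresholds will differ, so follow your battery vendor's guidance.

```python
def flag_batteries(baseline, latest, tolerance=0.25):
    """Return IDs of batteries whose latest reading exceeds the baseline by
    more than `tolerance` (as a fraction), or that have no reading at all."""
    flagged = []
    for unit_id, base in baseline.items():
        reading = latest.get(unit_id)
        if reading is None:
            flagged.append(unit_id)  # a missing reading needs attention too
            continue
        if base > 0 and (reading - base) / base > tolerance:
            flagged.append(unit_id)
    return flagged


# Hypothetical internal-resistance readings in milliohms.
baseline = {"B1": 4.0, "B2": 4.0, "B3": 4.0}
latest = {"B1": 4.2, "B2": 5.5}  # B3 did not report

print(flag_batteries(baseline, latest))  # prints ['B2', 'B3']
```

Comparing each unit against its own history, rather than against a single fixed limit, is what lets a monitor surface a weakening battery before it fails outright.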
Numerous components may be involved in actually bringing power to the data center floor, including power distribution units, remote power panels, distribution panels and various power cables and power strips. This is no place to skimp on quality, because even state-of-the-art generators, utility feeds and UPSs will do no good if the final connection hinges on inadequate cabling, breakers and distribution systems.
For example, the breakers supporting your servers should be tested bolt-in breakers. While the snap-in variety used in residential applications is cheaper and easier to install, it is far less reliable; we’ve seen failure rates of 20% to 50%. In short, don’t leave millions of dollars in infrastructure at the mercy of a $4 power strip with a 10-cent push-out circuit breaker and 20-cent switch.
For more tips on how to avoid disasters in your data center, check out the Schneider Electric white paper, A Practical Guide to Disaster Avoidance in Mission-Critical Facilities.