When It Comes to Data Center Design, Never Make Assumptions


“Never assume,” my old boss used to say to anyone who would listen, “it makes an a** out of you and me.” It might be an old adage, but it’s well suited to the design assumptions that pervade many data centers today. When you consider the billions that have been invested in advancing the efficiency and capabilities of IT in recent years, it beggars belief that the processes by which some data center designs are arrived at are so, well, Neanderthal.

The urgent need to modernize the design process was never more apparent to me than during two separate meetings I had on a recent visit to the US. The cultural divide between IT and facilities was underlined not just by the difference in age and experience, but also by the ability to think quickly and adopt new ideas.

On the one hand, in a meeting with a major online business which serves a primarily youthful demographic, I definitely felt like the accidental grown-up! I was far more experienced in infrastructure than my 20-something counterpart, who nonetheless seemed to have forgotten more about IT than I’ll ever know. But coming from that “side,” the focus was all about IT – how to run it effectively at the lowest possible cost.

On the other hand, there was a meeting with a long-established facilities company in which I was not only the youngest participant, but also felt like the least qualified to contribute. Here was a bunch of guys who knew everything about power and cooling, plant and equipment, capex and logistics, yet they failed to demonstrate the least bit of interest in IT.

And that’s the rub. Despite all the experience, the high cost of energy and infrastructure and the low cost of servers, their capacity calculations exist simply to provide a big, fat safety net for the servers. As you probably know, before a server comes into the data center, a lot of safety margin has already been built in by the manufacturer – by their CPU, memory, hard drive and motherboard teams. So while its true maximum power draw might be 380 watts, the nameplate says 1000 watts. And that’s the number the facility designer starts with.

Then, before the server is commissioned, they add another 25%, making the power capacity reservation 1250W. Furthermore, they then de-rate the breakers by 20%, so now you have roughly 1560W (1250W divided by 0.8) reserved for the server, despite the fact that the maximum power it’s ever going to consume is 380W. That’s more than four times what it will ever draw!
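To make the stacking of margins concrete, here is a minimal Python sketch of the per-server arithmetic. The function and parameter names are hypothetical, and it assumes that "de-rating the breakers by 20%" means dividing the reservation by 0.8, which is the reading that reproduces the roughly 1560W figure quoted here.

```python
def reserved_power_w(nameplate_w, commissioning_margin=0.25, breaker_derate=0.20):
    """Power reserved per server once each safety margin has been stacked on.

    Assumption: the breaker de-rate is applied by dividing by (1 - derate),
    i.e. the breaker may only be loaded to 80% of its rating.
    """
    with_margin = nameplate_w * (1 + commissioning_margin)  # nameplate + 25%
    return with_margin / (1 - breaker_derate)               # then / 0.8


ACTUAL_PEAK_W = 380   # true maximum draw of the server in the article
NAMEPLATE_W = 1000    # figure printed on the nameplate

reserved = reserved_power_w(NAMEPLATE_W)
ratio = reserved / ACTUAL_PEAK_W

print(f"Reserved per server: {reserved:.1f} W")
print(f"Reservation is {ratio:.2f}x the actual peak draw")
```

The unrounded reservation works out slightly above the rounded 1560W used in the text; the ratio to the actual peak draw is a little over four either way.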

Higher up the power chain, you may also put a max load limit of 70% on your UPS, reducing its capacity from, e.g., 10kW to 7kW. So by applying all the design assumptions that have been made, only 4 servers can be safely connected to this UPS (7kW divided by 1560W) rather than 18 servers (7kW divided by 380W) or, with a full utilization policy, 26 servers (10kW divided by 380W). Now that’s what you call waste!
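The three server counts above can be checked with a short Python sketch. The helper name and its parameters are hypothetical; the inputs are the article's own figures, and integer arithmetic is used so that partial servers are always rounded down.

```python
def servers_supported(ups_capacity_w, per_server_w, max_load_percent=100):
    """Whole servers a UPS can carry under a given load limit.

    Integer math throughout: a fraction of a server does not count.
    """
    usable_w = ups_capacity_w * max_load_percent // 100
    return usable_w // per_server_w


UPS_CAPACITY_W = 10_000   # 10kW UPS from the article
RESERVED_W = 1560         # per-server reservation after all margins
ACTUAL_PEAK_W = 380       # per-server true maximum draw

scenarios = {
    "70% UPS limit, nameplate-plus reservation": servers_supported(UPS_CAPACITY_W, RESERVED_W, 70),
    "70% UPS limit, actual peak draw": servers_supported(UPS_CAPACITY_W, ACTUAL_PEAK_W, 70),
    "full utilization, actual peak draw": servers_supported(UPS_CAPACITY_W, ACTUAL_PEAK_W, 100),
}

for label, count in scenarios.items():
    print(f"{label}: {count} servers")
```

Swapping in measured peak draw for your own servers, rather than the illustrative 380W, shows how much UPS capacity the stacked assumptions leave stranded.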

And, of course, it’s complete madness to our online buddies. Their servers do not directly generate any revenue, so both the cost of the IT and the operating expenses are overhead. The lower they can drive these costs, the more profit they derive.

So what would happen in one of the data centers described above, if they suddenly were endowed with the capability to look not only at the cost of powering and cooling the server load, but to investigate device utilization as well?

You can imagine a few heads would roll if the waste inherent in safety-net over-provisioning, together with the waste that results from an under-utilized load, were brought to the attention of corporate management. The sad fact is that this is an all-too-common scenario in many of today’s data centers, despite PUEs that would indicate well-run and efficient facilities.

Simply put, the data center has to be designed and built with the processing requirement at the very heart of the specification. The actual power and cooling needs of the IT load, that is, and not some design assumption based upon misleading information such as average rack power density or a nameplate-plus specification. And adding in utilization data lets you see just how big that server estate needs to be.
