Have you ever looked in your home’s circuit breaker panel and counted all the branch breakers in there? You may have 20 breaker positions, most of them 15 and 20 amp breakers. If you add up all the breaker ratings, you end up with something like 3 to 4 times the current rating of the main breaker (for example, a 100A main in a 20-position panel).
This is similar to what happens with server power supply units (PSUs) when you distribute them within a typical IT rack in a data center. If you add up the power ratings (in watts) of the individual PSUs, you find that the total rated PSU capacity is on the order of 3 to 6 times the total power capacity of the circuit(s) feeding the rack.
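To make the comparison concrete, here is a quick Python sketch that sums the ratings the way I describe above. The breaker and PSU ratings below are made-up illustrative values, not measurements from any particular panel or rack.

```python
# Illustrative only: all ratings here are made-up example values.

# A 20-position residential panel with a 100 A main breaker.
branch_breakers_amps = [15] * 12 + [20] * 8   # twelve 15 A and eight 20 A breakers
main_breaker_amps = 100

total_branch = sum(branch_breakers_amps)
print(f"Sum of branch ratings: {total_branch} A "
      f"({total_branch / main_breaker_amps:.1f}x the {main_breaker_amps} A main)")

# The same arithmetic for a rack: PSU nameplate watts vs. the feed circuit.
psu_ratings_watts = [750] * 20                 # twenty servers, one 750 W PSU each
circuit_capacity_watts = 5000                  # e.g., a 208 V / 30 A feed derated to ~5 kW

total_psu = sum(psu_ratings_watts)
print(f"Sum of PSU ratings: {total_psu} W "
      f"({total_psu / circuit_capacity_watts:.1f}x the {circuit_capacity_watts} W circuit)")
```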
Why does this happen? In my research into this phenomenon, I found two possible reasons.
The first is the diversity of loads, the same effect you have in your circuit breaker panel. Diversity of loads basically means that electrical loads don’t always draw the same amount of power. Some loads, like constant-speed fans, may run at their maximum power all the time, so their diversity is 100%. But most loads fluctuate. A compressor in a cooling system may only draw its maximum power for one hour per day, so its diversity is 1/24, or 4.2%. In each case, the branch breaker needs to be sized to support the maximum power draw of its load. However, because loads peak at different, largely random times, the main circuit breaker can be sized much smaller than the sum of the individual branch breakers.
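If it helps to see the arithmetic, here is a minimal sketch of the diversity calculation I’m describing. The loads and hours are hypothetical.

```python
# Hypothetical loads: (name, max power draw in watts, hours per day at that max).
loads = [
    ("constant-speed fan", 500, 24),   # runs at max all day -> 100% diversity
    ("compressor",        3000,  1),   # at max one hour per day -> ~4.2%
]

for name, max_watts, hours_at_max in loads:
    diversity = hours_at_max / 24      # fraction of the day spent at maximum power
    print(f"{name}: breaker sized for {max_watts} W, diversity = {diversity:.1%}")
```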
In my analogy, the branch breakers are like the PSUs, and like the branch breakers, the PSUs have loads, in this case servers. Every PSU supports a server (forget blade servers for now). Most servers’ power draw fluctuates over time (a server may consume 200 watts for 8 hours but peak at 400 watts for 1 hour). Server vendors need to account for this diversity by providing enough PSU capacity to support this peak power.
I wanted to better understand just how “peaky” servers are and the impact that has on the PSU rating, so I analyzed server power data from Spec.org. A disclaimer about this data: server vendors who submit their server data are likely to configure their servers to optimize performance per watt. But for this analysis that’s perfect, because I can assume that an efficient PSU was chosen.
I took data (server idle watts, server max watts, and PSU watts) from 26 servers submitted between Q4 2014 and Q1 2016. I also compared this to older server data from 20 servers submitted between Q3 2008 and Q4 2008. The first thing I concluded is that the older servers had an average peak-to-idle ratio of only 1.8, compared to an average of 4.8 for the new servers. This is not a surprise given the proliferation of power management schemes to reduce energy use. Then I looked at the PSU power rating (all at 1N redundancy) and found that the new servers had a PSU rating that was, on average, 3.5 times the maximum server power consumption. This was only slightly higher than the average for the older servers, which was 2.9. One would expect these ratios to be closer to 1.5 if server vendors were sizing PSUs to the maximum server load and maximum PSU efficiency. Based on this data, my conclusion is that while the diversity of server power consumption is greater for new servers, the small change in the PSU-to-peak ratio doesn’t suggest that diversity is the main reason for oversized PSU ratings.
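For anyone who wants to reproduce the arithmetic, here is a sketch of the ratio calculations on the kind of rows I pulled from Spec.org. The three example rows are placeholders, not the actual submissions I analyzed.

```python
# Placeholder rows, NOT the actual SPEC submissions:
# (server idle watts, server max watts, PSU nameplate watts at 1N redundancy)
servers = [
    (55, 270, 900),
    (48, 240, 800),
    (70, 330, 1100),
]

# Peak-to-idle ratio measures how "peaky" a server's power draw is.
peak_to_idle = [s_max / s_idle for s_idle, s_max, _ in servers]

# PSU-to-peak ratio measures how much the PSU exceeds the server's max draw.
psu_to_peak = [psu / s_max for _, s_max, psu in servers]

avg = lambda xs: sum(xs) / len(xs)
print(f"Average peak-to-idle ratio: {avg(peak_to_idle):.1f}")
print(f"Average PSU-to-peak ratio:  {avg(psu_to_peak):.1f}")
```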
In my next blog, I’ll discuss the second possible reason for this oversizing. In the meantime, for more information on this analysis, see White Paper 228, Analysis of Data Center Architectures Supporting Open Compute Project (OCP).