Results are in! Our study measuring the different levels of data center air containment


In an earlier blog, I explained the problem with “percent leakage” as a metric for air containment and proposed some more practical metrics. To quantify these metrics, we set up a test at the Innovation Executive Briefing Center in St. Louis. In this blog, I discuss the study we performed in our cooling lab, with the help of Scott Buell and his awesome team.

Our containment study: The setup

We set up a 2-row pod, including four racks in each row with hot aisle containment (HAC) ducted back to the CRAH.  Although we had a raised floor, we took a hard floor approach where the CRAH supply air flooded the room (no perforated tiles anywhere). This “bowling alley” layout, where the CRAH supply jet crashes against the side of the pod, provided a worst case for our testing. We had 6 temperature sensors attached vertically across each rack. Other sensor arrays were placed in the hot aisle, CRAH return, CRAH supply, and in the cold aisle. We used a power meter for the CRAH and monitored power for all the server simulators. We also used a very sensitive pressure sensor in the hot aisle that gave us an indication of the balance between CRAH supply air and IT airflow. This pressure sensor becomes less useful with less containment, but if you have 100% blanking panels, use aisle doors, and duct the hot or cold aisle, I would recommend using a pressure sensor. All this data was collected with two data loggers and fed back to a workstation.
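As a rough illustration of how such a hot-aisle pressure reading can be interpreted, here’s a minimal sketch. The sign convention (hot aisle minus room) and the deadband value are my own assumptions for the example, not settings from our study:

```python
def airflow_balance(dp_pa: float, deadband_pa: float = 0.5) -> str:
    """Classify CRAH-vs-IT airflow balance from a hot-aisle
    differential pressure reading (hot aisle minus room, in Pa).

    The deadband is an illustrative assumption, not a study value.
    """
    if dp_pa > deadband_pa:
        # Hot aisle pressurized: the IT equipment is pushing more air
        # into the contained aisle than the CRAH is drawing out.
        return "IT airflow exceeds CRAH airflow"
    if dp_pa < -deadband_pa:
        # Hot aisle depressurized: the CRAH is pulling more air than
        # the IT equipment supplies, drawing leakage into the aisle.
        return "CRAH airflow exceeds IT airflow"
    return "approximately balanced"
```

A reading that hovers near zero is the condition we used as a sign that CRAH and IT airflows were matched.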

Removing variables – controlling the test environment

Our tests started with finding the minimum airflow required to cool the servers in an ideal “airtight” case (yes, with lots of tape). Then we introduced practices that result in leakage. In all, we ran 44 cases. The ideal “airtight” case started with a standard HyperPod containment solution with brush strips, doors, a hot aisle duct to the CRAH return, and 8 racks at densities of 7, 2, 5, 9.5, 2, 12, 4, and 3.5 kW (a total of 45 kW, averaging 5.6 kW/rack). We used 15 server simulators, 10U each, to simulate the stated densities. Then, to make the pod airtight, we used cardboard and gaff tape to cover every penetration we could see. Sealing everything, including cable penetrations, took about one week. Every gap was covered in cardboard and tape except for the server simulators. I don’t recommend this… it’s super time consuming! Measurement is also time consuming because, after every change you make, you have to wait 25-100 minutes for the temperatures to reach steady state.
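One way to decide when a steady state has been reached is to watch a sliding window of recent readings and declare the system settled once the spread within the window falls below a small tolerance. This sketch is my own illustration; the window length and tolerance are assumptions, not parameters from our test procedure:

```python
def reached_steady_state(temps, window=10, tolerance_c=0.2):
    """Return True once the last `window` temperature samples (deg C)
    span less than `tolerance_c` degrees.

    Window length and tolerance are illustrative assumptions,
    not values from the study.
    """
    if len(temps) < window:
        # Not enough history yet to make a judgment.
        return False
    recent = temps[-window:]
    return max(recent) - min(recent) < tolerance_c
```

In practice you would feed this the rack-front sensor readings from the data loggers and only record a case’s results after it returns True.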

Note that when the test called for missing blanking panels, we color-coded the panels to ensure we consistently removed the same panels. We had three cases: 100% panels installed (186 total panels), 87% installed, and 60% installed.

Setting the benchmark with the ideal ‘airtight’ case

By changing the CRAH fan speed, we found the airflow at which there was nearly zero pressure difference between the hot and cold aisles (a good indication that the CRAH and IT airflows are about the same). Then we opened the CRAH chilled water valve to bring the average IT supply air temperature at the rack fronts to 21°C/70°F (our target for all cases). Once we measured our ideal “airtight” case, we were able to benchmark the subsequent “leaky” cases against this baseline. What were the metrics for our ideal airtight baseline?

  • Average IT supply temperature at the rack fronts: 21.1°C/70°F
  • Maximum temperature: 22.1°C/71.8°F
  • Standard deviation: 0.4°C/0.8°F

For reference, the average CRAH power consumption was 616 watts, chilled water valve 24% open, and CRAH fan speed 36%. With each case hereafter, we started the test with the same valve position and fan speed as the ideal airtight case, and then increased both from there to find the point that brought us closest to the ideal temperature and standard deviation.
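The three benchmark metrics above are straightforward to compute from the rack-front sensor array. A minimal sketch (the sample readings are hypothetical values for illustration, not our raw data):

```python
import statistics

def rack_inlet_metrics(readings_c):
    """Summarize rack-front supply temperatures (deg C) into the three
    metrics reported in the study: average, maximum, and standard
    deviation."""
    return {
        "average": statistics.mean(readings_c),
        "maximum": max(readings_c),
        "std_dev": statistics.stdev(readings_c),  # sample std dev
    }

# Hypothetical sensor values, not the study's raw data:
sample = [21.0, 21.2, 20.9, 21.3, 21.1, 22.1]
print(rack_inlet_metrics(sample))
```

In our setup this would run over all 48 rack-front sensors (6 per rack across 8 racks) once the temperatures reach steady state.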

The results are in – measured outcomes across 5 cases

Each of the following cases tested a question we had about air containment. Check out our findings.

Case 1 – What happens with the minimum possible containment?

  • Used: 60% blanking panels, no hot aisle ducting, pod doors closed, “V” shaped diffuser in front of CRAH
  • Result: average IT supply air temperature was 28.5°C/83.2°F with a standard deviation of 4.1°C/7.4°F.
  • Findings: this standard deviation is almost 10 times higher and the average temperature is 7.6°C/13.6°F higher than the “airtight” case.

Case 2 – What happens when I add the rest of the blanking panels?

  • Used: 100% blanking panels, no hot aisle ducting, pod doors closed, “V” shaped diffuser in front of CRAH 
  • Result: the average temperature decreased to 25.2°C/77.3°F with a standard deviation of 1.3°C/2.4°F
  • Findings: The standard deviation is only about 3 times higher and the average temperature is 4.3°C/7.7°F higher than the “airtight” case. So blanking panels gave us a big step in the right direction.

Case 3 – What if I fully ducted the hot aisle to the CRAH in Case 1?

  • Used: 60% blanking panels, full ducting, pod doors closed, “V” shaped diffuser in front of CRAH 
  • Results: the average temperature decreased to 22.3°C/72.2°F with a standard deviation of 2.3°C/4.1°F
  • Findings: The standard deviation is now over 5 times higher and the average temperature is 1.4°C/2.5°F higher than the “airtight” case. What this seems to indicate (for our test) is that 100% blanking panels reduce the temperature variation, while ducting reduces the average temperature.

Case 4 – What’s the impact of diffusing the CRAH supply air?

  • Used: 60% blanking panels, full ducting, pod doors closed, no diffuser in front of CRAH
  • Results: the average temperature increased to 23.2°C/73.7°F with a standard deviation of 2.9°C/5.2°F
  • Findings: the standard deviation is now almost 7 times higher and the average temperature is 2.3°C/4.1°F higher than the “airtight” case. This tells us that the diffuser helps improve both the variation and average temperature.

Case 5 – What if we combine all best practices except the diffuser?

  • Used: 100% blanking panels, full ducting, pod doors closed
  • Results: an average temperature of 21.3°C/70.3°F with a standard deviation of 0.6°C/1.1°F
  • Findings: The standard deviation is now 1.5 times higher and the average temperature is 0.4°C/0.7°F higher than the “airtight” case.
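The “times higher” comparisons quoted in the findings above can be reproduced directly from the reported standard deviations. This snippet just restates the numbers from the five cases against the airtight baseline:

```python
# Standard deviation of rack inlet temperature (deg C) per case,
# taken from the results above; the airtight baseline was 0.4 C.
BASELINE_STD_C = 0.4
CASE_STD_C = {
    "Case 1": 4.1,
    "Case 2": 1.3,
    "Case 3": 2.3,
    "Case 4": 2.9,
    "Case 5": 0.6,
}

for name, std in CASE_STD_C.items():
    # Ratio of each case's spread to the airtight baseline's spread.
    print(f"{name}: std dev is {std / BASELINE_STD_C:.2f}x the airtight case")
```

Running this gives ratios of roughly 10x, 3x, 6x, 7x, and 1.5x, matching the case-by-case findings.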

At the start of our testing, we assumed that air jets had little impact on temperatures when everything was well-sealed. Unfortunately, this was a bad assumption and we didn’t measure the impact of this in Case 5. Had we inserted a diffuser in the ideal airtight case and Case 5, I believe we would’ve seen an improvement in the standard deviation and average temperature.

Drawing conclusions from the 5 air containment cases

The table below summarizes the values for each experiment:

The table includes the maximum rack temperatures for each case. Maximum temperature is obviously important if you care about hot spots. Chances are that if you’re experiencing hot spots, you’re missing blanking panels. So, what does this tell us? It tells us that blanking panels are a huge driver in reducing maximum inlet temperatures. Consider Cases 2 and 3. Case 2 has no return duct but has 100% blanking panels. Case 3 has a fully ducted return but is missing 75 of the 186 panels. Yet Case 2 (un-ducted) has a lower maximum temperature than Case 3. But notice that the average for Case 3 is lower than for Case 2, thanks to the fully ducted return.

In comparing the effect of the diffuser (Case 3 and 4), we see that the diffuser has a significant impact on maximum temperature as well. But this effect will be highly variable in your data center. Finally, in comparing the ideal case with Case 5, we can see that going crazy with gaff tape and cardboard is questionable in terms of the return you get on standard deviation, average, and maximum inlet temperature improvements. Also remember that this was done with a worst case “bowling alley” layout, therefore the effect of jets in other layouts may not be as pronounced.

So, what’s next? Re-read this blog to make sure you’ve digested everything. Then watch for my follow-up with a summary of best practices you can use in your data center, based on the data and findings from this project. Please leave a comment below and let me know what you think of our containment test and the results we documented! Or check out other blog posts from the Data Center Science Center team.
