Upgrade legacy data centers for AI workloads with RDHx liquid cooling


It wasn’t that long ago that you could operate a data center and plan for three, four, or many more IT refresh cycles without any major power or cooling upgrades. We are now in a new era of accelerated-compute AI where power densities are growing fast, from an average of around 10 kW per rack for CPU-based cloud servers a few years ago to over 140 kW per rack for the new NVIDIA GB300 NVL72 (72 GPUs) for AI.

However, these systems represent the leading edge of accelerated-compute AI performance and cost, and they are best suited for the most advanced model training or fully autonomous agentic AI inference. They are also liquid cooled.

We are at a point in time where enterprise and colocation data center operators are deploying working AI inference models for product design and development, improved customer experience, and increased employee productivity and innovation. These models can even handle compute-intensive, multimodal workloads (image to text, text to video, etc.). Companies deploying inference models ideally want them to be efficient, so the models should be compressed and tuned to “fit” the application. This allows rightsizing, or optimizing, the accelerated-compute IT stack to operate at a lower kW per rack.

AI inference cooling solutions

There’s the question of whether the AI inference servers are air cooled or liquid cooled. The good news is that many of these servers come in air-cooled versions, and cooling solutions exist for air-cooled servers that can support densities up to 72 kW per rack. The challenge for existing data centers is retrofitting their cooling systems to support these “mid-density” AI racks.

The performance of the solution depends on which type of chiller and outdoor heat rejection system is in place. Popular outdoor chilled water heat rejection systems include cooling towers, dry coolers (often using glycol), and adiabatic dry coolers.

Retrofitting AI servers with an existing outdoor chilled water heat rejection system

For air-cooled servers in a data center that has an outdoor chilled water system, you can use a “bolt-on” solution called a rear door heat exchanger (RDHx). The RDHx resembles a large, high-tech radiator and is mounted on the rear door of a server rack. As the hot air from the servers passes over the coils, the heat from the air is transferred to the chilled water in the coils. This warmer water flows out of the RDHx and returns to your building or dedicated data center heat rejection system to be re-chilled. Depending on the size and type of the heat exchanger, an RDHx like the Motivair ChilledDoor® can cool up to 72 kW per rack. (Schneider Electric acquired a controlling interest in Motivair earlier this year.) RDHxs are efficient because they cool equipment at the source, reducing the need for energy-intensive, room-level fans.
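For a rough sense of what a 72 kW rack asks of the chilled water loop, the heat absorbed by the RDHx follows the familiar Q = ṁ · cp · ΔT relationship. The sketch below uses illustrative values only (the 10 °C water temperature rise is an assumption, not a Motivair specification):

```python
# Rough sizing sketch for a rear door heat exchanger (RDHx).
# Illustrative numbers only -- not manufacturer specifications.

def rdhx_water_flow_lpm(heat_kw: float, delta_t_c: float) -> float:
    """Chilled water flow (liters/minute) needed to absorb heat_kw
    with a water temperature rise of delta_t_c (deg C).
    Q = m_dot * cp * dT, with cp of water ~4.186 kJ/(kg*K) and ~1 kg per liter."""
    cp_kj_per_kg_k = 4.186
    kg_per_second = heat_kw / (cp_kj_per_kg_k * delta_t_c)
    return kg_per_second * 60  # ~1 L per kg of water

# Example: a 72 kW rack with an assumed 10 deg C rise across the RDHx coil
print(round(rdhx_water_flow_lpm(72, 10), 1), "L/min")  # ~103 L/min
```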

For multiple racks of accelerated AI compute fitted with RDHxs, it is convenient to add a liquid-to-liquid coolant distribution unit (CDU), either in each individual rack or as a larger CDU supporting multiple racks. The CDU is a heat exchanger with pumps that creates isolated liquid loops. The primary loop connects to the facility’s chilled water system. The secondary loop is a closed loop that circulates cooled liquid to the RDHx; as hot server exhaust passes across the RDHx coils, the liquid warms and returns to the CDU, where the heat is transferred to the primary loop and rejected through the facility’s chilled water system.
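A practical check when planning the CDU loops: the secondary (rack-side) supply temperature is roughly the facility chilled water supply temperature plus the CDU heat-exchanger approach, and it should stay above the room dew point to avoid condensation on coils and piping. The values in this sketch are assumptions for illustration only:

```python
# Minimal sketch of CDU loop planning. All temperatures are assumed values,
# not design guidance for a specific facility or product.

def secondary_supply_c(primary_supply_c: float, cdu_approach_c: float) -> float:
    """Estimated coolant temperature delivered to the RDHx on the secondary loop."""
    return primary_supply_c + cdu_approach_c

facility_chw_supply_c = 18.0   # assumed facility chilled water supply
cdu_approach_c = 2.0           # assumed liquid-to-liquid heat exchanger approach
room_dew_point_c = 15.0        # assumed white-space dew point

supply = secondary_supply_c(facility_chw_supply_c, cdu_approach_c)
print(f"Secondary loop supply: {supply:.1f} C")
# Keeping the loop above the room dew point avoids condensation.
print("Above dew point:", supply > room_dew_point_c)
```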

Liquid-cooled servers without an existing chilled water system

Some servers are only available in liquid-cooled versions, but you may not want or need to convert your entire data center to liquid cooling; there is an interim step. Designed for air-cooled data centers where an outdoor chilled water heat rejection system is unavailable, Motivair’s Heat Dissipation Unit (HDU) lets you deploy liquid-cooled servers, in individual racks or across multiple racks, without impacting your facility infrastructure. Unlike a CDU, which rejects heat from the server rack to the building’s chilled water loop, an HDU rejects the server heat to the white space, where the building’s air conditioning then removes it. The capacity of the HDU is up to 150 kW per rack. However, the HDU does put an extra burden on the data center’s air-cooling system and will increase the data center’s overall Power Usage Effectiveness (PUE). The data center’s air-cooling system may not have excess capacity to support this increased heat load.
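To see why pushing rack heat into the white space tends to raise PUE relative to rejecting it directly to a chilled water loop, a back-of-the-envelope comparison helps. The COP figures below are assumptions chosen for illustration, not measured or vendor data:

```python
# Back-of-the-envelope comparison of where a 150 kW liquid-cooled rack's heat ends up.
# All figures are assumptions for illustration only.

rack_kw = 150.0
air_path_cop = 3.0     # assumed effective COP when heat is rejected via room air (HDU path)
water_path_cop = 6.0   # assumed effective COP when heat goes straight to facility water (CDU path)

hdu_cooling_kw = rack_kw / air_path_cop      # extra work for the room air-cooling system
cdu_cooling_kw = rack_kw / water_path_cop    # work if the same heat went to a chilled water loop

print(f"HDU path cooling energy:     ~{hdu_cooling_kw:.0f} kW")
print(f"Chilled-water path estimate: ~{cdu_cooling_kw:.0f} kW")
# The difference is cooling overhead that shows up directly in the facility's PUE.
```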

RDHx for data center retrofit cooling

Retrofitting existing racks of air-cooled servers with RDHxs is less costly and less disruptive than moving to liquid-cooled servers and adding manifolds, CDUs, and specialized chillers, or adding an HDU. Traditional air cooling is 30-60% less efficient than cooling with an RDHx, which means you can reject roughly double the heat using the same heat rejection system. Because RDHxs cool equipment at the source, they reduce the need for energy-intensive, room-level cooling systems. They can also use warmer inlet water temperatures, which supports free cooling initiatives and improves the facility’s PUE.
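One way to sanity-check the “roughly double” figure: if air cooling is p percent less efficient than an RDHx, the same heat rejection capacity supports about 1/(1-p) times the heat after the retrofit. Reading the 30-60% range this way is an assumption on our part; the quick calculation below is purely illustrative:

```python
# Rough check of the "roughly double" claim under an assumed reading of the
# 30-60% efficiency figure. Illustrative math only.

for pct_less_efficient in (0.30, 0.45, 0.60):
    multiplier = 1.0 / (1.0 - pct_less_efficient)
    print(f"Air cooling {pct_less_efficient:.0%} less efficient -> ~{multiplier:.1f}x heat with RDHx")
# Prints ~1.4x, ~1.8x, ~2.5x -- roughly double across the middle of the quoted range.
```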

In summary, drop-in RDHx solutions are ideal for managing high-density, air-cooled server workloads, such as those generated by AI inference, without completely overhauling the facility’s cooling infrastructure. RDHxs can be installed on all variations of standard IT racks; the ChilledDoor, for example, is also Open19 and OCP compliant, making it an effective solution for virtually any existing data center infrastructure.

To learn more about Schneider Electric’s AI data center cooling solutions, visit our web site.
