Answering the top FAQs on AI and liquid cooling


Data centers have traditionally relied on air-cooled servers to deliver compute power, but the acceleration of artificial intelligence (AI) deployments is fueling a trend toward liquid cooling. AI training requires substantial increases in compute, compelling chip developers to raise thermal design power (TDP), the maximum amount of heat a processor is expected to generate under sustained load and that its cooling system must therefore remove.

As chips get hotter, air cooling eventually becomes inadequate, making liquid cooling the only option for AI servers. Liquid cooling isn’t new; it has been used in high-performance computing (HPC) for many years. However, the switch to liquid cooling in traditional data center environments will require some rethinking by operators as they look to accommodate AI demand.


In my role within our CTO Office exploring new technologies, I’m fielding many questions on expanding the use of liquid cooling in data centers. See the top FAQs and my answers below.

As AI adoption accelerates, how urgent is the need for liquid-cooled servers?

Growing enterprise interest in deploying AI applications is accelerating demand for liquid-cooled servers. Liquids do a better job than air at dissipating heat, and many new chips being deployed for AI solutions cannot be cooled with air. As such, data center operators must quickly build the infrastructure to support liquid-cooled servers. While liquid cooling has been discussed as an option for years, AI is turning it from an option into a necessity.
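To put a rough number on why liquids dissipate heat better than air, the sketch below compares how much heat a cubic meter of each fluid absorbs per degree of temperature rise. The property values are approximate room-temperature textbook figures that I am assuming here, not numbers tied to any particular product, and the function names are only illustrative.

```python
# Approximate room-temperature fluid properties (assumed textbook values).
AIR_DENSITY = 1.2        # kg/m^3
AIR_CP = 1005.0          # J/(kg*K)
WATER_DENSITY = 997.0    # kg/m^3
WATER_CP = 4186.0        # J/(kg*K)


def volumetric_heat_capacity(density, specific_heat):
    """Heat absorbed per cubic meter of fluid per kelvin of rise, in J/(m^3*K)."""
    return density * specific_heat


def water_to_air_ratio():
    """How many times more heat water carries than air, per unit volume."""
    air = volumetric_heat_capacity(AIR_DENSITY, AIR_CP)
    water = volumetric_heat_capacity(WATER_DENSITY, WATER_CP)
    return water / air


if __name__ == "__main__":
    ratio = water_to_air_ratio()
    print(f"Water carries roughly {ratio:.0f}x more heat per unit volume than air")
```

With these assumed values the ratio comes out in the low thousands, which is why a thin water line can do the work of a very large volume of moving air.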

What are the forms of liquid cooling being deployed?

The most popular architecture currently involves attaching a cold plate to a component, such as a processor or memory. Treated water sent to the plate absorbs heat and carries it through a Technology Cooling System (TCS) loop to a coolant distribution unit (CDU), whose heat exchanger rejects the heat. Although a water-based fluid is the most prevalent, this architecture can also use an engineered fluid that boils within the cold plate and is then condensed within the CDU. Cold plates remove most of the server’s heat via a liquid, but air cooling is still needed for the components that have no cold plate attached.
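As a back-of-the-envelope illustration of the single-phase cold plate loop described above, the following sketch estimates how much water flow is needed to carry away a given heat load using the standard energy balance Q = ṁ · c_p · ΔT. The function name, the 1,000 W load, and the 10 K temperature rise are hypothetical values chosen for illustration; real designs depend on the cold plate, the fluid chemistry, and vendor specifications.

```python
WATER_CP = 4186.0       # J/(kg*K), approximate specific heat of water (assumed)
WATER_DENSITY = 997.0   # kg/m^3, approximate density of water (assumed)


def required_flow_lpm(heat_load_w, delta_t_k, cp=WATER_CP, density=WATER_DENSITY):
    """Estimate coolant flow in liters per minute from Q = m_dot * cp * delta_T.

    heat_load_w: heat to remove, in watts
    delta_t_k:   allowed coolant temperature rise across the cold plate, in kelvin
    """
    if heat_load_w < 0:
        raise ValueError("heat_load_w must be non-negative")
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    mass_flow_kg_s = heat_load_w / (cp * delta_t_k)
    volume_flow_m3_s = mass_flow_kg_s / density
    return volume_flow_m3_s * 1000.0 * 60.0  # m^3/s -> L/min


if __name__ == "__main__":
    # Hypothetical example: a 1,000 W processor with a 10 K coolant rise.
    print(f"{required_flow_lpm(1000.0, 10.0):.2f} L/min")
```

Under these assumptions the result is on the order of one to two liters per minute per kilowatt. The sketch does not cover the two-phase (boiling) fluids mentioned above, which rely on latent heat rather than a temperature rise.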

Another method is direct immersion cooling, which involves placing a server in a chassis or tank filled with dielectric liquid to cool the entire server. The dielectric fluid, usually a synthetic oil, removes all the heat. Although this method is currently not as popular as cold plates, it has two advantages: it requires no airflow through the server, and it provides high thermal stability across the whole server.

How can existing data centers support the deployment of high-density AI servers with liquid cooling?

Connecting a CDU to an existing facility water loop is preferred, allowing the heat from the TCS loop to go directly into the facility cooling system. If a connection to a facility water system is not available, a data center operator can deploy a liquid-to-air CDU, which rejects the heat to air. With new data center construction, operators can design their water systems to optimally support liquid cooling.

What is a CDU, and what is its function?

The CDU, or coolant distribution unit, controls the temperature, chemistry, and flow of liquid to the server that is being cooled. CDUs perform a function much like that of a transformer regulating voltage. In this case, the CDU uses a heat exchanger and pumps to regulate the flow and temperature of the fluid delivered to the equipment for cooling. It also isolates the TCS loop from facility systems via the integrated heat exchanger, which is liquid-to-liquid or liquid-to-air.

When building a new data center, how should operators plan the mix of air vs. liquid-cooled servers?

It’s important to understand that data centers are not going to be 100% liquid-cooled, even though liquid cooling is becoming a must for high-density environments. The challenge for designers is to build flexibility into their cooling plants. Soon, this will become easier as cooling equipment is introduced that allows switching between air and liquid cooling. The ability to switch will not only deliver much-needed flexibility, but also drive efficiencies that support data center sustainability strategies.

Is it possible to cool liquid-cooled servers with 100% free cooling?

Yes, but this depends on server requirements and the climate in the data center’s location. Free cooling uses outdoor air to lower the temperature of the fluid used for liquid cooling. Using outdoor air is a more economical cooling method and is well suited to locations with lower temperatures. However, water used for cooling chips doesn’t have to be cold. Temperatures exceeding 100°F (about 38°C) are acceptable, depending on the processor in use. This means it’s possible to use free cooling even in warm climates, which helps control data center cooling costs and improve overall sustainability.
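The reasoning above can be sketched as a simple check: free cooling works whenever the outdoor air, plus some margin for the heat rejection equipment, is still cooler than the warmest coolant the processors accept. The 5 K margin (often called an approach temperature) and the sample outdoor temperatures are assumptions made for illustration, and the function names are hypothetical; an actual study would use site weather data and the equipment’s rated performance.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def free_cooling_possible(outdoor_c, max_supply_c, approach_k=5.0):
    """True if outdoor air can cool the fluid to the allowed supply temperature.

    approach_k is an assumed margin between outdoor air temperature and the
    coldest fluid temperature the heat rejection equipment can deliver.
    """
    return outdoor_c + approach_k <= max_supply_c


def free_cooling_fraction(outdoor_temps_c, max_supply_c, approach_k=5.0):
    """Fraction of temperature samples for which free cooling is sufficient."""
    temps = list(outdoor_temps_c)
    if not temps:
        raise ValueError("outdoor_temps_c must not be empty")
    ok = sum(1 for t in temps if free_cooling_possible(t, max_supply_c, approach_k))
    return ok / len(temps)


if __name__ == "__main__":
    max_supply_c = fahrenheit_to_celsius(100.0)  # the 100°F figure from the text
    sample_temps_c = [10.0, 30.0, 35.0, 40.0]    # hypothetical outdoor readings
    print(free_cooling_fraction(sample_temps_c, max_supply_c))
```

The higher the acceptable coolant temperature, the more hours of the year pass this check, which is the mechanism behind the cost and sustainability benefit.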

Resources on liquid cooling and AI-ready data centers

Adapting to the increasing demands generated by AI will continue to be challenging. Thankfully, solutions like liquid cooling can play an important role in adapting to higher-density computing demands. Discover more information about liquid cooling for data centers and transitioning to AI-ready data centers. Also, feel free to ask additional questions using the comment option below. Let’s keep the conversation going.
