Daily Archives: October 2, 2023

Cooling with Liquids

As data centers worldwide consume ever more power, they generate ever more heat, and removing that heat is becoming a major concern. As a result, operators are turning to liquid cooling. The trend became evident when the global investment company KKR acquired CoolIT Systems, a company that has made liquid cooling gear for the past two decades. With this investment, CoolIT will scale up its operations for global customers in the data-center market. According to CoolIT, liquid cooling will play a critical role in reducing the emissions footprint as data and computing needs increase.

Companies investing in high-performance servers are already investing in liquid cooling as well. These servers typically have CPUs consuming 250-300W and GPUs consuming 300-500W each. For demanding workloads such as AI training, servers often require up to eight GPUs, so they could be drawing 7-10kW per node.
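As a rough illustration of the arithmetic behind these figures, the sketch below adds up only the CPU and GPU draw for one node. The dual-socket CPU count and the function name are assumptions made for the example; the per-device wattages and the eight GPUs come from the text.

```python
def cpu_gpu_power_w(n_cpus, cpu_w, n_gpus, gpu_w):
    """Combined CPU and GPU power draw of one node, in watts."""
    return n_cpus * cpu_w + n_gpus * gpu_w

# Assumption: a dual-socket node. The eight GPUs and the wattage
# ranges are the figures quoted in the text.
low = cpu_gpu_power_w(n_cpus=2, cpu_w=250, n_gpus=8, gpu_w=300)
high = cpu_gpu_power_w(n_cpus=2, cpu_w=300, n_gpus=8, gpu_w=500)

print(f"CPU + GPU subtotal: {low / 1000:.1f}-{high / 1000:.1f} kW")
```

This subtotal covers processors and accelerators only. The text does not break down the rest of the 7-10kW per-node figure; memory, networking, storage, and power-conversion losses are plausible contributors.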

Additionally, as data centers increase their rack densities and use more memory per node along with higher-performance networking, the power requirements of servers go up significantly. With the current shift to higher chip and package power densities, liquid cooling is turning out to be the preferred option, as it removes heat far more efficiently than air.

Depending on the application, companies are opting for either direct-contact liquid cooling or immersion cooling. Using direct-contact liquid cooling, also known as direct-to-chip cooling, companies like Atos/Bull have built their own power-dense HPC servers. They pack six AMD Epyc sockets with maximum memory, 100Gbps networking, and NVMe storage into a 1U chassis that they cool with a custom cooling manifold.

CoolIT supports direct cooling technology. It circulates a coolant, typically water, through metal plates attached directly to hot components such as GPUs or processors. According to CoolIT, this arrangement is easier to deploy within existing rack infrastructures.
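To give a feel for what a water loop has to do, the sketch below applies the standard heat-transport relation Q = m * c_p * dT to estimate the flow needed for a given heat load. The 10 K coolant temperature rise, the rounded water properties, and the function name are assumptions chosen for illustration, not CoolIT specifications.

```python
WATER_SPECIFIC_HEAT_J_PER_KG_K = 4186.0  # approximate value for liquid water
WATER_DENSITY_KG_PER_L = 1.0             # approximate value for liquid water

def required_flow_l_per_min(heat_load_w, delta_t_k):
    """Water flow needed to carry away heat_load_w watts when the
    coolant warms by delta_t_k kelvin on its way through the plates."""
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    mass_flow_kg_per_s = heat_load_w / (WATER_SPECIFIC_HEAT_J_PER_KG_K * delta_t_k)
    return mass_flow_kg_per_s / WATER_DENSITY_KG_PER_L * 60.0

# Assumed example: a 10kW node and a 10 K rise across the cold plates.
flow = required_flow_l_per_min(heat_load_w=10_000, delta_t_k=10.0)
print(f"{flow:.1f} L/min")
```

Under these assumptions, a 10kW node needs roughly 14 litres of water per minute. Allowing a larger temperature rise reduces the required flow proportionally.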

Immersion cooling, on the other hand, requires submerging the entire server node in a coolant, typically a dielectric (non-conductive) fluid. This arrangement calls for specialized racks, and the nodes may have to be positioned vertically rather than stacked horizontally. It is therefore easier to deploy this kind of system in newly built server rooms.

Cloud operators in Europe, such as OVHcloud, are combining both approaches in their systems. They attach water blocks to the CPU and GPU and immerse the rest of the components in the dielectric fluid.

According to OVHcloud, the combined system is much more efficient than air cooling. In its tests, the setup showed a partial power usage effectiveness (PUE) rating of 1.004. A partial PUE counts only the overhead of one subsystem, here the cooling system, so a value of 1.004 means that cooling consumes roughly 0.4% of the energy drawn by the IT equipment.
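The short sketch below shows how a partial PUE figure relates to cooling energy. It assumes the common definition of partial PUE as the IT energy plus the energy of the subsystem under consideration, divided by the IT energy. The 1000 kWh IT load is a made-up example value, and the function names are hypothetical.

```python
def partial_pue(it_energy_kwh, cooling_energy_kwh):
    """Partial PUE when cooling is the only overhead counted."""
    return (it_energy_kwh + cooling_energy_kwh) / it_energy_kwh

def cooling_overhead_fraction(ppue):
    """Cooling energy as a fraction of IT energy for a given partial PUE."""
    return ppue - 1.0

# The 1.004 rating is the figure quoted in the text.
overhead = cooling_overhead_fraction(1.004)
print(f"Cooling overhead: {overhead:.1%} of IT energy")

# Assumed example: 1000 kWh of IT energy and 4 kWh of cooling energy.
print(f"Partial PUE: {partial_pue(1000.0, 4.0):.3f}")
```

Read this way, a partial PUE of 1.004 corresponds to about 4 kWh of cooling energy for every 1000 kWh consumed by the servers themselves.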

However, the overall design must also account for the waste heat. For instance, merely dumping the heat into a lake or river can be harmful. Liquid cooling improves efficiency and also helps the environment, as it reduces the need to run compressor-based cooling. Instead, heat exchangers can keep the temperature of the cooling loop low enough.