The Weird World Of Liquid Cooling For Datacenters | Hackaday

2022-06-16 09:19:38 By : Ms. COCO jiang

When it comes to high-performance desktop PCs, particularly in the world of gaming, water cooling is popular and effective. However, in the world of datacenters, servers rely on traditional air cooling more often than not, in combination with huge AC systems that keep server rooms at the appropriate temperature.

However, datacenters can use water cooling, too! It just doesn’t always look quite how you’d expect.

Cooling is of crucial importance to datacenters. Letting hardware get too hot increases failure rates and can even impact service availability. Cooling also uses a huge amount of energy, accounting for up to 40% of energy use in the average datacenter. This flows into running costs as well, as energy doesn’t come cheap.

Thus, any efficiency gains in cooling a datacenter can have a multitude of benefits. Outside of just improving reliability and cutting down on emissions through lower energy use, there are benefits to density, too. The more effective cooling available, the more servers and processing power that can be stuffed in a given footprint without running into overheating issues.

Water and liquid cooling techniques can potentially offer a step change in performance relative to traditional air cooling. This is because air has a far lower heat capacity per unit volume than water or other special liquid coolants, so it’s much easier to transfer a great quantity of heat into a liquid. In some jurisdictions, there is even talk of using the waste heat from datacenters to provide district heating, which is much easier with a source of hot liquid carrying waste heat vs. hot air.
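To put rough numbers on that difference, here is a minimal sketch of the steady-state relation Q = ρ · V̇ · c_p · ΔT, solved for the volume flow needed to carry a given heat load. The fluid properties are approximate textbook values near room temperature, and the 10 kW rack and 10 K temperature rise are hypothetical figures picked for illustration, not numbers from any particular datacenter.

```python
# Approximate fluid properties near room temperature (textbook values).
AIR = {"density": 1.2, "specific_heat": 1005.0}      # kg/m^3, J/(kg*K)
WATER = {"density": 997.0, "specific_heat": 4182.0}  # kg/m^3, J/(kg*K)


def volumetric_flow(heat_watts, delta_t_kelvin, fluid):
    """Volume flow (m^3/s) needed to carry heat_watts at a given coolant temperature rise."""
    return heat_watts / (fluid["density"] * fluid["specific_heat"] * delta_t_kelvin)


heat = 10_000.0  # hypothetical 10 kW rack (assumption)
delta_t = 10.0   # assumed 10 K temperature rise across the rack

air_flow = volumetric_flow(heat, delta_t, AIR)
water_flow = volumetric_flow(heat, delta_t, WATER)

print(f"air:   {air_flow:.2f} m^3/s")
print(f"water: {water_flow * 1000:.2f} L/s")
print(f"air needs about {air_flow / water_flow:.0f}x the volume flow of water")
```

Under these assumptions the same heat load takes roughly 0.83 cubic meters of air per second, versus about a quarter of a liter of water per second.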

However, liquid cooling comes with drawbacks, too. Leaks can damage electronics if not properly managed, and such systems typically come with added complexity versus running simple fans and air conditioning systems. Naturally, that improved cooling performance comes at a trade-off, else it would be the norm already.

The most obvious water-cooling approach for a datacenter would be to swap out fan coolers in servers for water blocks, and link up racks to water cooling circuits. This is achievable, with some companies offering direct-to-chip cooling blocks that can then be hooked into a broader liquid cooling loop in a supporting server rack. It’s the same theory as water cooling a desktop PC, replacing fans and heatsinks with water blocks instead. This method of directly water-cooling servers has the benefit that it can extract a lot of heat, with some claims as high as 80 kW per rack.

However, this approach comes with several drawbacks. It requires opening up and modifying servers prior to installation in the rack. This is undesirable for many operators, and any mistakes during installation can introduce defects that are costly to rectify in both time and equipment. Service and maintenance are also complicated by the need to break water cooling connections when removing servers, though this is assuaged somewhat by special “dripless” quick-connect fittings.

A less invasive method involves the use of regular air-cooled servers that are placed in special water-cooled racks. This method removes any need to modify server hardware. Instead, air-to-water heat exchangers mounted at the back of the server rack pick up the heat from the hot server exhaust and dump it into the liquid coolant. The exhaust air is thus chilled and returns to the room, while the coolant carries the waste heat away. Rooftop cooling towers, like the ones pictured at the top of this article, can then be used to extract the heat from the coolant before it’s returned. It’s not as effective as directly capturing the heat from an on-chip waterblock, but claims are that such systems can extract up to 45 kW of heat per rack.

In addition to using unmodified hardware, the system cuts down on the danger of leaks significantly. Any leaks that happen will be in the back of the server rack, rather than directly on the server’s circuit boards. Additionally, systems typically run at negative pressure so air is sucked in from any holes or damaged tubes, rather than liquid being allowed to leak out.

More extreme methods exist, too. Microsoft made waves by running a fully-submerged datacenter off the coast of Scotland back in 2018. With a cluster of conventional servers installed in a watertight tube, heat was rejected to the surrounding waters which kept temperatures very stable. The project ran for two years, and found that the sealed atmosphere and low temperatures were likely responsible for an eight-fold increase in reliability. Project Natick, as it was known, also promised other benefits, such as reduced land costs from locating the hardware offshore.

Microsoft isn’t resting on its laurels, though, and has investigated even wilder concepts of late. The company has developed a two-phase immersion cooling tank for datacenter use. In this design, conventional servers are submerged in a proprietary liquid developed by 3M, which boils at a low temperature of just 50 C (122 F). As the server hardware heats up, the liquid heats up until it boils. Boiling sucks up huge amounts of energy in what is called the latent heat of vaporization, the energy required to turn the liquid into gas. The gaseous coolant then reaches the condenser on the tank lid, turning back to liquid and raining back down on the servers below.
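To get a feel for why boiling soaks up so much heat, the sketch below compares latent heat against ordinary warming of a liquid. The article doesn’t give the properties of the 3M fluid, so this uses water purely as a stand-in because its values are well known; engineered dielectric fluids have different (and generally much lower) latent heats, so treat the output as an illustration of the principle only. The 1 kW heat load is likewise a hypothetical figure.

```python
def equivalent_temperature_rise(latent_heat, specific_heat):
    """Temperature rise (K) that would absorb as much heat as boiling the same mass of liquid."""
    return latent_heat / specific_heat


def boil_off_rate(heat_watts, latent_heat):
    """Mass of liquid (kg/s) vaporized to absorb heat_watts once the liquid is at its boiling point."""
    return heat_watts / latent_heat


# Water is used as a stand-in only; these are NOT properties of the 3M fluid.
WATER_LATENT_HEAT = 2_257_000.0  # J/kg at 100 C, approximate
WATER_SPECIFIC_HEAT = 4182.0     # J/(kg*K), approximate

rise = equivalent_temperature_rise(WATER_LATENT_HEAT, WATER_SPECIFIC_HEAT)
rate = boil_off_rate(1000.0, WATER_LATENT_HEAT)  # hypothetical 1 kW heat load

print(f"boiling absorbs as much heat as a {rise:.0f} K temperature rise")
print(f"1 kW boils off about {rate * 1000:.2f} g/s")
```

The first number is the point: for water, the phase change absorbs as much energy as heating the same mass by several hundred kelvin, all without the liquid getting any hotter than its boiling point.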

The immersion method makes for excellent heat transfer between the server hardware and the coolant. As a bonus, it doesn’t just cool down a small section of the CPU via a heatsink. Instead, the entire server is free to dump heat into the liquid. The hope is that this would allow an increase in hardware density in datacenters, as well as an increase in performance, as the high cooling capacity of the immersion method allows for better heat removal in a much smaller space.

Of course, it’s a complex and high-end solution that will take some time before it’s ready for the mainstream. Datacenter operators simply aren’t used to dunking their hardware in liquid, nor used to running them in sealed containers to allow such a system to work. It’s likely that there would also be some maintenance headaches, where immersion tanks would have to be switched off prior to opening them for physical service of the hardware inside.

As humanity continues to crave more computing power, and we strive to cut energy use and emissions, expect further developments in this space. Sheer competition itself is a big driver, too. Any company that can cut running costs, reduce land use, and find more performance will have an advantage over its rivals in the marketplace. Expect watercooling systems to become more mainstream over time, and some of the wackier ideas to find purchase if their major benefits are worth all the hassle. It’s an exciting time to work in datacenter engineering, that much is for sure.

No mention of dropping your server in an aquarium of baby oil? Non-conducting, better heat removal, what’s not to like except maybe parts replacement.

Mineral oil is fine for personal projects but I doubt anyone has ever done that on a commercial scale as there are actually a lot of downsides. Mineral oil is very good at wicking or climbing up porous materials which makes it surprisingly difficult to contain. And it’s surprisingly difficult to clean off once it’s gotten on something. Not all plastics play nice with all oils so you run the risk of accidentally weakening or even dissolving things like wire insulation, mounting brackets, or even capacitors. And of course oil is flammable.

So you’ve got hundreds of gallons of flammable liquid that keeps escaping its containers and is coating everything and it’s being kept hot by all the servers. All you need is a spark from a wire somewhere whose insulation cracked and you’ve got a nightmare of a fire. What’s not to like?

Actually, there’s a webhosting company in the Czech Republic that has been experimenting with oil submersion cooling for a couple of years now (and it’s certainly not a personal project, they’re one of the big players in CZ), so commercial scale usage is happening already. How much of a success it’s going to be remains to be seen, but they seem pretty invested at this point. Their blog has some updates on that topic from time to time:

Yes, they have been trying it for a couple of years with two blade servers from HP and basically got nowhere. I would really like to know how they solved the problem that once something has been in the oil bath, it is hard to clean and the oil just gets everywhere.

Not true. They use 3M cooling liquid, which is easily cleaned and is not oil based. 3M makes two such liquids; one is called Fluorinert, not sure about the other one.

That is the reason why they use 3M Fluorinert instead of oil.

It’s highly flammable. Can’t tell you how I know. But I know.

Global warming vs lol cats. It’s a hard choice.

It is so much more than just lol cats* vs the survival of the human species. It is not an easy choice.

* sad cat angry cat grumpy cat lime cat cold cat warm cat smart cat And do not get me started on kittens they are just krazy.

Ah the days of water-cooling a mainframe.

Everything old is new again.

Back in the ‘80s I was on an NSFnet committee that met at the Minnesota Supercomputer center. Seeing the liquid cooled Crays along with the ETA and IBM machines always amazed me.

Bah! That’s not old. The Univac I, the first commercial computer, had water cooling.

Just like everything being in the “cloud” – mainframe, and everyone using “terminals” to access it. Because you can charge for the time used on the mainframe.

It really is crazy how the cycle of computer costs (expensive, cheap, too cheap) has driven us back down that path.

I ran an overclocked Antminer Bitcoin miner submerged in mineral oil cooled by a car heater coil for 2 years without issue. That was over 10 years ago.

I think that’s somewhat under 10 years. Antminers didn’t come out until what, 2014?

Electric immersion heater computers, with an additional ethernet socket, or wifi maybe, for your NAS.

Your computing waste heat goes into your domestic hot water cylinder.

Down Under Geophysics has used immersion cooling in their data center for years.

The NSA was testing a mineral oil analogue for submersion use in their much loved Utah datacenter in 2014:

Another big change happening in the datacenter world is the “cold” side temps going higher.

Back in the day a datacenter could operate at 15-22 C at the cold side of the servers. Today, 30+ C isn’t uncommon, and some apparently look at 45 C. (with the requirement that the hardware in the datacenter is built for this higher ambient temperature.)

Having a higher cold side temp means that one often doesn’t need an AC to keep cool, even on a warmer day. Now, there are times outdoor temps can reach past the desired “cold” temp. Here we would need an AC again, but the temperature difference we try to pump heat over is much smaller, and therefore our coefficient of performance can be a fair bit higher.
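As a rough illustration of that coefficient-of-performance point, the ideal (Carnot) limit for a cooling machine is COP = T_cold / (T_hot − T_cold), with temperatures in kelvin. The sketch below is an upper bound only; real chillers reach a fraction of it, and the 40 C outdoor temperature and the two cold-side setpoints are assumed example values.

```python
def carnot_cooling_cop(cold_c, hot_c):
    """Ideal (Carnot) cooling COP when pumping heat from cold_c up to hot_c, both in Celsius."""
    cold_k = cold_c + 273.15
    hot_k = hot_c + 273.15
    if hot_k <= cold_k:
        raise ValueError("no heat pumping needed: outside is not warmer than the cold side")
    return cold_k / (hot_k - cold_k)


outdoor_c = 40.0  # assumed hot summer day

cop_cool_room = carnot_cooling_cop(20.0, outdoor_c)  # traditional cold aisle setpoint
cop_warm_room = carnot_cooling_cop(35.0, outdoor_c)  # warmer cold aisle setpoint

print(f"20 C cold side: ideal COP {cop_cool_room:.1f}")
print(f"35 C cold side: ideal COP {cop_warm_room:.1f}")
```

Shrinking the temperature lift from 20 K to 5 K raises the theoretical ceiling roughly fourfold, which is the effect described above.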

Downsides with higher ambient temps are numerous. Everything from shorter component life to sweaty technicians fainting in the warm aisle.

Personally, I think a good server shouldn’t complain if it is 35 C ambient. 45+ I can however agree is a bit warm.

I’ve always been interested in the whole “drop it in a tank of coolant” method, but it seems like the right liquid would be hard to find.

You’d want something non-polar, because that minimizes the opportunity for some contamination turning it into a conductor, but those tend to be oily, and it seems like a mess: if you ever need to fix anything, it’s going to take forever to get things clean enough to work on.

So the ideal coolant would evaporate and leave clean boards… oh, but wait… that means it would just evaporate out of the tanks…

I suppose there’s some kind of liquid that has a very heavy vapor, like the stuff they use for vapor-phase soldering, but it seems like maybe it’s not a great idea to be in a room with open tanks full of that stuff all day long.

I used to work for a defense contractor and we had some version of Coolanol coolant for airborne systems, but it was really unpleasant to work with and we were *really* motivated to keep it inside the pipes.

Anybody have any actual experience with anything that isn’t mineral oil that wasn’t uncomfortably ‘exotic’ ?

Well, back before CFCs (and HCFCs) became the baddie, dunking anything in a CFC to cool it was all the rage. Pick pretty much any temperature you want. Non-conductive, non-flammable…. only problem was people started to use it for *everything*, and then it started to leak, and then… if it had all been kept inside the pipes, we’d still be using it.

I wonder what formulation that 3M 50 degree stuff is?

One possibility is PF-5052, the data sheet of which says it’s mostly 2,2,3,3,5,5,6,6-octafluoro-4-(trifluoromethyl)morpholine.

Mmmm. I suppose coming from 3M is almost as bad as coming from Dupont. A reformulated CFC that carries a different name but has (mostly) the same properties and gets past the EPA. Good lawyers.

There are quite a few rear door heat exchangers out there. My day job consists of designing IT enclosures and we’ve been through a half dozen manufacturers of such technology. There are much cheaper and lower-maintenance means of cooling IT equipment, though. As stated above, the requirements for inlet temperature (finally) have been allowed to creep up, so “free” air cooling is getting more prevalent. ASHRAE says anything up to 94°F is allowable, but you walk into a datacenter and the cold side is frequently set to the mid to high 60s because airflow problems cause hot spots. They’re frequently haphazardly designed and poorly maintained.

Any time you don’t need to run a chiller pump your PUE goes down. With good containment design, good cabinet design, and use of heat exchangers rather than refrigeration units the up front costs are quickly paid for in energy savings.
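For anyone unfamiliar with the metric, PUE (power usage effectiveness) is total facility power divided by the power delivered to IT equipment, so 1.0 is the unreachable ideal. Here is a minimal sketch of how trimming cooling overhead moves the number; all of the kW figures are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def pue(it_kw, cooling_kw, other_overhead_kw=0.0):
    """Power usage effectiveness: total facility power divided by IT equipment power."""
    if it_kw <= 0:
        raise ValueError("IT load must be positive")
    return (it_kw + cooling_kw + other_overhead_kw) / it_kw


# Hypothetical facility: 1 MW of IT load, 100 kW of lighting and power-distribution losses.
with_chillers = pue(it_kw=1000.0, cooling_kw=500.0, other_overhead_kw=100.0)
with_heat_exchangers = pue(it_kw=1000.0, cooling_kw=150.0, other_overhead_kw=100.0)

print(f"PUE with chillers running:    {with_chillers:.2f}")
print(f"PUE with heat exchangers only: {with_heat_exchangers:.2f}")
```

In this made-up case, cutting cooling power from 500 kW to 150 kW drops PUE from 1.60 to 1.25.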

Using liquid cooling for beer making I can relate to the problems. First, anything “cold” will condense water which will then drip onto anything you don’t want to get wet. Also, every single “dripless” connector I’ve used for beer and other industrial applications isn’t perfect and still drips a little. Making beer it doesn’t matter much but pulling out a big expensive switch or something I’d be a lot more worried about water.

Just a mention because I don’t think it was touched on. Commercial A/C at this scale uses a large cold water loop to transfer heat out of the building. So some of these chillers could be using a heat exchanger that provides a secondary loop into the machines inside the building.

The power density becomes a problem too as you stack hundreds of AI GPUs or CPUs, each pulling hundreds of watts, into a single rack.
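A quick back-of-the-envelope check shows how fast that adds up. The sketch below totals a rack’s heat load and compares it with the per-rack figures claimed in the article (up to 45 kW for rear-of-rack heat exchangers, up to 80 kW for direct-to-chip loops). The device count and per-device wattage are hypothetical, and the claimed limits are vendor claims as reported, not verified ratings.

```python
# Per-rack extraction figures as claimed in the article (not verified ratings).
REAR_DOOR_CLAIM_KW = 45.0
DIRECT_TO_CHIP_CLAIM_KW = 80.0


def rack_power_kw(device_count, watts_per_device):
    """Total rack heat load in kW, assuming all electrical power ends up as heat."""
    return device_count * watts_per_device / 1000.0


def within_claims(load_kw):
    """Report which of the claimed per-rack cooling figures would cover the load."""
    return {
        "rear_door": load_kw <= REAR_DOOR_CLAIM_KW,
        "direct_to_chip": load_kw <= DIRECT_TO_CHIP_CLAIM_KW,
    }


load = rack_power_kw(device_count=100, watts_per_device=500)  # hypothetical rack
coverage = within_claims(load)

print(f"rack load: {load:.1f} kW")
print(f"covered by rear-door claim:      {coverage['rear_door']}")
print(f"covered by direct-to-chip claim: {coverage['direct_to_chip']}")
```

With 100 devices at 500 W each, the rack lands at 50 kW, already past the rear-door figure but still inside the direct-to-chip claim.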

This is bleeding edge, and it does bleed. Check out Hetzner datacenter tour from Der8auer and see how to do it the cheap way xD

Wrangling liquids at datacenter-scale sounds like a pain in the butt! IMO direct connections to the machine only make sense if you have fewer, beefier servers.

I think the water cooling in the racks is my favorite solution because it’s totally server agnostic and has the least chance of drowning a server. The place I worked had many 1U “pizza box” servers, some co-located servers (customer owned), some blade servers, etc. A real mashup.

@Maave said: “Check out Hetzner datacenter tour from Der8auer and see how to do it the cheap way xD”

* Virtual tour of Hetzner Online Data Center Park:

Back in time I worked on a water-cooled IBM3090 – MVS/XA – GDDM. Great memories !

Oh yes, I recall a 3083 that early one morning refused to power up. You could hear the pumps in one of the service boxes start and then it would trip out with no indication of why. Tried a few more times, then put in a service call to IBM and went home. Later that morning I came in to see a few dozen empty distilled water bottles lined up in the hallway outside the data center. Turns out the plumbing had sprung a leak.

The extent these companies are going to, and the money spent trying to solve these issues, gives insight into how wasteful these buildings are. Wondering where the environmentalists are on this.

IBM model 7302 core memory array was immersed in a tank of temperature controlled oil. This 128K memory was first shipped in 1959 and was prized for its 2.18 microsecond access time.

As mentioned previously, chiller doors are a thing, and they’re a very good solution where you have sufficient power infrastructure but insufficient HVAC.

However, modern high density servers can have multiple kW of fans in a rack due to having to use high back pressure double-stacked screamer fans, so that’s certainly not efficient and there are some savings to be had there. The other thing when you get to waterblock-type liquid cooling is you don’t have to force air through an enormous heatsink, so your fans can be less powerful.

Immersion cooling is a neat science project but maintenance is a PITA, there are things you can’t submerge (like spinning hard drives and fiber optics, depending on the fluid) and the working fluids can be silly expensive if you want to get away from some variety of mineral oil. In addition, you’ll never get the same compute density per unit floor space as a 9’ tall rack with more conventional liquid cooling.

Wonder why they dump the energy straight into the environment. When used to heat (public) pools, this ‘low quality energy’ is very useful. It most likely saves the entire energy consumption of such pools. Or move servers to homes, as this page describes: Living in a well-insulated home myself, we can heat the entire office in the loft with a single NAS. In summertime it has to be moved to the basement though, to be able to dump its heat, or the office becomes uncomfortably warm.
