Micro ‘Chip Fans’ Can Cool Large Data Centers: xMEMS Chips Integrated into Optical Transceivers, a Key Element in the AI Boom

In data centers, pluggable optical transceivers convert electronic bits into photons, send them across the room, and then convert them back into electronic signals, which makes them a critical technology for moving the massive amounts of data used in artificial intelligence. However, the technology consumes a significant amount of power. According to Nvidia, in a data center with 400,000 GPUs, the optical transceivers can consume up to 40 megawatts. Currently, the only way to manage all this heat is thermal coupling: the transceivers are thermally coupled to the casing of the switch system, and the casing is cooled. Thomas Tarter, chief thermal engineer at startup xMEMS Labs, says this is not a good solution, but because each transceiver is about the size of an oversized USB flash drive, it is not feasible to install a traditional cooling fan in each one.

Now, xMEMS says it has adapted its upcoming ultrasonic micro-electromechanical systems (MEMS) “chip fan” to fit inside pluggable optical transceivers, where it drives airflow to cool the transceiver’s main digital component, the digital signal processor (DSP). Tarter emphasizes that keeping the DSP cool is crucial for its longevity. With each transceiver costing up to $2,000, extending its lifespan by a year or two is highly valuable. Cooling also improves the integrity of the transceiver’s signals. Unreliable links are believed to be one reason for the prolonged training times of new large language models.

xMEMS Cooling Technology Finds a New Home

xMEMS’s cooling technology, unveiled in August 2024, is based on the company’s earlier product, a solid-state micro speaker for headphones.
It uses piezoelectric materials that deform at ultrasonic frequencies, pumping 39 cubic centimeters of air per second through a chip that is only about 1 millimeter high and less than 1 centimeter on each side.

Smartphones, which are too thin to accommodate a fan, were the first obvious application for MEMS coolers. Cooling the rapidly growing AI systems in data centers seemed beyond the capabilities of MEMS technology, which cannot compete with liquid cooling systems that remove thousands of watts of heat from GPU servers.

Mike Housholder, vice president of marketing at xMEMS, says, “The attitude of data center customers has surprised us. We focus on low power consumption, so we didn’t think we could be overly confident.”

Pluggable optical transceivers have turned out to be a data center technology that is a good fit for chip fans. Currently, the heat generated by the DSP, photonic ICs, and lasers in a transceiver is transferred through thermal coupling to the network switch it plugs into (typically located at the top of a computer rack). Air then flows over heat sinks built into the surface of the switch, carrying the heat away.

xMEMS is collaborating with an unnamed partner to explore how to move air through the transceivers themselves. These components consume 18 watts or more. By placing its MEMS chip in an airflow channel that is thermally coupled to the transceiver chip but physically isolated from it, the company expects to reduce the temperature of the DSP by more than 15%.

xMEMS has been producing prototype MEMS chips at Stanford University’s nanofabrication facility, but Housholder says the company will receive its first mass-produced silicon wafers from TSMC in June. The company expects to begin full production in the first quarter of 2026. “This aligns very well with our early customers,” he said.

Dell’Oro Group reports that shipments of transceivers are rapidly increasing.
The market analyst predicts that shipments of 800 Gb/s and 1.6 Tb/s devices will grow by more than 35% per year through 2028. Other innovations in optical communications are also on the horizon and may affect heat and power consumption. In March of this year, Broadcom launched a new DSP that, thanks to a more advanced chip manufacturing process, can reduce the power consumption of 1.6 Tb/s transceivers by more than 20%. Both Broadcom and Nvidia are developing network switches that eliminate pluggable transceivers entirely. These new “co-packaged optics” systems perform the optoelectronic conversion on silicon chips inside the switch chip package.

However, Tarter, who has worked on chip cooling since the 1980s, predicts that MEMS chips will find more applications both inside and outside data centers. “We are learning a lot about applications,” he said. “I have designed two or three dozen basic applications for it, hoping to inspire designers to say, ‘Oh, I can use this in my system.’”

spectrum.ieee.org
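
As a rough sanity check on the figures quoted above, the following Python sketch works through the arithmetic: the optical-transceiver power budget per GPU implied by Nvidia's numbers, the number of transceivers per GPU that budget would correspond to, and how a 35% annual growth rate compounds. It is only an illustration under stated assumptions. The function names are hypothetical, and the assumptions that every transceiver draws exactly 18 W (the article says "18 watts or more"), that growth stays constant at 35%, and that the horizon is three years are illustrative choices, not figures from xMEMS, Nvidia, or Dell'Oro Group.

```python
# Back-of-the-envelope arithmetic on figures quoted in the article.
# Constants marked "assumed" are illustrative simplifications.

TOTAL_TRANSCEIVER_POWER_W = 40e6  # "up to 40 megawatts" (Nvidia, per the article)
GPU_COUNT = 400_000               # GPUs in the data center Nvidia describes
TRANSCEIVER_POWER_W = 18.0        # assumed: article says "18 watts or more"
ANNUAL_GROWTH = 0.35              # assumed constant: article says "over 35% per year"


def optics_power_per_gpu(total_w: float, gpus: int) -> float:
    """Transceiver power attributable to each GPU, in watts."""
    return total_w / gpus


def implied_transceivers_per_gpu(per_gpu_w: float, transceiver_w: float) -> float:
    """How many transceivers per GPU the power budget would correspond to."""
    return per_gpu_w / transceiver_w


def compound_growth(rate: float, years: int) -> float:
    """Multiplier on annual shipments after `years` of constant growth."""
    return (1.0 + rate) ** years


per_gpu_w = optics_power_per_gpu(TOTAL_TRANSCEIVER_POWER_W, GPU_COUNT)
per_gpu_count = implied_transceivers_per_gpu(per_gpu_w, TRANSCEIVER_POWER_W)
three_year_multiplier = compound_growth(ANNUAL_GROWTH, 3)  # 3 years is an assumed horizon

print(f"Transceiver power per GPU: {per_gpu_w:.0f} W")
print(f"Implied transceivers per GPU at 18 W each: {per_gpu_count:.1f}")
print(f"Shipment multiplier after 3 years at 35%/yr: {three_year_multiplier:.2f}x")
```

Because the article gives 18 W as a lower bound, the implied transceiver count per GPU is an upper estimate: higher-power transceivers would mean fewer of them account for the same 40 megawatts.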
