Nvidia's Blackwell rack design reportedly faces overheating, potentially delaying shipments

Chia-Han Lee, Taipei; Levi Li, DIGITIMES Asia

Credit: Bloomberg

Nvidia's upcoming Blackwell GPUs, already behind schedule, are reportedly facing severe overheating issues in high-capacity server racks, raising concerns over deployment timelines.

Nvidia has asked suppliers to revise rack designs as concerns mount among data center operators over delayed Blackwell shipments amid AI accelerator shortages.

Overheating issues for Nvidia's high-power NVL72 racks

Bloomberg, Reuters, and Tom's Hardware, citing The Information, report that Nvidia's NVL72 racks, which house up to 72 Blackwell chips for AI and HPC, face severe overheating issues. Each rack draws about 120 kW, and the resulting heat risks performance degradation and hardware damage.

The NVL72 rack, featuring 36 Grace CPUs and 72 Blackwell GPUs, weighs 1.5 tons and consumes up to 132 kW, a record for single-server power usage, BusinessKorea reports.

The overheating problem has reportedly raised customer concerns over disruptions to next-generation AI data center deployments.

Nvidia employees, suppliers, and customers have confirmed the issue. Nvidia has requested design changes from suppliers but has not named the suppliers involved.

Nvidia downplayed the issue. "Nvidia is working with leading cloud service providers as an integral part of our engineering team and process," an Nvidia spokesperson told Reuters. "The engineering iterations are normal and expected."

Blackwell delays: Challenges for Nvidia's 2025 prospects

Announced in March 2024 as the successor to the H100 GPUs, Blackwell promises up to 30x performance gains and up to 25x lower energy consumption on select workloads.

Originally set to ship in late 2024, Blackwell was delayed to early 2025 due to design flaws.

These setbacks could disrupt AI data center plans for key clients like Meta, Google, and Microsoft while challenging Nvidia's AI hardware dominance.

Analysts stress that Blackwell's success is vital to Nvidia's market leadership, with technical fixes and timely delivery being top priorities.

The uncertainty has led some Wall Street firms to take a more cautious view of Nvidia's 2025 outlook.

Wccftech reports that Morgan Stanley initially projected Nvidia would ship up to 450,000 Blackwell GPUs in the fourth quarter of 2024, rising to 700,000–800,000 units in the first quarter of 2025.

Raymond James, however, has revised its fourth-quarter estimate to just 100,000 units but remains optimistic about 2025, anticipating strong full-year demand for Blackwell GPUs.

The investment bank Jefferies has cautioned that despite Nvidia's strong recent performance, market expectations for its future revenue and profit might be overly ambitious.

The bank estimates Nvidia will ship around 6 million GPUs in 2025, below market forecasts. It also projects data center revenue of US$180 billion to US$200 billion, lower than the market average of US$205 billion to US$215 billion.

Financial risks amid Blackwell's uncertainty

It is still unclear whether the overheating issues will delay Blackwell's early 2025 launch, but the stakes for Nvidia are high. Each GB200 Grace Blackwell superchip costs up to US$70,000, while a complete server rack exceeds US$3 million.

Nvidia aims to sell 60,000–70,000 servers, making further delays costly for a company now among the world's most valuable due to its dominance in AI.

SiliconANGLE cites Holger Mueller of Constellation Research, who stresses that cooling is vital for AI platforms as high-powered chips risk damage at elevated temperatures. Nvidia has acknowledged the issue but has not clarified its severity.

"The questions are how expensive will this be to fix, and how long will it take?" Mueller said. "Blackwell is by far the most compelling platform for generative AI, so customers have no choice but to sit tight and wait for a fix. The duration of that wait will determine if there's any impact on Nvidia's sky-high stock price."