Nvidia H200's greatest rival: AMD MI325X GPU

Monica Chen, San Francisco; Jessie Shen, DIGITIMES Asia

AMD is widely regarded as the only firm capable of challenging Nvidia's dominant position in the AI GPU market. AMD recently held its Advancing AI conference in San Francisco, where CEO Lisa Su, along with major clients and partners such as Microsoft, OpenAI, Meta, Oracle, and Google Cloud, unveiled the latest AI and high-performance computing solutions, including the fifth-generation EPYC server processors and the Instinct MI325X series.

Su's keynote address at the event showcased a diverse array of new products intended to enhance computational power and energy efficiency, thereby providing enterprises and consumers with innovative AI capabilities. These include the EPYC 9005 series processors, the Instinct MI325X accelerator, and the Pensando Salina DPU and Pensando Pollara 400 products.

Chief among the announcements are the forthcoming MI325X, MI350, and MI400 accelerator series, scheduled for the fourth quarter of 2024, mid-2025, and 2026, respectively. The MI350 will employ TSMC's 3nm process technology, while the MI325X will compete directly with Nvidia's widely deployed H200 on performance and features.

"The data center and AI represent significant growth opportunities for AMD, and we are building strong momentum for our EPYC and AMD Instinct processors across a growing set of customers," Su remarked. "Looking ahead, we see the data center AI accelerator market growing to $500 billion by 2028. We are committed to delivering open innovation at scale through our expanded silicon, software, network, and cluster-level solutions."

MI325X versus Nvidia H200 with HBM3E

It's worth mentioning that AMD's data center segment generated a record US$2.8 billion in revenue in the second quarter of 2024, a 115% rise year on year. This growth was primarily due to a substantial increase in Instinct GPU shipments, which accounted for approximately half of the segment's revenue. AMD recently raised its 2024 data center GPU revenue forecast from US$4 billion to US$4.5 billion.

Since launching in December 2023, AMD Instinct MI300X accelerators have been deployed at scale by leading cloud, OEM, and ODM partners and are serving millions of users daily on popular AI models, including OpenAI's ChatGPT, Meta Llama, and over one million open-source models on the Hugging Face platform.

The MI325X, built on TSMC's 5nm process, delivers strong AI performance and memory efficiency for demanding AI workloads, AMD claimed. The accelerator substantially improves memory capacity and bandwidth, featuring 256GB of HBM3E memory with 6.0TB/s of bandwidth, giving it 1.8 times the capacity and 1.3 times the bandwidth of Nvidia's H200, as well as 1.3 times the H200's theoretical peak FP16 and FP8 compute. According to AMD, this translates into up to 1.3 times the inference performance on Mistral 7B at FP16, 1.2 times on Llama 3.1 70B at FP8, and 1.4 times on Mixtral 8x7B at FP16.
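The capacity and bandwidth ratios above can be sanity-checked with simple arithmetic. A minimal sketch follows; the H200 figures (141GB of HBM3E, 4.8TB/s) are Nvidia's published specifications, assumed here for comparison, not part of AMD's announcement:

```python
# Back-of-envelope check of AMD's claimed memory ratios vs. Nvidia's H200.
# H200 figures (141 GB, 4.8 TB/s) are Nvidia's published specs, assumed here.
mi325x_mem_gb, mi325x_bw_tbs = 256, 6.0
h200_mem_gb, h200_bw_tbs = 141, 4.8

capacity_ratio = mi325x_mem_gb / h200_mem_gb    # ~1.8x, matching AMD's claim
bandwidth_ratio = mi325x_bw_tbs / h200_bw_tbs   # 1.25x, marketed as ~1.3x

print(f"capacity:  {capacity_ratio:.2f}x")
print(f"bandwidth: {bandwidth_ratio:.2f}x")
```

The bandwidth ratio works out to exactly 1.25x, which AMD rounds to 1.3x in its marketing material.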

Mass production and shipment of the MI325X are scheduled to begin in the fourth quarter of 2024, with widespread availability from platform suppliers such as Dell, Eviden, Gigabyte, HPE, Lenovo, and Supermicro expected from the first quarter of 2025.

The MI350 series, based on AMD's CDNA 4 architecture and built on TSMC's 3nm process, is expected to debut in mid-2025. AMD projects a 35-fold improvement in inference performance over the preceding CDNA 3-based MI300 series, while extending its lead in memory capacity: each accelerator will carry up to 288GB of HBM3E memory.

Enhancing AI software and open industry ecosystems

AMD is advancing its ROCm open software stack by adding new features to support generative AI training and inference. ROCm 6.2 now supports key AI capabilities including the FP8 data type, Flash Attention 3, and kernel fusion, and delivers up to a 2.4-fold increase in inference performance and a 1.8-fold gain in LLM training performance compared to ROCm 6.0.
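FP8 support matters for inference largely because weight memory scales with bytes per parameter: halving the precision roughly halves the footprint, letting larger models fit on a single accelerator. A rough sizing sketch, counting weights only and ignoring the KV cache and activations (an illustrative simplification, not a real capacity planner):

```python
# Illustrative weight-memory sizing: bytes-per-parameter by data type.
# Weights only; KV cache and activations are deliberately ignored.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}

def weight_gb(params_billion: float, dtype: str) -> float:
    """Approximate weight memory in GB for a model of the given size."""
    return params_billion * 1e9 * BYTES_PER_PARAM[dtype] / 1e9

for dtype in ("fp16", "fp8"):
    gb = weight_gb(70, dtype)   # e.g. a 70B-parameter model
    fits = gb <= 256            # MI325X HBM3E capacity
    print(f"70B @ {dtype}: {gb:.0f} GB, fits in 256 GB: {fits}")
```

By this crude measure a 70B-parameter model needs about 140GB at FP16 and about 70GB at FP8, which is one reason lower-precision data types are a headline feature of both the hardware and the software stack.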

AI network solutions to launch in 1H25

AMD is working with hyperscalers to deploy the widest variety of programmable DPUs to power the next generation of AI networks. The AI network comprises two parts: the front end, which feeds data and information to AI clusters, and the back end, which manages data transfer between accelerators and clusters. Both are essential to using CPUs and accelerators efficiently in AI infrastructure.

To efficiently manage these two networks while driving overall system speed, scalability, and efficiency, AMD released the Pensando Salina DPU for the front end and the industry's first UEC-ready Pensando Pollara 400 AI NIC for the back end.

The Pensando Salina DPU is the third generation of what AMD calls the world's most powerful programmable DPU. It delivers up to twice the performance, bandwidth, and scale of the previous generation and supports 400G throughput for fast data transfer. It is a critical component of AI front-end network clusters, enhancing the performance, efficiency, security, and scalability of data-driven AI applications.

The Pensando Pollara 400, powered by the AMD P4 programmable engine, is the industry's first UEC-ready AI NIC, with support for next-generation RDMA software and an open networking ecosystem. It is critical for delivering top-tier performance, scalability, and communication efficiency among accelerators in the back-end network.

The Pensando Salina DPU and Pensando Pollara 400 are scheduled for sampling in the fourth quarter of 2024, with a planned launch in the first half of 2025.

4nm Ryzen AI PRO 300 enhances battery life; 100+ AI PCs in 2025

The Ryzen AI PRO 300 series notebook processors, employing the Zen 5 and XDNA 2 architectures and built on TSMC's 4nm process, deliver substantial improvements in performance, battery life, and security. They already power the first enterprise-focused Microsoft Copilot+ notebooks.

Additionally, AMD held its first developer conference, where executives from global companies including Microsoft, Cohere, Meta, and Google DeepMind demonstrated how AMD software strengthens the open industry ecosystem.