According to a February 2 report from VideoCardz, AMD is preparing to launch its next-generation Instinct MI400 AI accelerator with a substantially redesigned chiplet architecture. The design features two Active Interposer Dies (AIDs), each containing four Accelerator Complex Dies (XCDs), for a total of eight XCDs. The MI400 series also introduces a dedicated Multimedia I/O Die (MID) to enhance data throughput and processing efficiency.
AMD's official roadmap indicates plans to launch the Instinct MI355X AI accelerator in the second half of 2025, according to ICsmart and Tom's Hardware. Built on an advanced 3nm process node, the MI355X will leverage the latest CDNA 4 architecture, featuring next-gen HBM3E memory with a maximum capacity of 288GB and support for FP4/FP6 data formats to enhance AI compute performance.
Performance leaps with CDNA 4
AMD reports that CDNA 4 achieves up to a 35x improvement in AI inference performance over CDNA 3, with overall AI compute throughput increasing sevenfold. The new architecture delivers 50% more memory capacity and bandwidth, with memory bandwidth rising to 8 TB/s, up from 5.3 TB/s on the MI300X. Enhanced networking efficiency further optimizes AI workload execution.
The Instinct MI355X AI GPU demonstrates significant performance gains, delivering up to 2.3 PFLOPS in FP16 mode, an increase of roughly 80% over the MI325X. FP8 performance mirrors this improvement, also rising by about 80% to reach 4.6 PFLOPS. The accelerator achieves 9.2 PFLOPS in both FP6 and FP4 compute, marking a substantial advance in AI processing power.
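As a quick sanity check on the figures above, the FP16 and FP8 uplifts can be computed directly from the quoted PFLOPS numbers (the exact ratio works out to about 77%, which AMD's materials round to 80%):

```python
# Sanity-check the claimed MI355X-over-MI325X generational gains.
# Values in PFLOPS, taken from the figures cited in the article.
specs = {
    "FP16": (1.3, 2.3),  # (MI325X, MI355X)
    "FP8":  (2.6, 4.6),
}

for fmt, (old, new) in specs.items():
    uplift_pct = (new / old - 1) * 100  # percentage improvement
    print(f"{fmt}: {uplift_pct:.0f}% uplift")  # ~77% in both cases
```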
| Accelerator Name | AMD Instinct MI400 | AMD Instinct MI350X | AMD Instinct MI325X | AMD Instinct MI300X | AMD Instinct MI250X |
| --- | --- | --- | --- | --- | --- |
| GPU Architecture | CDNA Next / UDNA | CDNA 4 | Aqua Vanjaram (CDNA 3) | Aqua Vanjaram (CDNA 3) | Aldebaran (CDNA 2) |
| GPU Process Node | TBD | 3nm | 5nm+6nm | 5nm+6nm | 6nm |
| XCDs (Chiplets) | 8 (MCM) | 8 (MCM) | 8 (MCM) | 8 (MCM) | 2 (MCM) |
| GPU Cores | TBD | TBD | 19,456 | 19,456 | 14,080 |
| GPU Clock Speed | TBD | TBD | 2100 MHz | 2100 MHz | 1700 MHz |
| INT8 Compute | TBD | TBD | 2614 TOPS | 2614 TOPS | 383 TOPS |
| FP6/FP4 Compute | TBD | 9.2 PFLOPS | n/a | n/a | n/a |
| FP8 Compute | TBD | 4.6 PFLOPS | 2.6 PFLOPS | 2.6 PFLOPS | n/a |
| FP16 Compute | TBD | 2.3 PFLOPS | 1.3 PFLOPS | 1.3 PFLOPS | 383 TFLOPS |
| FP32 Compute | TBD | TBD | 163.4 TFLOPS | 163.4 TFLOPS | 95.7 TFLOPS |
| FP64 Compute | TBD | TBD | 81.7 TFLOPS | 81.7 TFLOPS | 47.9 TFLOPS |
| VRAM | TBD | 288 GB HBM3E | 256 GB HBM3E | 192 GB HBM3 | 128 GB HBM2E |
| Infinity Cache | TBD | TBD | 256 MB | 256 MB | n/a |
| Memory Clock | TBD | 8.0 Gbps? | 5.9 Gbps | 5.2 Gbps | 3.2 Gbps |
| Memory Bus | TBD | 8192-bit | 8192-bit | 8192-bit | 8192-bit |
| Memory Bandwidth | TBD | 8 TB/s | 6.0 TB/s | 5.3 TB/s | 3.2 TB/s |
| Form Factor | TBD | OAM | OAM | OAM | OAM |
| Cooling | TBD | Passive cooling | Passive cooling | Passive cooling | Passive cooling |
| TDP (Max) | TBD | TBD | 1000W | 750W | 560W |
Source: Wccftech, compiled by DIGITIMES, February 2025
Next-generation architecture drives AI advancement
AMD plans to release its Instinct MI400 AI accelerator series in 2026. The company states that the MI400 lineup will utilize the next-gen AMD CDNA-Next architecture, specifically designed to enhance AI training and inference workloads, though detailed technical specifications remain undisclosed.
Recent AMD patch files, according to Wccftech and Coelacanth-dream, reveal that the MI400 series will incorporate two AIDs, each hosting four XCDs. This represents a significant architectural advancement over the MI300 series, where each AID contains only two XCDs.
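The topologies described in the patch files can be summarized in a small sketch; both designs total eight XCDs, arranged differently across the AIDs (the dictionary structure here is purely illustrative):

```python
# Illustrative summary of the chiplet topologies described above.
topologies = {
    "MI300": {"aids": 4, "xcds_per_aid": 2},  # four AIDs, two XCDs each
    "MI400": {"aids": 2, "xcds_per_aid": 4},  # per the leaked patch files
}

for gpu, t in topologies.items():
    total = t["aids"] * t["xcds_per_aid"]  # both work out to 8 XCDs
    print(f"{gpu}: {t['aids']} AIDs x {t['xcds_per_aid']} XCDs = {total} XCDs")
```

Fewer, larger AIDs with more XCDs per interposer reduce the number of die-to-die hops a signal must cross, which is consistent with the throughput focus described above.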
The introduction of the MID marks a crucial design change, separating the multimedia engine from the AID. This approach could also incorporate additional I/O-related processing capabilities to improve data-flow efficiency.
The MI400 series may support up to two MIDs, with each AID potentially featuring a dedicated MID tile to optimize data transfer between compute units and the I/O interface. While the MI350 already employs AMD's Infinity Fabric architecture for die-to-die communication, the MI400 advances this technology further.
According to UNIKO's Hardware, the MI400 series targets large-scale AI training and inference workloads and will feature the CDNA-Next architecture. AMD's strategic roadmap suggests CDNA-Next might evolve into UDNA, unifying RDNA and CDNA technologies into a single architecture.