AMD Instinct MI400 spotted in patches: up to 8 XCDs on dual interposer dies with new Multimedia I/O Die

Levi Li, DIGITIMES Asia, Taipei

Credit: AFP

According to a February 2 report from VideoCardz, AMD is preparing to launch its next-generation Instinct MI400 AI accelerator with a revolutionary architecture. The design features two Active Interposer Dies (AIDs), each containing four Accelerated Compute Dies (XCDs), for a total of eight XCDs. The MI400 series also introduces a dedicated Multimedia I/O Die (MID) to enhance data throughput and processing efficiency.

AMD's official roadmap indicates plans to launch the Instinct MI355X AI accelerator in the second half of 2025, according to ICsmart and Tom's Hardware. Built on an advanced 3nm process node, the MI355X will leverage the latest CDNA 4 architecture, featuring next-gen HBM3E memory with a maximum capacity of 288GB and support for FP4/FP6 data formats to enhance AI compute performance.

Performance leaps with CDNA 4

AMD reports that CDNA 4 achieves a 35x AI inference performance improvement over CDNA 3, with AI compute throughput increasing sevenfold. The new architecture delivers 50% more memory capacity and bandwidth, with peak memory bandwidth reaching 8 TB/s versus 5.3 TB/s on the MI300X. Enhanced networking efficiency further optimizes AI workload execution.

The Instinct MI355X AI GPU demonstrates significant performance gains, delivering up to 2.3 PFLOPS in FP16 mode, an 80% increase over the MI325X. FP8 performance mirrors this improvement, also rising by 80% to reach 4.6 PFLOPS. The accelerator achieves 9.2 PFLOPS in both FP6 and FP4 compute capabilities, marking a substantial advancement in AI processing power.
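As a quick sanity check, the claimed uplifts can be verified against the throughput figures quoted above (a sketch using this article's rounded numbers, not official AMD data; the rounded values give roughly 77%, consistent with AMD's quoted ~80%):

```python
# Peak throughput in PFLOPS, taken from this article's figures (rounded).
mi325x = {"FP16": 1.3, "FP8": 2.6}
mi355x = {"FP16": 2.3, "FP8": 4.6, "FP6/FP4": 9.2}

for fmt in ("FP16", "FP8"):
    uplift = (mi355x[fmt] / mi325x[fmt] - 1) * 100
    # Rounded inputs yield ~77%; AMD's ~80% claim comes from unrounded figures.
    print(f"{fmt}: {uplift:.0f}% increase over MI325X")

# FP6/FP4 runs at exactly twice the FP8 rate, as is common when each
# FP8 lane can issue two lower-precision operations per cycle.
print(mi355x["FP6/FP4"] / mi355x["FP8"])  # 2.0
```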

| Accelerator Name | AMD Instinct MI400 | AMD Instinct MI350X | AMD Instinct MI325X | AMD Instinct MI300X | AMD Instinct MI250X |
|---|---|---|---|---|---|
| GPU Architecture | CDNA Next / UDNA | CDNA 4 | Aqua Vanjaram (CDNA 3) | Aqua Vanjaram (CDNA 3) | Aldebaran (CDNA 2) |
| GPU Process Node | TBD | 3nm | 5nm+6nm | 5nm+6nm | 6nm |
| XCDs (Chiplets) | 8 (MCM) | 8 (MCM) | 8 (MCM) | 8 (MCM) | 2 (MCM), 1 per die |
| GPU Cores | TBD | TBD | 19,456 | 19,456 | 14,080 |
| GPU Clock Speed | TBD | TBD | 2,100 MHz | 2,100 MHz | 1,700 MHz |
| INT8 Compute | TBD | TBD | 2,614 TOPS | 2,614 TOPS | 383 TOPS |
| FP6/FP4 Compute | TBD | 9.2 PFLOPS | n/a | n/a | n/a |
| FP8 Compute | TBD | 4.6 PFLOPS | 2.6 PFLOPS | 2.6 PFLOPS | n/a |
| FP16 Compute | TBD | 2.3 PFLOPS | 1.3 PFLOPS | 1.3 PFLOPS | 383 TFLOPS |
| FP32 Compute | TBD | TBD | 163.4 TFLOPS | 163.4 TFLOPS | 95.7 TFLOPS |
| FP64 Compute | TBD | TBD | 81.7 TFLOPS | 81.7 TFLOPS | 47.9 TFLOPS |
| VRAM | TBD | 288 GB HBM3E | 256 GB HBM3E | 192 GB HBM3 | 128 GB HBM2E |
| Infinity Cache | TBD | TBD | 256 MB | 256 MB | n/a |
| Memory Clock | TBD | 8.0 Gbps? | 5.9 Gbps | 5.2 Gbps | 3.2 Gbps |
| Memory Bus | TBD | 8192-bit | 8192-bit | 8192-bit | 8192-bit |
| Memory Bandwidth | TBD | 8 TB/s | 6.0 TB/s | 5.3 TB/s | 3.2 TB/s |
| Form Factor | TBD | OAM | OAM | OAM | OAM |
| Cooling | TBD | Passive | Passive | Passive | Passive |
| TDP (Max) | TBD | TBD | 1,000W | 750W | 560W |

Source: Wccftech, compiled by DIGITIMES, February 2025
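The bandwidth column follows directly from the bus width and per-pin data rate (bandwidth = bus width x data rate / 8 bits per byte). A quick cross-check of the table, using only this article's numbers:

```python
def hbm_bandwidth_tbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in TB/s: bus width (bits) x per-pin rate (Gbps),
    divided by 8 bits per byte and 1000 GB per TB."""
    return bus_width_bits * data_rate_gbps / 8 / 1000

# All listed parts use an 8192-bit HBM bus.
print(hbm_bandwidth_tbs(8192, 5.9))  # MI325X: ~6.0 TB/s
print(hbm_bandwidth_tbs(8192, 5.2))  # MI300X: ~5.3 TB/s
print(hbm_bandwidth_tbs(8192, 8.0))  # MI350X at the rumored 8.0 Gbps: ~8.2 TB/s
```

The ~8.2 TB/s result suggests the table's "8 TB/s" MI350X figure is consistent with the rumored 8.0 Gbps memory clock.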

Next-generation architecture drives AI advancement

AMD plans to release its Instinct MI400 AI accelerator series in 2026. The company states that the MI400 lineup will utilize the next-gen AMD CDNA-Next architecture, specifically designed to enhance AI training and inference workloads, though detailed technical specifications remain undisclosed.

Recent AMD patch files, according to Wccftech and Coelacanth-dream, reveal that the MI400 series will incorporate two AIDs, each hosting four XCDs. This represents a significant architectural shift from the MI300 series, whose AIDs each contain only two XCDs.
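The shift can be illustrated with a small sketch: both generations expose eight XCDs in total, but the MI400 consolidates them onto fewer, larger interposer dies (the MI300 layout below is inferred from this article's figures and is an assumption, not a confirmed floorplan):

```python
# Hypothetical chiplet layouts inferred from the patch report.
mi300 = {"AIDs": 4, "XCDs_per_AID": 2}  # MI300: four AIDs, two XCDs each (inferred)
mi400 = {"AIDs": 2, "XCDs_per_AID": 4}  # MI400: two AIDs, four XCDs each (per patches)

for name, gpu in (("MI300", mi300), ("MI400", mi400)):
    total = gpu["AIDs"] * gpu["XCDs_per_AID"]
    print(f"{name}: {gpu['AIDs']} AIDs x {gpu['XCDs_per_AID']} XCDs = {total} XCDs")
```

Fewer, denser AIDs mean more XCD-to-XCD traffic stays on the same interposer rather than crossing between dies.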

The introduction of the MID marks a notable design change, separating the multimedia engine from the AID. The dedicated die may also host additional I/O-related processing to improve data flow efficiency.

The MI400 series may support up to two MIDs, with each AID potentially paired with a dedicated MID tile to optimize data transfer between compute units and the I/O interface. While the MI350 already employs AMD's Infinity Fabric interconnect for die-to-die communication, the MI400 is expected to advance this technology further.

According to UNIKO's Hardware, the MI400 series targets large-scale AI training and inference workloads and will feature the CDNA-Next architecture. AMD's strategic roadmap suggests CDNA-Next might evolve into UDNA, unifying RDNA and CDNA technologies into a single architecture.