As AI models grow in size, demand for AI computing power has surged, making multi-GPU collaborative computing the mainstream approach. However, the traditional PCIe interconnect faces significant limitations in transmission speed and scalability, and cannot keep pace with the communication needs of multi-GPU systems.
In 2014, Nvidia introduced NVLink, an interconnect technology specifically designed for high-speed communication between GPUs. Compared to PCIe, NVLink offers higher bandwidth, lower latency, reduced power consumption, memory pooling capabilities, and superior system scalability. This significantly enhances the computational efficiency of multi-GPU systems, establishing NVLink as an indispensable moat for Nvidia in the AI hardware sector.
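To make the bandwidth gap concrete, here is a rough back-of-the-envelope comparison using the commonly cited nominal peak figures for PCIe 3.0 x16 and first-generation NVLink as implemented on the Tesla P100; real-world throughput is lower than these peaks:

```python
# Nominal peak bandwidths in GB/s, per direction -- commonly cited
# spec-sheet figures, not measured values.
pcie3_x16 = 16.0          # PCIe 3.0 x16: ~1 GB/s per lane x 16 lanes
nvlink1_per_link = 20.0   # NVLink 1.0: 20 GB/s per link, per direction
p100_links = 4            # Tesla P100 exposes 4 NVLink links

# Aggregate NVLink bandwidth when all links are used together.
nvlink_total = nvlink1_per_link * p100_links  # 80 GB/s per direction

print(f"PCIe 3.0 x16 : {pcie3_x16:.0f} GB/s per direction")
print(f"NVLink (P100): {nvlink_total:.0f} GB/s per direction "
      f"({nvlink_total / pcie3_x16:.0f}x PCIe)")
```

Even on this first generation, the aggregate GPU-to-GPU bandwidth is roughly 5x that of PCIe 3.0 x16, and the gap has widened with each subsequent NVLink generation.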
NVLink has continued to evolve with each generation of GPU accelerators. Building on NVLink, Nvidia launched the DGX, the world's first purpose-built AI server, and has continued to improve it in subsequent products. The DGX-2 was the first AI server to achieve all-to-all GPU connectivity, using NVSwitch chips to interconnect 16 GPUs and significantly boosting computational efficiency.