As AI models grow in size, demand for AI computing power has surged, making multi-GPU collaborative computing the mainstream approach. However, the traditional PCIe interconnect faces significant limitations in transmission speed and scalability, and cannot keep pace with the communication needs of multi-GPU systems.
In 2014, Nvidia introduced NVLink, an interconnect technology specifically designed for high-speed communication between GPUs. Compared to PCIe, NVLink offers higher bandwidth, lower latency, reduced power consumption, memory pooling capabilities, and superior system scalability. This significantly enhances the computational efficiency of multi-GPU systems, establishing NVLink as an indispensable moat for Nvidia in the AI hardware sector.
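To make the bandwidth gap concrete, here is a rough back-of-the-envelope comparison using the commonly cited nominal peak figures for PCIe 3.0 x16 and first-generation NVLink as implemented on the Tesla P100; real-world throughput is lower than these peaks:

```python
# Nominal peak bandwidths in GB/s, per direction -- commonly cited
# spec-sheet figures, not measured values.
pcie3_x16 = 16.0          # PCIe 3.0 x16: ~1 GB/s per lane x 16 lanes
nvlink1_per_link = 20.0   # NVLink 1.0: 20 GB/s per link, per direction
p100_links = 4            # Tesla P100 exposes 4 NVLink links

# Aggregate NVLink bandwidth when all links are used together.
nvlink_total = nvlink1_per_link * p100_links  # 80 GB/s per direction

print(f"PCIe 3.0 x16 : {pcie3_x16:.0f} GB/s per direction")
print(f"NVLink (P100): {nvlink_total:.0f} GB/s per direction "
      f"({nvlink_total / pcie3_x16:.0f}x PCIe)")
```

Even on this first generation, the aggregate GPU-to-GPU bandwidth is roughly 5x that of PCIe 3.0 x16, and the gap has widened with each subsequent NVLink generation.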
NVLink has continued to evolve with each generation of GPU accelerators. Building on NVLink, Nvidia launched the DGX, the world's first purpose-built AI server, and has continued to improve it in subsequent products. The DGX-2 was the first AI server to achieve all-to-all GPU connectivity, using NVSwitch chips to interconnect 16 GPUs and significantly boosting computational efficiency.