

        The Building Blocks of Advanced Multi-GPU Communication


        How NVLink and NVSwitch Work Together

        NVIDIA NVLink

        [Figure: NVIDIA A100 PCIe with NVLink GPU-to-GPU connection]
        [Figure: NVIDIA A100 with NVLink GPU-to-GPU connections]

        NVIDIA NVSwitch

        [Figure: The NVSwitch topology diagram]

        Maximizing System Throughput

        Third-Generation NVLink

        NVIDIA NVLink technology addresses interconnect issues by providing higher bandwidth, more links, and improved scalability for multi-GPU system configurations. A single NVIDIA A100 Tensor Core GPU supports up to 12 third-generation NVLink connections for a total bandwidth of 600 gigabytes per second (GB/s), almost 10X the bandwidth of PCIe Gen 4.
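The 600 GB/s total and the "almost 10X" comparison can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only and rests on two assumptions that the text does not state: each NVLink link is taken to carry 50 GB/s bidirectional, and the PCIe Gen 4 baseline is taken to be an x16 slot at 16 GT/s per lane with 128b/130b encoding, counted in both directions. The function and constant names are hypothetical.

```python
# Back-of-the-envelope check of the bandwidth figures quoted above.
# Assumption (not stated in the text): each NVLink link carries
# 50 GB/s bidirectional, i.e. 25 GB/s in each direction.
NVLINK_GB_S_PER_LINK = 50


def nvlink_total_gb_s(num_links: int) -> int:
    """Total bidirectional NVLink bandwidth for one GPU, in GB/s."""
    return num_links * NVLINK_GB_S_PER_LINK


def pcie_x16_bidir_gb_s(gt_per_s: float = 16.0, lanes: int = 16) -> float:
    """Approximate bidirectional bandwidth of a PCIe slot, in GB/s.

    The defaults describe an assumed PCIe Gen 4 x16 baseline:
    16 GT/s per lane with 128b/130b line encoding.
    """
    per_direction_gbit_s = gt_per_s * lanes * 128 / 130
    return 2 * per_direction_gbit_s / 8  # bits to bytes, both directions


a100_total = nvlink_total_gb_s(12)  # 12 third-generation links
pcie4 = pcie_x16_bidir_gb_s()

print(f"A100 NVLink total: {a100_total} GB/s")
print(f"PCIe Gen 4 x16:    {pcie4:.1f} GB/s")
print(f"Ratio:             {a100_total / pcie4:.1f}x")
```

Under these assumptions the ratio comes out at roughly 9.5, which is consistent with the "almost 10X" claim. The same helper gives 300 GB/s for a GPU with 6 links.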

        Servers like the NVIDIA DGX™ A100 take advantage of this technology to deliver greater scalability for ultrafast deep learning training. NVLink is also available in A100 PCIe two-GPU configurations.

        NVLink Performance

        NVLink in NVIDIA A100

        NVIDIA NVSwitch

        NVSwitch—The Fully Connected NVLink

        The rapid adoption of deep learning has driven the need for a faster, more scalable interconnect, as PCIe bandwidth often creates a bottleneck at the multi-GPU-system level. For deep learning workloads to scale, dramatically higher bandwidth and reduced latency are needed.

        NVIDIA NVSwitch builds on the advanced communication capability of NVLink to solve this problem. It takes deep learning performance to the next level with a GPU fabric that enables more GPUs in a single server and full-bandwidth connectivity between them. Each GPU connects to the NVSwitch fabric through its 12 NVLinks to enable high-speed, all-to-all communication.


        The Most Powerful End-to-End AI and HPC Data Center Platform

        NVLink and NVSwitch are essential building blocks of the complete NVIDIA data center solution that incorporates hardware, networking, software, libraries, and optimized AI models and applications from NGC™. The most powerful end-to-end AI and HPC platform, it allows researchers to deliver real-world results and deploy solutions into production, driving unprecedented acceleration at every scale.

        Full Connection for Unparalleled Performance

        NVSwitch is the first on-node switch architecture to support eight to 16 fully connected GPUs in a single server node. The second-generation NVSwitch drives simultaneous communication between all GPU pairs at an incredible 600 GB/s. It supports full all-to-all communication with direct GPU peer-to-peer memory addressing. These 16 GPUs can be used as a single high-performance accelerator with unified memory space and up to 10 petaFLOPS of deep learning compute power.
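To make "all GPU pairs" concrete, the following sketch counts the distinct GPU pairs in a fully connected node and estimates an aggregate bandwidth. It assumes, as a reading of the figures rather than something the text states, that the aggregate is simply the per-GPU bandwidth multiplied by the number of GPUs. The function names are hypothetical.

```python
def gpu_pairs(num_gpus: int) -> int:
    """Number of distinct GPU pairs in a fully connected node."""
    return num_gpus * (num_gpus - 1) // 2


def aggregate_tb_s(num_gpus: int, per_gpu_gb_s: int) -> float:
    """Aggregate bandwidth in TB/s.

    Assumption: aggregate = per-GPU bandwidth x number of GPUs,
    with 1 TB/s counted as 1000 GB/s.
    """
    return num_gpus * per_gpu_gb_s / 1000


for n in (8, 16):
    print(f"{n} GPUs: {gpu_pairs(n)} pairs, "
          f"{aggregate_tb_s(n, 600)} TB/s aggregate at 600 GB/s per GPU")
```

With 16 GPUs this gives 120 pairs. At 600 GB/s per GPU the estimate is 9.6 TB/s, and at 300 GB/s it is 4.8 TB/s, which agrees with the aggregate bandwidth figures in the NVSwitch specifications.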


        NVIDIA NVLink

                                                 Second Generation    Third Generation
        Total NVLink Bandwidth                   300 GB/s             600 GB/s
        Maximum Number of Links per GPU          6                    12
        Supported NVIDIA Architectures           NVIDIA Volta™        NVIDIA Ampere Architecture

        NVIDIA NVSwitch

                                                 First Generation     Second Generation
        Number of GPUs with Direct Connection    Up to 16             Up to 16
        NVSwitch GPU-to-GPU Bandwidth            300 GB/s             600 GB/s
        Total Aggregate Bandwidth                4.8 TB/s             9.6 TB/s
        Supported NVIDIA Architectures           NVIDIA Volta         NVIDIA Ampere Architecture

        Get Started

        Experience NVIDIA DGX A100, the universal system for AI infrastructure and the world’s first AI system built on the NVIDIA A100 Tensor Core GPU.