The world’s fastest fully integrated, single-node AI system: the NVIDIA DGX H100.
Hopper - 4th Generation Tensor Cores
First introduced in the NVIDIA Volta architecture, NVIDIA Tensor Core technology has brought dramatic speedups to AI, bringing down training times from weeks to hours and providing massive acceleration to inference.
The NVIDIA Hopper architecture builds on these innovations, carrying forward the Tensor Float 32 (TF32) and Floating Point 64 (FP64) Tensor Core precisions introduced with NVIDIA Ampere and adding FP8 via the Transformer Engine, to accelerate and simplify AI adoption and extend the power of Tensor Cores to HPC.
TF32 works just like FP32 while delivering speedups of up to 20X for AI, without requiring any code changes. Using NVIDIA Automatic Mixed Precision, researchers can gain an additional 2X performance with FP16 by adding just a couple of lines of code. And with support for bfloat16, INT8, and INT4, Tensor Cores in NVIDIA H100 Tensor Core GPUs create an incredibly versatile accelerator for both AI training and inference. Bringing the power of Tensor Cores to HPC, H100 also enables matrix operations in full, IEEE-certified, FP64 precision.
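The key idea behind TF32 - FP32's range with a shorter mantissa - can be illustrated with a minimal pure-Python sketch. The `to_tf32` helper below is hypothetical (real Tensor Cores round to nearest in hardware; this sketch simply truncates), but it shows why TF32 "works just like FP32" for most AI workloads while trading away low-order precision:

```python
import struct

def to_tf32(x: float) -> float:
    """Simulate TF32's reduced mantissa by truncating an FP32 value.

    TF32 keeps FP32's 8 exponent bits but only 10 of its 23 mantissa
    bits, so it spans the same numeric range at reduced precision.
    (Hardware rounds to nearest; this sketch truncates the low 13 bits.)
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= ~((1 << 13) - 1)  # zero the 13 mantissa bits TF32 drops
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# At magnitude 1.0, TF32's step size is 2**-10: a 2**-10 increment
# survives the conversion, while a 2**-11 increment is lost.
print(to_tf32(1.0 + 2**-10) == 1.0 + 2**-10)  # True
print(to_tf32(1.0 + 2**-11) == 1.0)           # True
```

Because the exponent width is unchanged, no rescaling of model values is needed - which is what makes TF32 a drop-in substitute for FP32, unlike FP16.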
The Universal System For AI Infrastructure
Deep neural networks are rapidly growing in size and complexity, in response to the most pressing challenges in business and research.
The computational capacity needed to support today’s AI workloads has outpaced traditional data center architectures. Modern techniques that exploit model parallelism are colliding with the limits of inter-GPU bandwidth as developers build increasingly large accelerated computing clusters and push the limits of data center scale. A new approach is needed - one that delivers almost limitless AI computing scale to break through these barriers and achieve faster insights.
GPU-to-GPU communication is achieved via NVIDIA’s fourth-generation NVLink architecture, using four internal third-generation NVSwitch chips.
This increases direct GPU-to-GPU bandwidth to 900 gigabytes per second (GB/s), roughly 7X higher than PCIe Gen 5, through a new NVIDIA NVSwitch that is 2X faster than the last generation. This unprecedented power delivers the fastest time-to-solution, allowing users to tackle challenges that weren't possible or practical before.
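The NVLink-versus-PCIe ratio can be checked with back-of-the-envelope arithmetic. The figures below assume H100's published 900 GB/s total bidirectional NVLink bandwidth and PCIe Gen 5 at 32 GT/s per lane across an x16 slot - a sketch from public specs, not a benchmark:

```python
# Per-GPU interconnect bandwidth, H100 NVLink vs. a PCIe Gen 5 x16 slot.
NVLINK_GBS = 900  # 4th-gen NVLink, total bidirectional GB/s per GPU (assumed from specs)

# PCIe Gen 5: 32 GT/s per lane, 16 lanes, both directions, ~1 byte per 8 transfers.
PCIE5_X16_GBS = 2 * 32e9 * 16 / 8 / 1e9  # -> 128.0 GB/s bidirectional

ratio = NVLINK_GBS / PCIE5_X16_GBS
print(f"PCIe Gen 5 x16: {PCIE5_X16_GBS:.0f} GB/s, NVLink: {NVLINK_GBS} GB/s, "
      f"ratio ~{ratio:.1f}x")  # ratio ~7.0x
```

This is why all-to-all model-parallel traffic is routed over NVLink and NVSwitch rather than the PCIe bus.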
Data Center Scalability With Mellanox
Networking builds on NVIDIA’s Mellanox acquisition, featuring NVIDIA ConnectX-7 400Gb/s network adapters on a PCIe Gen 5 bus.
With the fastest I/O architecture of any DGX system, NVIDIA DGX H100 is the foundational building block for large AI clusters like NVIDIA DGX SuperPOD, the enterprise blueprint for scalable AI infrastructure. DGX H100 features eight single-port NVIDIA ConnectX-7 VPI adapters for the compute fabric and two dual-port ConnectX-7 VPI adapters for storage and networking, all capable of 400Gb/s InfiniBand or Ethernet.
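The aggregate fabric bandwidth per node falls out of simple unit conversion. The helper below is a sketch, and the 8 × 400 Gb/s figure is an assumption taken from DGX H100's published compute-fabric configuration:

```python
# Convert per-adapter link speed (Gb/s) into aggregate node bandwidth (GB/s).
def fabric_gbytes_per_sec(adapters: int, link_gbps: int) -> float:
    """Total one-directional fabric bandwidth in GB/s (8 bits per byte)."""
    return adapters * link_gbps / 8

# Eight single-port compute-fabric adapters at 400 Gb/s each (assumed figure):
print(fabric_gbytes_per_sec(8, 400))  # -> 400.0 GB/s into the cluster fabric
```

At roughly 400 GB/s of inter-node bandwidth, gradient exchange across nodes stops being the bottleneck it is on conventional servers.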
The combination of massive GPU-accelerated compute with state-of-the-art networking hardware and software optimizations means DGX H100 can scale to hundreds or thousands of nodes to meet the biggest challenges, such as conversational AI and large-scale image classification.
Proven Infrastructure Solutions Built With Trusted Data Center Leaders
As an NVIDIA Elite Partner, we offer a portfolio of infrastructure solutions that incorporates the Hopper architecture and the best of the NVIDIA DGX POD reference architecture.
Delivered fully integrated and ready to deploy, these solutions make data center AI deployments simpler and faster for IT.