CoreWeave's latest testing shows that, running the DeepSeek R1 inference model, four NVIDIA Blackwell-based GB300 GPUs can complete tasks that previously required 16 H100s, a roughly 6x increase in per-GPU throughput.
DeepSeek R1 Inference Test: Four NVIDIA GB300s Can Do the Work of 16 H100s
CoreWeave reportedly used the DeepSeek R1 inference model to benchmark the NVIDIA Blackwell-based GB300 NVL72 platform against the previous-generation H100 GPU. Thanks to NVIDIA's upgraded architecture and larger memory capacity and bandwidth, the results show the GB300 system completing, with only four GPUs, tasks that would have required 16 H100s.
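The two headline figures are consistent with each other, which a quick back-of-the-envelope check makes clear. The numbers below are illustrative only (H100 per-GPU throughput normalized to 1.0), not measurements from the test:

```python
# Sanity-check the relationship between "4 GPUs replace 16" and
# "6x per-GPU throughput". Illustrative arithmetic only.

h100_gpus = 16
gb300_gpus = 4

# Serving the same workload with 4 GPUs instead of 16 is, by itself,
# a 4x reduction in GPU count.
count_factor = h100_gpus / gb300_gpus  # 4.0

# A 6x per-GPU throughput gain on top of that implies the 4-GPU GB300
# setup also delivers ~1.5x the *total* throughput of the 16-GPU H100 setup.
per_gpu_speedup = 6.0
total_throughput_ratio = per_gpu_speedup * gb300_gpus / h100_gpus  # 1.5

print(count_factor, total_throughput_ratio)
```

In other words, the 6x per-GPU claim is stronger than merely matching 16 H100s with 4 GPUs: it implies the smaller GB300 deployment also outpaces the larger H100 cluster in aggregate.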
The GB300 NVL72 platform supports up to 37TB of memory (expandable to a maximum of 40TB) and boasts 130TB/s of bandwidth. To reduce the need to split model data across many GPUs, the platform uses a four-way parallel design and improves communication efficiency through NVLink and NVSwitch high-speed interconnects.
CoreWeave notes that this represents not only an increase in raw FLOPS, but also a significant leap in system-architecture efficiency in real-world business scenarios. For enterprise customers running complex models, the GB300 NVL72 offers greater scalability and lower latency, enabling them to deploy and operate AI services faster and more cost-effectively.