🦾 NVIDIA's Blackwell: a leap from chips to a comprehensive AI platform built for trillion-parameter-scale generative AI.
🚀 Performance Enhancements: the B200 GPU delivers up to 20 petaflops of FP4 compute from 208 billion transistors.
🔥 The GB200 superchip pairs two B200 GPUs with one Grace CPU, delivering up to 30x LLM inference performance and 25x better energy efficiency versus the H100.
✔ Energy and Cost Efficiency: training a 1.8-trillion-parameter model now takes 2,000 Blackwell GPUs drawing 4 megawatts, versus 8,000 Hopper GPUs and 15 megawatts previously.
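A quick back-of-the-envelope check of those training figures (assuming equal wall-clock time on both clusters; illustrative arithmetic on the quoted numbers, not a measured benchmark):

```python
# Quoted figures: the same 1.8T-parameter training run on Hopper vs. Blackwell.
hopper_gpus, hopper_mw = 8_000, 15
blackwell_gpus, blackwell_mw = 2_000, 4

gpu_reduction = hopper_gpus / blackwell_gpus    # 4x fewer GPUs
power_reduction = hopper_mw / blackwell_mw      # 3.75x lower power draw
print(f"{gpu_reduction:.2f}x fewer GPUs, {power_reduction:.2f}x less power")
# -> 4.00x fewer GPUs, 3.75x less power
```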
💥 On a GPT-3 benchmark, the GB200 delivers 7x the performance of the H100 and 4x the training speed.
🤖 Key Technical Improvements: a second-generation Transformer Engine with FP4 support that doubles effective compute, bandwidth, and supported model size.
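FP4 here is 4-bit floating point; in the E2M1 layout used by the OCP microscaling formats, the representable magnitudes are 0, 0.5, 1, 1.5, 2, 3, 4, and 6. A minimal round-to-nearest sketch to show why the format is so compact (an illustration of E2M1 under that assumption, not NVIDIA's implementation, which also involves scaling factors):

```python
# All signed FP4 (E2M1) values — assumed value set per the OCP MX spec.
FP4_VALUES = sorted({s * m for s in (-1.0, 1.0)
                     for m in (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)})

def quantize_fp4(x: float) -> float:
    """Round x to the nearest representable FP4 (E2M1) value."""
    return min(FP4_VALUES, key=lambda v: abs(v - x))

print(quantize_fp4(2.7), quantize_fp4(-5.2))  # -> 3.0 -6.0
```

Only 16 distinct values exist, which is why FP4 halves memory and bandwidth per weight relative to FP8 and doubles the model size that fits on a chip.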
👉 A new NVLink Switch for enhanced GPU-to-GPU communication, letting up to 576 GPUs connect at 1.8 TB/s of bandwidth per GPU.
📢 Large-Scale AI Systems: GB200 NVL72 racks combine 36 Grace CPUs and 72 Blackwell GPUs for up to 1.4 exaflops of inference performance, targeting large-scale AI training and inference workloads.
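A consistency check tying the rack figure back to the per-GPU number above (assuming the 1.4 exaflops is FP4 inference throughput, like the earlier 20-petaflop figure):

```python
# 1.4 exaflops spread across 72 GPUs -> per-GPU petaflops (1 exa = 1,000 peta).
rack_exaflops, gpus = 1.4, 72
per_gpu_petaflops = rack_exaflops * 1_000 / gpus
print(f"{per_gpu_petaflops:.1f} petaflops per GPU")  # -> 19.4 petaflops per GPU
```

That works out to roughly 19.4 petaflops per GPU, in line with the ~20 petaflops of FP4 quoted for a single B200.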