In fact, at the system level, Blackhole Galaxy should be competitive with Nvidia's HGX/DGX H100 and H200 systems, which manage roughly 15.8 petaFLOPS of dense FP8. Tenstorrent's use of onboard ...
In terms of raw FLOPS, the drop to FP4 nets Nvidia's best-specced Blackwell parts a 5x performance boost over the H100 running at FP8. Blackwell also boasts 1.4x more HBM that happens to offer 1 ...
While Intel is targeting Nvidia’s H100 that debuted in 2022 and the recently ... AI compute performance using the 8-bit floating point (FP8) format and quadruple the performance using the ...
Existing HGX H100-based systems are software- and hardware ... the system can provide “32 petaflops of FP8 deep learning compute and 1.1TB of aggregate high-bandwidth memory for the highest ...
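
The excerpts above quote two different system-level FP8 figures: roughly 15.8 petaFLOPS of dense FP8 and 32 petaflops of FP8 deep learning compute. A minimal sketch of the arithmetic behind comparing them follows. It assumes an 8-GPU system, which the excerpts do not state, and the names `GPUS_PER_SYSTEM` and `per_gpu_petaflops` are hypothetical and used only for illustration.

```python
# Assumption: 8 GPUs per HGX/DGX system (not stated in the excerpts).
GPUS_PER_SYSTEM = 8


def per_gpu_petaflops(system_petaflops: float, gpus: int = GPUS_PER_SYSTEM) -> float:
    """Divide a system-level throughput figure evenly across its GPUs."""
    return system_petaflops / gpus


# Figures quoted in the excerpts.
dense_fp8_system = 15.8   # "roughly 15.8 petaFLOPS of dense FP8"
quoted_fp8_system = 32.0  # "32 petaflops of FP8 deep learning compute"

dense_per_gpu = per_gpu_petaflops(dense_fp8_system)
quoted_per_gpu = per_gpu_petaflops(quoted_fp8_system)
ratio = quoted_fp8_system / dense_fp8_system

print(f"dense FP8 per GPU:  {dense_per_gpu:.3f} petaFLOPS")
print(f"quoted FP8 per GPU: {quoted_per_gpu:.3f} petaFLOPS")
print(f"quoted / dense:     {ratio:.2f}x")
```

The roughly 2x gap between the two figures would be consistent with the larger number including structured sparsity, though the excerpts do not say so. Per-GPU figures should be compared only when both are quoted on the same basis (dense or sparse, same precision).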