May 16, 2022

Botu Linum


NVIDIA H100 80 GB PCIe Accelerator With Hopper GPU Is Priced Over $30,000 US In Japan


NVIDIA’s recently announced H100 80 GB PCIe accelerator, based on the Hopper GPU architecture, has been listed for sale in Japan. It is the second accelerator to be listed with a price in the Japanese market, after the AMD Instinct MI210 PCIe, which was listed just a few days earlier.

NVIDIA H100 80 GB PCIe Accelerator With Hopper GPU Gets Listed In Japan For An Insane Price Exceeding $30,000 US

Unlike the H100 SXM5 configuration, the H100 PCIe offers cut-down specifications: 114 SMs are enabled out of the full 144 SMs on the GH100 GPU, compared with 132 SMs on the H100 SXM. The chip offers 3200 TFLOPs of FP8, 1600 TFLOPs of FP16, 800 TFLOPs of FP32, and 48 TFLOPs of FP64 compute horsepower. It also features 456 Tensor Cores and 456 Texture Units.
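As a quick sanity check on these figures, the short Python sketch below works out what share of the full GH100 die each variant enables and how many Tensor/Texture units that implies per SM. The variable and function names are illustrative, and the per-SM count is inferred from the numbers quoted in this article rather than taken from an NVIDIA specification.

```python
# SM counts quoted in the article
FULL_GH100_SMS = 144
H100_SXM_SMS = 132
H100_PCIE_SMS = 114
H100_PCIE_TENSOR_UNITS = 456  # Tensor / Texture units quoted for the PCIe card


def enabled_fraction(enabled_sms, total_sms=FULL_GH100_SMS):
    """Share of the full die's SMs that are enabled on a given variant."""
    return enabled_sms / total_sms


pcie_share = enabled_fraction(H100_PCIE_SMS)
sxm_share = enabled_fraction(H100_SXM_SMS)

# Inferred from the article's numbers, not an official per-SM specification
units_per_sm = H100_PCIE_TENSOR_UNITS / H100_PCIE_SMS

print(f"H100 PCIe: {pcie_share:.1%} of GH100 SMs enabled")
print(f"H100 SXM:  {sxm_share:.1%} of GH100 SMs enabled")
print(f"Tensor/Texture units per SM: {units_per_sm:g}")
```

On these numbers, the PCIe card enables roughly four fifths of the full die, which lines up with its compute ratings sitting at 80% of the SXM5 figures.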


Due to its lower peak compute horsepower, the H100 PCIe should operate at lower clocks and, as such, features a TDP of 350W, half the 700W TDP of the SXM5 variant. The PCIe card retains the 80 GB memory capacity across a 5120-bit bus interface, but uses HBM2e (>2 TB/s bandwidth) rather than HBM3.

According to gdm-or-jp, the Japanese distribution company gdep-co-jp has listed the NVIDIA H100 80 GB PCIe accelerator at a price of ¥4,313,000 ($33,120 US), or ¥4,745,950 including sales tax, which converts to $36,445 US. The accelerator is expected to ship in the second half of 2022 and will come in the standard dual-slot, passively cooled variant. The distributor also states that it will provide NVLINK bridges free of charge to customers who purchase multiple cards, though the bridges might ship at a later date.
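The conversion behind those figures can be reconstructed with a few lines of Python. This is only a sketch: the exchange rate and tax markup below are implied by the four prices quoted above, not values stated by the distributor.

```python
# Prices quoted in the listing
PRICE_JPY = 4_313_000   # before sales tax
PRICE_USD = 33_120
TOTAL_JPY = 4_745_950   # including sales tax
TOTAL_USD = 36_445

# Exchange rate implied by each yen/dollar pair (JPY per 1 USD)
implied_rate = PRICE_JPY / PRICE_USD
implied_rate_total = TOTAL_JPY / TOTAL_USD

# Markup implied by the tax-inclusive price relative to the base price
tax_markup = TOTAL_JPY / PRICE_JPY - 1

print(f"Implied rate (pre-tax):  {implied_rate:.2f} JPY/USD")
print(f"Implied rate (with tax): {implied_rate_total:.2f} JPY/USD")
print(f"Implied tax markup:      {tax_markup:.2%}")
```

Both price pairs imply a rate of roughly 130 yen to the dollar, and the tax-inclusive price is about 10% above the base price.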

Compared to the AMD Instinct MI210, which costs around $16,500 US in the same market, the NVIDIA H100 is more than double the price. The NVIDIA offering does boast much higher GPU performance figures than the AMD HPC accelerator while drawing 50W more power. The non-tensor FP32 performance of the H100 is rated at 48 TFLOPs, while the MI210 has a peak rated FP32 compute power of 45.3 TFLOPs. With Sparsity and Tensor operations, the H100 can output up to 800 TFLOPs of FP32 horsepower. The H100 also offers a higher memory capacity of 80 GB versus 64 GB on the MI210. From the looks of it, NVIDIA is charging a premium for its higher AI/ML capabilities.
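To put the price gap in perspective, the sketch below computes the price ratio along with rough dollars-per-TFLOP and dollars-per-GB figures. It uses only the numbers quoted in this article (pre-tax US dollar prices and non-tensor FP32 ratings); the `Accelerator` class and its property names are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    """Illustrative container for the figures quoted in the article."""
    name: str
    price_usd: float      # approximate pre-tax price in the Japanese market
    fp32_tflops: float    # non-tensor FP32 rating as quoted
    memory_gb: int

    @property
    def usd_per_tflop(self) -> float:
        return self.price_usd / self.fp32_tflops

    @property
    def usd_per_gb(self) -> float:
        return self.price_usd / self.memory_gb


h100 = Accelerator("NVIDIA H100 PCIe", 33_120, 48.0, 80)
mi210 = Accelerator("AMD Instinct MI210", 16_500, 45.3, 64)

price_ratio = h100.price_usd / mi210.price_usd
print(f"H100 / MI210 price ratio: {price_ratio:.2f}x")

for card in (h100, mi210):
    print(f"{card.name}: ${card.usd_per_tflop:,.0f} per FP32 TFLOP, "
          f"${card.usd_per_gb:,.0f} per GB")
```

On standard FP32 alone the H100 costs noticeably more per TFLOP, which fits the reading that the premium is tied to its tensor and sparsity performance rather than its non-tensor throughput.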

NVIDIA Hopper GH100 GPU Based H100 Specs Versus Previous Tesla Accelerators:

| NVIDIA Tesla Graphics Card | NVIDIA H100 (SXM5) | NVIDIA H100 (PCIe) | NVIDIA A100 (SXM4) | NVIDIA A100 (PCIe4) | Tesla V100S (PCIe) | Tesla V100 (SXM2) | Tesla P100 (SXM2) | Tesla P100 (PCI-Express) | Tesla M40 (PCI-Express) | Tesla K40 (PCI-Express) |
|---|---|---|---|---|---|---|---|---|---|---|
| GPU | GH100 (Hopper) | GH100 (Hopper) | GA100 (Ampere) | GA100 (Ampere) | GV100 (Volta) | GV100 (Volta) | GP100 (Pascal) | GP100 (Pascal) | GM200 (Maxwell) | GK110 (Kepler) |
| Process Node | 4nm | 4nm | 7nm | 7nm | 12nm | 12nm | 16nm | 16nm | 28nm | 28nm |
| Transistors | 80 Billion | 80 Billion | 54.2 Billion | 54.2 Billion | 21.1 Billion | 21.1 Billion | 15.3 Billion | 15.3 Billion | 8 Billion | 7.1 Billion |
| GPU Die Size | 814 mm² | 814 mm² | 826 mm² | 826 mm² | 815 mm² | 815 mm² | 610 mm² | 610 mm² | 601 mm² | 551 mm² |
| SMs | 132 | 114 | 108 | 108 | 80 | 80 | 56 | 56 | 24 | 15 |
| TPCs | 66 | 57 | 54 | 54 | 40 | 40 | 28 | 28 | 24 | 15 |
| FP32 CUDA Cores Per SM | 128 | 128 | 64 | 64 | 64 | 64 | 64 | 64 | 128 | 192 |
| FP64 CUDA Cores Per SM | 128 | 128 | 32 | 32 | 32 | 32 | 32 | 32 | 4 | 64 |
| FP32 CUDA Cores | 16896 | 14592 | 6912 | 6912 | 5120 | 5120 | 3584 | 3584 | 3072 | 2880 |
| FP64 CUDA Cores | 16896 | 14592 | 3456 | 3456 | 2560 | 2560 | 1792 | 1792 | 96 | 960 |
| Tensor Cores | 528 | 456 | 432 | 432 | 640 | 640 | N/A | N/A | N/A | N/A |
| Texture Units | 528 | 456 | 432 | 432 | 320 | 320 | 224 | 224 | 192 | 240 |
| Boost Clock | TBD | TBD | 1410 MHz | 1410 MHz | 1601 MHz | 1530 MHz | 1480 MHz | 1329 MHz | 1114 MHz | 875 MHz |
| TOPs (DNN/AI) | 2000 TOPs / 4000 TOPs | 1600 TOPs / 3200 TOPs | 1248 TOPs / 2496 TOPs with Sparsity | 1248 TOPs / 2496 TOPs with Sparsity | 130 TOPs | 125 TOPs | N/A | N/A | N/A | N/A |
| FP16 Compute | 2000 TFLOPs | 1600 TFLOPs | 312 TFLOPs / 624 TFLOPs with Sparsity | 312 TFLOPs / 624 TFLOPs with Sparsity | 32.8 TFLOPs | 30.4 TFLOPs | 21.2 TFLOPs | 18.7 TFLOPs | N/A | N/A |
| FP32 Compute | 1000 TFLOPs | 800 TFLOPs | 156 TFLOPs (19.5 TFLOPs standard) | 156 TFLOPs (19.5 TFLOPs standard) | 16.4 TFLOPs | 15.7 TFLOPs | 10.6 TFLOPs | 10.0 TFLOPs | 6.8 TFLOPs | 5.04 TFLOPs |
| FP64 Compute | 60 TFLOPs | 48 TFLOPs | 19.5 TFLOPs (9.7 TFLOPs standard) | 19.5 TFLOPs (9.7 TFLOPs standard) | 8.2 TFLOPs | 7.80 TFLOPs | 5.30 TFLOPs | 4.7 TFLOPs | 0.2 TFLOPs | 1.68 TFLOPs |
| Memory Interface | 5120-bit HBM3 | 5120-bit HBM2e | 6144-bit HBM2e | 6144-bit HBM2e | 4096-bit HBM2 | 4096-bit HBM2 | 4096-bit HBM2 | 4096-bit HBM2 | 384-bit GDDR5 | 384-bit GDDR5 |
| Memory Size | Up To 80 GB HBM3 @ 3.0 Gbps | Up To 80 GB HBM2e @ 2.0 Gbps | Up To 40 GB HBM2 @ 1.6 TB/s / Up To 80 GB HBM2 @ 1.6 TB/s | Up To 40 GB HBM2 @ 1.6 TB/s / Up To 80 GB HBM2 @ 2.0 TB/s | 16 GB HBM2 @ 1134 GB/s | 16 GB HBM2 @ 900 GB/s | 16 GB HBM2 @ 732 GB/s | 16 GB HBM2 @ 732 GB/s / 12 GB HBM2 @ 549 GB/s | 24 GB GDDR5 @ 288 GB/s | 12 GB GDDR5 @ 288 GB/s |
| L2 Cache Size | 51200 KB | 51200 KB | 40960 KB | 40960 KB | 6144 KB | 6144 KB | 4096 KB | 4096 KB | 3072 KB | 1536 KB |
| TDP | 700W | 350W | 400W | 250W | 250W | 300W | 300W | 250W | 250W | 235W |
