New Features (key highlights from my perspective)
- Streaming Multiprocessors (SM): Throughput per clock cycle for many integer arithmetic operations is doubled compared to NVIDIA Ada GPUs.
- Tensor cores:
  - new FP4 capabilities
  - new Second-Generation FP8 Transformer Engine
- GDDR7 Memory
  - ultra-low voltage
  - PAM3 (Pulse Amplitude Modulation) signaling technology
  - higher-speed memory subsystems
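To give a feel for what PAM3 signaling buys, here is a minimal sketch of the information-theoretic arithmetic. The function name `ideal_bits_per_symbol` is hypothetical. The remark about packing 3 bits into 2 symbols reflects my understanding of how GDDR7 uses PAM3 and is not taken from the whitepaper notes above.

```python
import math


def ideal_bits_per_symbol(levels: int) -> float:
    """Upper bound on bits carried by one symbol with `levels` signal levels."""
    return math.log2(levels)


# Two-level signaling carries exactly 1 bit per symbol.
two_level = ideal_bits_per_symbol(2)

# PAM3 uses three signal levels, so the ideal bound is log2(3) bits per symbol.
pam3 = ideal_bits_per_symbol(3)

# Assumption (not from the source): a practical encoding packs 3 bits into
# 2 PAM3 symbols. This is possible because 3**2 = 9 symbol pairs cover the
# 2**3 = 8 bit patterns, giving 1.5 bits per symbol.
pairs_cover_three_bits = 3**2 >= 2**3
practical_pam3 = 3 / 2

print(f"2-level: {two_level:.3f} bits/symbol")
print(f"PAM3 (ideal): {pam3:.3f} bits/symbol")
print(f"PAM3 (3 bits per 2 symbols): {practical_pam3:.1f} bits/symbol")
```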
Blackwell GB202 GPU
- 12 Graphics Processing Clusters (GPCs)
- 96 Texture Processing Clusters (TPCs)
- 192 Streaming Multiprocessors (SMs)
- a 512-bit memory interface with sixteen 32-bit memory controllers
- 128 MB of total L2 cache on the full GB202 chip
- 96 MB of total L2 cache on the RTX 5090
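As a quick sanity check on the memory figures in this list, the following sketch multiplies the controller count by the controller width and compares the two L2 sizes. The constant names are my own; the numbers come from the list.

```python
# Figures for the full GB202 chip, taken from the list above.
MEMORY_CONTROLLERS = 16
BITS_PER_CONTROLLER = 32

# Sixteen 32-bit controllers together form the memory interface.
bus_width_bits = MEMORY_CONTROLLERS * BITS_PER_CONTROLLER
print(f"Memory interface width: {bus_width_bits} bits")  # 512 bits

# L2 cache on the RTX 5090 relative to the full chip.
L2_FULL_GB202_MB = 128
L2_RTX5090_MB = 96
l2_fraction = L2_RTX5090_MB / L2_FULL_GB202_MB
print(f"RTX 5090 L2 as a fraction of full GB202: {l2_fraction:.0%}")  # 75%
```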
Fig. 1 [GB202 GPU block diagram (full chip)] From NVIDIA RTX Blackwell GPU Architecture, ver. 1.1, NVIDIA Corp., 2025, p. 8.

Fig. 2 [The Blackwell GPC] From NVIDIA RTX Blackwell GPU Architecture, ver. 1.1, NVIDIA Corp., 2025, p. 9.

- The full GB202 GPU is made up of 12 GPCs (as shown in Fig. 1)
- Each GPC contains 8 TPCs (as shown in Fig. 2)
- Each TPC contains 2 SMs (as shown in Fig. 2)
- Each SM contains 128 CUDA cores (as shown in Fig. 2)
- 12 GPCs × 8 TPCs × 2 SMs × 128 CUDA cores = 24,576 CUDA cores
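The hierarchy can be checked by multiplying the counts at each level. The sketch below does that with a small dataclass. `GpuTopology` and its field names are hypothetical names chosen for illustration; the numbers are the ones listed for the full GB202 chip.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuTopology:
    """Counts at each level of the GPC > TPC > SM > CUDA core hierarchy."""
    gpcs: int
    tpcs_per_gpc: int
    sms_per_tpc: int
    cuda_cores_per_sm: int

    @property
    def tpcs(self) -> int:
        return self.gpcs * self.tpcs_per_gpc

    @property
    def sms(self) -> int:
        return self.tpcs * self.sms_per_tpc

    @property
    def cuda_cores(self) -> int:
        return self.sms * self.cuda_cores_per_sm


# Full GB202 chip as described in the list above.
full_gb202 = GpuTopology(gpcs=12, tpcs_per_gpc=8, sms_per_tpc=2, cuda_cores_per_sm=128)

print(full_gb202.tpcs, full_gb202.sms, full_gb202.cuda_cores)  # 96 192 24576
```

The intermediate totals match the 96 TPCs and 192 SMs quoted for the full chip.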
An interesting detail:

"The FP64 TFLOP rate is 1/64th the TFLOP rate of FP32 operations. The small number of FP64 Cores are included to ensure any programs with FP64 code operate correctly. Similarly, a very minimal number of FP64 Tensor Cores are included for program correctness." (NVIDIA Corporation, NVIDIA RTX Blackwell GPU Architecture, Version 1.1, p. 8)

Fig. 3 [The Blackwell Streaming Multiprocessor (SM)] From NVIDIA RTX Blackwell GPU Architecture, ver. 1.1, NVIDIA Corp., 2025, p. 11.

SM Architecture
- 128 CUDA cores
- a 256 KB Register File
- 128 KB of L1/Shared Memory
- 4 Blackwell Fifth-Generation Tensor Cores
- 4 Texture Units
- 1 Blackwell Fourth-Generation RT Core
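Scaling the per-SM resources up to the full chip gives a sense of the aggregate sizes. This is a minimal sketch that assumes all 192 SMs of the full GB202 are enabled and that 1 MB = 1024 KB. The dictionary keys are names I chose for illustration.

```python
# SM count of the full GB202 chip (from the GB202 section).
SMS_FULL_GB202 = 192

# Per-SM resources, as listed above. Sizes are in KB.
PER_SM = {
    "cuda_cores": 128,
    "register_file_kb": 256,
    "l1_shared_kb": 128,
    "tensor_cores": 4,
    "texture_units": 4,
    "rt_cores": 1,
}

# Multiply every per-SM figure by the number of SMs.
totals = {name: count * SMS_FULL_GB202 for name, count in PER_SM.items()}

for name, value in totals.items():
    if name.endswith("_kb"):
        # Convert KB to MB for readability (assumes 1 MB = 1024 KB).
        print(f"{name[:-3]}: {value // 1024} MB")
    else:
        print(f"{name}: {value}")
```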
Blackwell 5th Generation Tensor Cores
- FP4, FP6, FP8, FP16
- BF16
- TF32
- INT8
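One practical consequence of the narrower formats is a smaller storage footprint. The sketch below estimates the raw bytes needed for a tensor in each listed format. The helper `tensor_bytes` and the element count are hypothetical. The estimate ignores any scale factors or metadata that low-precision schemes may store alongside the values. TF32 is counted as 32 bits on the assumption that values are held in 32-bit containers, which is not stated in the list above.

```python
# Bits per stored element for each format listed above.
# Assumption: TF32 values occupy 32-bit containers in memory.
BITS_PER_ELEMENT = {
    "FP4": 4,
    "FP6": 6,
    "FP8": 8,
    "FP16": 16,
    "BF16": 16,
    "TF32": 32,
    "INT8": 8,
}


def tensor_bytes(num_elements: int, fmt: str) -> int:
    """Raw storage in bytes, rounded up, ignoring scale factors and padding."""
    total_bits = num_elements * BITS_PER_ELEMENT[fmt]
    return (total_bits + 7) // 8


# Hypothetical tensor with one million elements.
NUM_ELEMENTS = 1_000_000
for fmt in BITS_PER_ELEMENT:
    print(f"{fmt:>5}: {tensor_bytes(NUM_ELEMENTS, fmt):>9,} bytes")
```

Under these assumptions, FP4 needs a quarter of the storage of FP16 for the same number of elements.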
[reference]
NVIDIA Corporation. NVIDIA RTX Blackwell GPU Architecture. Version 1.1, 2025, https://images.nvidia.com/aem-dam/Solutions/geforce/blackwell/nvidia-rtx-blackwell-gpu-architecture.pdf.