NikhilRout/TheTensorCoreProject

TheTensorCoreProject

A microarchitecture-level implementation of my interpretation of NVIDIA's SIMT CUDA cores and hybrid-precision Tensor Cores, and of the systolic-array MXU in Google's TPU.

Tensor Core Versions

TensorCore v0: Volta Architecture [FP16MUL FP32ADD]

Volta Tensor Core Architecture Diagram
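
The [FP16MUL FP32ADD] label means operands are multiplied in half precision while the running sum is kept in single precision. The following is a rough numerical sketch of that idea for D = A x B + C, not the repository's RTL and not bit-accurate to any hardware. The helper names (`to_fp16`, `to_fp32`, `mma_fp16_fp32`) are hypothetical, and rounding is emulated with Python's `struct` module.

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_fp32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def mma_fp16_fp32(a, b, c):
    """Compute D = A @ B + C for square matrices given as lists of lists.

    A and B are rounded to FP16 before multiplying; C and the running sum
    are held in FP32. This is an assumed model of the datapath, for
    illustration only.
    """
    n = len(a)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = to_fp32(c[i][j])
            for k in range(n):
                # An FP16 x FP16 product has at most 22 significant bits,
                # so it is exactly representable in FP32.
                prod = to_fp16(a[i][k]) * to_fp16(b[k][j])
                acc = to_fp32(acc + prod)
            d[i][j] = acc
    return d

# 2049 is not representable in FP16 and rounds to 2048, while the FP32
# accumulator still preserves the 0.5 contributed by C.
result = mma_fp16_fp32([[2049.0]], [[1.0]], [[0.5]])
```

The example at the end shows both effects at once: the input loses precision when rounded to FP16, while the wider accumulator keeps a small addend that an FP16 sum would drop.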

TensorCore v1: Ampere Architecture [TF32MUL FP32ADD / BF16MUL FP32ADD] + Fine-Grained Structured Sparsity

Ampere Tensor Core Architecture Diagram
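
Two ideas from the v1 heading can be sketched in software. TF32 and BF16 both keep FP32's 8-bit exponent but shorten the mantissa to 10 and 7 bits respectively, and 2:4 fine-grained structured sparsity keeps two of every four weights along with their positions. The code below is an illustrative model under those assumptions, not the repository's implementation. All function names are hypothetical, NaN and infinity handling is omitted, and choosing the two largest-magnitude weights is one common pruning rule rather than something the source specifies.

```python
import struct

def round_mantissa(x, keep_bits):
    """Round an FP32 value to keep_bits mantissa bits (ties to even).

    keep_bits=10 approximates TF32, keep_bits=7 approximates BF16.
    NaN and infinity are not handled in this sketch.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    drop = 23 - keep_bits
    # Add just under half an ULP, plus one if the kept LSB is odd.
    bias = (1 << (drop - 1)) - 1 + ((bits >> drop) & 1)
    bits = (bits + bias) & ~((1 << drop) - 1) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def prune_2_of_4(weights):
    """Keep the two largest-magnitude weights in each group of four.

    Returns the kept values and their positions (0..3) within each group.
    """
    assert len(weights) % 4 == 0, "length must be a multiple of 4"
    values, indices = [], []
    for g in range(0, len(weights), 4):
        group = weights[g:g + 4]
        by_magnitude = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)
        keep = sorted(by_magnitude[:2])
        values.extend(group[i] for i in keep)
        indices.extend(keep)
    return values, indices

def sparse_dot(values, indices, activations):
    """Dot product that reads only the activations selected by the indices."""
    total = 0.0
    for n, (v, idx) in enumerate(zip(values, indices)):
        group_base = (n // 2) * 4  # two kept weights per group of four
        total += v * activations[group_base + idx]
    return total

weights = [0.1, -2.0, 0.0, 3.0, 5.0, 0.2, -0.3, 4.0]
values, indices = prune_2_of_4(weights)
dot = sparse_dot(values, indices, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
```

Storing only the kept values and their in-group positions halves the weight storage, and the index list plays the role of the selector that picks matching activations.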

TensorCore v2: Hopper Architecture [FP8(E5M2/E4M3)MUL FP16ADD]

Hopper Tensor Core Architecture Diagram
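
The two FP8 encodings named in the v2 heading split their eight bits differently: E5M2 uses 5 exponent and 2 mantissa bits with IEEE-style infinities and NaNs, while E4M3 uses 4 exponent and 3 mantissa bits, has no infinities, and reserves only the all-ones pattern for NaN. The sketch below decodes both formats and accumulates products in FP16, matching the [FP8 MUL FP16ADD] label. It is a software illustration with hypothetical function names, not the repository's hardware design.

```python
import math
import struct

def decode_fp8(byte, exp_bits, man_bits, bias, ieee_specials):
    """Decode one FP8 code (0..255) to a Python float."""
    sign = -1.0 if byte >> 7 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    max_exp = (1 << exp_bits) - 1
    if exp == max_exp:
        if ieee_specials:
            return sign * math.inf if man == 0 else math.nan
        if man == (1 << man_bits) - 1:
            return math.nan  # E4M3: only the all-ones pattern is NaN
    if exp == 0:
        # Subnormal: no implicit leading one.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)

def decode_e5m2(byte):
    return decode_fp8(byte, exp_bits=5, man_bits=2, bias=15, ieee_specials=True)

def decode_e4m3(byte):
    return decode_fp8(byte, exp_bits=4, man_bits=3, bias=7, ieee_specials=False)

def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def fp8_dot_fp16_acc(a_codes, b_codes, decode):
    """Dot product of FP8-encoded vectors with an FP16 running sum."""
    acc = 0.0
    for a, b in zip(a_codes, b_codes):
        acc = to_fp16(acc + decode(a) * decode(b))
    return acc

# E4M3 codes 0x38 and 0x40 decode to 1.0 and 2.0.
total = fp8_dot_fp16_acc([0x38, 0x40], [0x38, 0x38], decode_e4m3)
```

Because `struct` raises `OverflowError` for values beyond the FP16 range, this sketch fails loudly rather than saturating when the accumulator overflows. Real hardware may behave differently.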