-It's pretty clear that ML compilers are going to be a big deal. NVIDIA's TensorRT is also an ML compiler, but it only targets NVIDIA's own GPUs. Once the machine code generated by cross-vendor ML compilers is comparable in performance to hand-tuned kernels, these compilers are going to break the (in)famous CUDA moat. And thankfully, this will also finally make AMD's consumer GPUs more accessible to developers (by removing the dependence on ROCm, which AMD supports terribly on consumer GPUs). Yes, cheap shot, but I've lost a lot of hair trying to support AMD's consumer GPUs over the years.