-The basic idea of an ML compiler is to treat an ML model's execution graph as a program to compile, and to produce an optimized set of GPU-specific instructions. The compiler can optimize the execution graph by doing things like fusing operations together, parallelizing operations when possible, and even mapping groups of operators to GPU-specific instructions. It can use its knowledge of the target GPU architecture to optimize the memory layout and parallelism of operations. Basically what compilers already do for CPUs today, but for GPUs.
0 commit comments