-The basic idea of an ML compiler is to consider an ML model's execution graph as a program to compile, and produce an optimized set of GPU-specific instructions. The compiler can optimize the execution graph by doing things like fusing operations together, parallelizing operations when possible, and even mapping groups of operators to GPU-specific instructions. It can use its knowledge of the target GPU architecture to optimize the memory layout and parallelism of operations. Basically what compilers already do for CPUs today, but for GPUs.
0 commit comments