Open
Description
The switch to KA.jl significantly slowed down several operations.
CUDA.jl: permutedims, broadcast, and many others
https://speed.juliagpu.org/changes/?tre=10&rev=6221589f5befec8f6f157a5a5271667dba09d0b6&exe=11&env=1
Metal.jl: permutedims

| Benchmark | Current (ns) | Previous (ns) | Ratio |
|---|---|---|---|
| private array/permutedims/4d | 2911500 | 860084 | 3.39 |
| private array/permutedims/2d | 1065021 | 862229.5 | 1.24 |
| private array/permutedims/3d | 1629229 | 919520.5 | 1.77 |
| shared array/permutedims/4d | 2933000 | 858875 | 3.41 |
| shared array/permutedims/2d | 1054250 | 862292 | 1.22 |
| shared array/permutedims/3d | 1625958 | 923916.5 | 1.76 |