Some performance tips from @ggerganov: https://x.com/ggerganov/status/1909657397964292209

We should turn these on by default for the cases described (CUDA, Metal, etc.). @kpouget, this might be up your alley :)