-
With your newest improvement, my inference time is reduced from 54-58 s/step down to 44-48 s/step. Windows 10 laptop; Intel i5-8250U; 12 GB RAM, 2600 MHz; generating a 512 x 512 image. I'm using your Windows-ready AVX2 build.
-
Hey @leejet. With BLAS it's 44-48 s/step, but without BLAS it's 34-38 s/step. I wonder how this could happen; is BLAS itself not a good fit for sd.cpp? Ubuntu 22.04 (Jammy) laptop; Intel i5-8250U; 12 GB RAM, 2600 MHz; generating a 512 x 512 image.
-
Original:

```
./bin/sd -m ../models/sd-v1-4-ggml-model-f16.bin -p "a lovely cat" -v
```

Improvement 1:

```
./bin/sd -m ../models/sd-v1-4-ggml-model-f16.bin -p "a lovely cat" -v
```

Improvement 2:

```
./bin/sd -m ../models/sd-v1-4-ggml-model-f16.bin -p "a lovely cat" -v
```