Add numpy benchmarks to fft and blas. #80
pavanky merged 3 commits into arrayfire:devel from drufat:numpy
|
@drufat calling The proper way to benchmark it would be to call the functions inside |
|
Can you also change |
|
@pavanky My understanding is that the purpose of running multiple iterations in a benchmark is to reduce the variability in the result. The actual computation should not be modified, and the comparison should still be based on the time it takes for a single run to complete. If
Here is a comparison of the run times on my computer for the old version and the new version.
old bench_blas.py
new bench_blas.py
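The point about iterations can be sketched with the standard library alone. This is a minimal illustration, not the code from `bench_blas.py`: `workload` and `time_single_run` are hypothetical names, and the workload is a plain Python stand-in for the benchmarked function.

```python
import statistics
import time


def workload(n=10_000):
    # Hypothetical stand-in for the function being benchmarked.
    return sum(i * i for i in range(n))


def time_single_run(func, iters=20):
    """Time each call separately and summarize the per-run durations."""
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    # Repeating the run only reduces noise in the estimate; the quantity
    # reported is still the time of one call.
    return min(samples), statistics.median(samples)


best, median = time_single_run(workload)
print(f"best {best:.6f} s, median {median:.6f} s per run")
```

Timing every call on its own is reasonable for synchronous code like this; for asynchronous GPU calls it requires a synchronization per call, which is the overhead discussed in the replies.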
|
@drufat You don't need to call If you are calling it in every iteration, it is going to increase the overhead. For blas, this is a smaller overhead (about 5-10% at smaller sizes), but it is going to be higher for FFT. |
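A rough sketch of the two timing patterns being compared, assuming an asynchronous device where launching work returns immediately and a sync call blocks until the queue is empty. `FakeDevice`, `launch`, and `bench` are hypothetical stand-ins written for this illustration; they are not ArrayFire calls, and the fake device does no real work, so only the number of sync calls is meaningful here.

```python
import time


class FakeDevice:
    """Hypothetical stand-in for an asynchronous GPU queue (not a real API)."""

    def __init__(self):
        self.pending = 0
        self.sync_calls = 0

    def launch(self):
        # Queue work and return immediately, as an asynchronous launch would.
        self.pending += 1

    def sync(self):
        # Block until all queued work has finished.
        self.pending = 0
        self.sync_calls += 1


def bench(device, iters, sync_every_iteration):
    device.sync()  # make sure earlier work does not leak into the timing
    start = time.perf_counter()
    for _ in range(iters):
        device.launch()
        if sync_every_iteration:
            device.sync()
    if not sync_every_iteration:
        device.sync()  # a single sync covers every queued call
    # Average time per call over the whole loop.
    return (time.perf_counter() - start) / iters


dev_each, dev_once = FakeDevice(), FakeDevice()
t_each = bench(dev_each, 100, sync_every_iteration=True)
t_once = bench(dev_once, 100, sync_every_iteration=False)
print(dev_each.sync_calls, dev_once.sync_calls)  # 101 2
```

With the second pattern the synchronization cost is paid once and amortized over all iterations, so it matters less at small problem sizes.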
|
Here are the benchmarks for the FFT.
old bench_fft.py
new bench_fft.py
|
@drufat It's still significantly slower at smaller sizes. Besides, always calling sync gives the wrong impression about how things should be timed on the GPU. |
…c() is run only once.
|
Alright. I switched the code back to the time module. Now |
Compares the following functions
Also switch to `timeit` for benchmark measurements. `timeit` turns off Python garbage collection during measurements, so the results may be more accurate.
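For reference, a small sketch of how `timeit` can be used this way. The `workload` function is a hypothetical stand-in, not one of the functions benchmarked in this PR. By default `timeit` disables garbage collection while timing; passing `gc.enable()` in the setup string turns it back on if collection should be part of the measurement.

```python
import timeit


def workload():
    # Hypothetical stand-in for the benchmarked function.
    return sorted(range(1000), reverse=True)


NUMBER = 100  # calls per repetition

# repeat() returns one total per repetition; each total covers NUMBER calls.
# Garbage collection is disabled while these timings run.
totals = timeit.Timer(workload).repeat(repeat=5, number=NUMBER)
per_call = min(totals) / NUMBER

# Same measurement with garbage collection re-enabled during timing.
totals_gc = timeit.Timer(workload, setup="import gc; gc.enable()").repeat(
    repeat=5, number=NUMBER
)
per_call_gc = min(totals_gc) / NUMBER

print(f"gc off: {per_call:.2e} s/call, gc on: {per_call_gc:.2e} s/call")
```

Taking the minimum over repetitions is a common choice because the slower repetitions usually reflect interference from other processes, not the code under test.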