For both CHIUW and an eventual chplUltra paper (hopefully soon), it would be good to have a direct performance comparison of the different distributed FFT implementations:
- Chapel
- PFFT (https://github.com/mpip/pfft) (2D decomposition)
- FFTW (1D decomposition, MPI only)
- FFTW (1D decomposition, MPI+OpenMP -- I expect this not to scale well)
- P3DFFT (https://www.p3dfft.net/) (this only has real-to-complex and complex-to-real transforms, so it requires closing #66, "Extend the FFTW interfaces to support R2C and C2R transforms", and the related topic in #36, "Future improvements")
We should do basic timing runs for all of these -- just forward and backward FFTs -- ideally exploring both the strong and weak scaling of each implementation.
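As a rough sketch of how the timings could be reduced to scaling numbers, the helpers below compute strong- and weak-scaling efficiency from wall-clock times. The function names and the sample timings are hypothetical placeholders, not measurements from any of the libraries above.

```python
def strong_scaling_efficiency(t_base, p_base, t, p):
    """Efficiency for a FIXED total problem size.

    Ideal run time falls as 1/p, so efficiency is the ratio of
    core-time at the baseline to core-time at p processes.
    """
    return (t_base * p_base) / (t * p)


def weak_scaling_efficiency(t_base, t):
    """Efficiency for a FIXED problem size per process.

    Uses the simple constant-time ideal. FFT work grows as N log N,
    so perfectly flat times are not expected even in the best case.
    """
    return t_base / t


if __name__ == "__main__":
    # Placeholder timings in seconds, keyed by process count.
    # These are made-up illustrative values, NOT real results.
    strong_times = {1: 8.0, 2: 4.0, 4: 2.5}
    weak_times = {1: 8.0, 2: 8.0, 4: 10.0}

    for p, t in strong_times.items():
        eff = strong_scaling_efficiency(strong_times[1], 1, t, p)
        print(f"strong  p={p}  efficiency={eff:.2f}")

    for p, t in weak_times.items():
        eff = weak_scaling_efficiency(weak_times[1], t)
        print(f"weak    p={p}  efficiency={eff:.2f}")
```

Reporting efficiency rather than raw time makes it easier to put the implementations on one plot, since each is normalized against its own baseline run.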
ronawho