R.A. Namneh, W.D. Pan, and S.-M. Yoo
FFT, inter-processor communication, SMP, Beowulf cluster
Computing 1-D Fast Fourier Transform (FFT) using the conventional 4-step FFT on parallel computers requires intensive all-to- all communication, which can adversely affect the performance of FFT. In this paper, we present 2-step-no-communication and 3- step-no-communication algorithms, which are parallel algorithms for 1-D FFT without interprocessor communication. One of the main advantages of these algorithms is the absence of all-to-all communication between processors, albeit at the expense of increased computation compared to the conventional 4-step FFT. If the cost of extra computation required by the 2-step-no-communication and the 3-step-no-communication algorithms is more than offset by the cost of all-to-all communication in the 4-step FFT, then these two no-communication algorithms will outperform the 4-step FFT algorithm. We test the 2-step-no-communication and the 3-step-no- communication algorithms in two parallel systems (a 32-node Beowulf cluster and 8-node symmetric multiprocessors), with varying costs of all-to-all communication and computation. The experimental results show that the no-communication algorithms perform better than the 4-step FFT in the SMP only for relatively small data sizes, but the no-communication algorithms outperform the 4-step FFT in the Beowulf cluster for all data sizes tested.
Important Links:
Go Back