Sorting on a Massively Parallel System Using a Library of Basic Primitives: Modeling and Experimental Results
Alf Wachsmann Rolf Wanka DESY-IfH Zeuthen International Computer Science Institute 15735 Zeuthen 1947 Center Street Germany Berkeley, CA 94704, USA email: email@example.com email: firstname.lastname@example.org
We present a comparative study of implementations of the following sorting algorithms on the Parsytec SC320 reconfigurable, asynchronous, massively parallel MIMD machine:
- Bitonic Sort, Odd-Even Merge Sort, Odd-Even Merge Sort with guarded split&merge, and two variants of Samplesort.
The experiments are performed on 2- up to 5-dimensional wrapped butterfly networks with 8 up to 160 processors. We make use of library functions that provide primitives for global variables and synchronization, and we show that it is possible to implement efficient and portable programs easily. In order to predict the performance, we model the runtime of an access to a global variable by a certain trilinear function and the runtime of a synchronization by a certain bilinear function. Our experiments show that, in the context of parallel sorting, this model that can be applied easily is sufficiently detailed to give good runtime predictions. The experiments confirming the predictions point out that Odd-Even Merge Sort with guarded split&merge is the fastest method if the processors hold few keys. If there are many keys per processor, a combination of Samplesort and Odd-Even Merge Sort is the fastest method.
Paper in compressed Postscript format (gzip used):
A4 format US letter format
© Copyright 1997 Springer-Verlag. Published in the Proceedings of the 3rd European Conference in Parallel Processing (EURO-PAR), pp. 399-408, 1997.