A Comparison of Parallel Sorting Algorithms on Different Architectures Nancy M. Amato, Ravishankar Iyer, Sharad Sundaresan and Yan Wu Texas A&M University Hoang Bui CMPS 5443

Post on 21-Jan-2016

Overview

Introduction
Machine Descriptions
Parallel Sorting Algorithms
Experimental Results
Comparisons and Recommendations

Introduction

Sorting is one of the fundamental problems in computer science.
It can be done sequentially or in parallel.
Parallel sorting must contend with a variety of parallel architectures.
Experimental study of three algorithms: bitonic sort, sample sort and parallel radix sort.
Each algorithm is applied to three different machines.

Machine Descriptions

The MasPar MP1202.
The nCUBE 2.
The Sequent Balance.

The MasPar MP1202

SIMD machine.
2,048 processors, 1.8 MIPS, 16 Kbytes RAM.
Mesh-based architecture.
X-Net and Global Router for communication.
Programming languages: a C-like language and Fortran.

The nCUBE 2

MIMD machine.
64 processors, 7.5 MIPS, 1 Mbyte RAM.
Hypercube architecture.
Broadcasting.
Language: C.

The Sequent Balance

MIMD shared-memory machine.
10 processors, 8 Kbytes RAM.
Processors communicate through shared memory.
Runs Unix; language: C.

Parallel Sorting Algorithms

Bitonic Sort.
Sample Sort.
Parallel Radix Sort.

Bitonic Sort

Bitonic sequence: the concatenation of an ascending and a descending sequence of numbers.
Example: 2, 4, 6, 8, 9, 24, 6, 3, 2, 0.

Algorithm:
Convert the n numbers into a bitonic sequence, with n/2 numbers in an increasing subsequence and n/2 in a decreasing subsequence.
Merge the bitonic sequence into an ordered sequence (increasing or decreasing).
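
The two steps above can be sketched as a short sequential Python simulation. This is an illustration only, not the C code used in the study: the function names `bitonic_merge` and `bitonic_sort` are made up here, and the sketch assumes the input length is a power of two. On a parallel machine, the compare-exchange operations inside one merge step are independent and could run at the same time.

```python
def bitonic_merge(seq, ascending=True):
    """Sort a bitonic sequence (length assumed to be a power of two)."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    lo, hi = list(seq[:half]), list(seq[half:])
    # Compare-exchange element i with element i + n/2. These n/2
    # exchanges do not depend on each other.
    for i in range(half):
        if (lo[i] > hi[i]) == ascending:
            lo[i], hi[i] = hi[i], lo[i]
    # Each half is again bitonic, so merge the halves recursively.
    return bitonic_merge(lo, ascending) + bitonic_merge(hi, ascending)


def bitonic_sort(seq, ascending=True):
    """Build a bitonic sequence from two sorted halves, then merge it."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    if n & (n - 1):
        raise ValueError("this sketch assumes a power-of-two length")
    half = n // 2
    first = bitonic_sort(seq[:half], True)    # increasing half
    second = bitonic_sort(seq[half:], False)  # decreasing half
    return bitonic_merge(first + second, ascending)
```

For example, `bitonic_sort([170, 45, 75, 90, 2, 24, 802, 66])` returns the eight keys in increasing order, and passing `ascending=False` returns them in decreasing order.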

Sample Sort

Algorithm:
Select p-1 splitters, which divide the keys into p buckets.
Put each number into the appropriate bucket.
Sort each bucket.
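
A minimal sequential sketch of these three steps in Python follows. The slide does not say how the splitters are chosen, so the random oversampling used here is an assumption, as are the function name `sample_sort` and the `oversample` and `seed` parameters. In a parallel setting each of the p buckets would typically be sorted by a different processor.

```python
import bisect
import random


def sample_sort(keys, p, oversample=4, seed=0):
    """Sequential sketch of sample sort with p buckets."""
    keys = list(keys)
    if p <= 1 or len(keys) <= p:
        return sorted(keys)
    rng = random.Random(seed)
    # Assumption: draw a random sample and take p-1 evenly spaced
    # elements of the sorted sample as splitters.
    sample = sorted(rng.sample(keys, min(len(keys), p * oversample)))
    step = len(sample) / p
    splitters = [sample[int(step * i)] for i in range(1, p)]
    # Put each key into the bucket selected by the splitters.
    buckets = [[] for _ in range(p)]
    for k in keys:
        buckets[bisect.bisect_right(splitters, k)].append(k)
    # Sort each bucket; concatenating the buckets in order gives the result.
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))
    return result
```

How evenly the buckets fill depends on the quality of the splitters, which is why a larger sample is often drawn than the p-1 splitters strictly require.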

Parallel Radix Sort

Example list: 170, 45, 75, 90, 2, 24, 802, 66

Algorithm:
Sorting by the least significant digit (1s place) gives: 170, 90, 2, 802, 24, 45, 75, 66.
Sorting by the next digit (10s place) gives: 2, 802, 24, 45, 66, 170, 75, 90.
Sorting by the most significant digit (100s place) gives: 2, 24, 45, 66, 75, 90, 170, 802.
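
The digit-by-digit passes in this example can be reproduced with a small sequential Python sketch. The names `radix_pass` and `radix_sort` are illustrative, and the sketch assumes non-negative integer keys. Each pass must be stable (keys with equal digits keep their relative order), which is what lets later passes build on earlier ones. A parallel version would split the keys across processors for each pass; that part is not shown here.

```python
def radix_pass(keys, place, base=10):
    """One stable pass: group keys by the digit at the given place value."""
    buckets = [[] for _ in range(base)]
    for k in keys:
        buckets[(k // place) % base].append(k)
    # Reading the buckets in order keeps equal-digit keys in input order.
    return [k for bucket in buckets for k in bucket]


def radix_sort(keys, base=10):
    """Least-significant-digit radix sort for non-negative integers."""
    keys = list(keys)
    if not keys:
        return keys
    largest = max(keys)
    place = 1
    while place <= largest:
        keys = radix_pass(keys, place, base)
        place *= base
    return keys


example = [170, 45, 75, 90, 2, 24, 802, 66]
after_ones = radix_pass(example, 1)      # [170, 90, 2, 802, 24, 45, 75, 66]
after_tens = radix_pass(after_ones, 10)  # [2, 802, 24, 45, 66, 170, 75, 90]
print(radix_sort(example))               # [2, 24, 45, 66, 75, 90, 170, 802]
```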

Experimental Results

The algorithms were modified to suit each machine.
All code was written in C.
Keys are randomly generated 32-bit integers.
Each experiment was repeated 25 times and the results averaged.
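
The measurement procedure described above might look roughly like the following Python sketch. It is not the C harness used in the study: the function name `average_sort_time` is hypothetical, and drawing a fresh set of random keys for every trial is an assumption, since the slide does not say whether the same keys were reused.

```python
import random
import time


def average_sort_time(sort_fn, n, trials=25, seed=0):
    """Average running time of sort_fn over several trials of n random keys."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Assumption: new random 32-bit integer keys for each trial.
        keys = [rng.getrandbits(32) for _ in range(n)]
        start = time.perf_counter()
        sort_fn(keys)
        total += time.perf_counter() - start
    return total / trials
```

For instance, `average_sort_time(sorted, 100_000)` would time Python's built-in sort on 100,000 keys, averaged over 25 runs.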

The MasPar MP1202 – Bitonic Sort

Good speed-ups for all input sizes.

The MasPar MP1202 – Sample Sort

Also good speed-ups.

The MasPar MP1202 – Radix Sort

Shortest time on the MasPar MP1202.

The MasPar MP1202 Comparisons:

The nCUBE 2 – Bitonic Sort

The nCUBE 2 – Sample Sort

The nCUBE 2 – Radix Sort

The nCUBE 2

The Sequent Balance – Bitonic sort

The Sequent Balance – Sample sort

The Sequent Balance – Radix sort

The Sequent Balance

Comparisons and Recommendations

MasPar MP1202: Bitonic Sort for smaller input sizes; Parallel Radix Sort for larger input sizes.
nCUBE 2: Sample Sort is the best.
Sequent Balance: Sample Sort for smaller input sizes; Parallel Radix Sort for larger input sizes.

Questions?