dual pivot quick sort

Upload: coeus-apollo

Post on 02-Jun-2018

218 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/10/2019 Dual Pivot Quick Sort

    1/11

    Dual-Pivot Quicksort

    Vladimir [email protected]

    First revision: February 16, 2009

    Last updated: September 22, 2009

    Introduction

    Sorting data is one of the most fundamental problems in Computer Science,especially if the arranging objects are primitive ones, such as integers, bytes, floats, etc.Since sorting methods play an important role in the operation of computers and other data

    processing systems, there has been an interest in seeking new algorithms better than theexisting ones. We compare sorting methods by the number of the most expensiveoperations, which influence on effectiveness of the sorting techni!ues, " comparisonsand swaps. #uicksort algorithm is an effective and wide$spread sorting procedure with

    C*n*ln(n)operations, where n is the si%e of the arranged array. &he problem is to find analgorithm with the least coefficient C. &here were many attempts to improve the classicalvariant of the #uicksort algorithm'

    (. )ick an element, called apivot, from the array.

    *. +eorder the array so that all elements, which are less than the pivot, come before thepivot and all elements greater than the pivot come after it e!ual values can goeither way-. fter this partitioning, the pivot element is in its final position.

    /. +ecursively sort the sub$array of lesser elements and the sub$array of greater elements.

    0oare, 0elman, 1nuth, Sedgewick, 2entley and other scientists workedmostly on the effectiveness of the concrete divide and con!uer algorithmimplementations, or tried to increase performance due to the specific choice of the pivot

    element, but all of them used the classical partitioning scheme with twoparts.

    We can show that using twopivot elements or partitioning to threeparts- ismore effective, especially on large arrays. We suggest the new Dual-Pivot Quicksortscheme, faster than the known implementations, which improves this situation. &heimplementation of the 3ual$)ivot #uicksort algorithm has been investigated on differentinputs and primitive data types. Its advantages in comparison with one of the mosteffective known implementations of the classical #uicksort algorithm 4(5, 4*5, and

    implementation of #uicksort in 6317 8.9 platform 4/5 have been shown.

    :ew 3ual$)ivot #uicksort algorithm

    &he new #uicksort algorithm uses partitioning a source array T[] a,

    where T is primitive type such as int, float, byte, char, double, long and short-, to three

    parts defined by two pivot elements P1and P2and therefore, there are three pointers L,

    K, !and le"t and ri#ht" indices of the first and last elements respectively- shown in;igure ('

    1

    mailto:[email protected]:[email protected]
  • 8/10/2019 Dual Pivot Quick Sort

    2/11

    ;igure (.

    P1 < P1 P1

  • 8/10/2019 Dual Pivot Quick Sort

    3/11

    (- CFn B n$(- G (Hn sumFJkB9KLJn$(K JCFk G CFJn$k$(KK

    and last sum can be rewritten'

    *- CFn B n$(- G *Hn sumFJkB9KLJn$(K CFk

    Write formula *- for nG('

    /- CFJnG(K B n G *HnG(- sumFJkB9KLJnK CFk

    Aultiply *- by n and /- by nG(- and subtract one from other, we have'

  • 8/10/2019 Dual Pivot Quick Sort

    4/11

    &o find the approximation of the average number of swaps, we use the similar approachas in the case of comparisons. &he average number of swaps SFn as a function of thenumber of elements may be represented by the e!uation'

    (9- SFn B (H*n$(- G *Hn sumFJkB9KLJn$(K SFk

    We assume that average number of swaps during one iteration is (H*n$(-. It means thatin average one half of elements is swapped only. Osing the same approach, we find thatthe coefficient e!uals to (. &herefore, the function SFn may be approximated by thee!uation'

    ((- SFn B nlnn-

    :ow consider the 3ual$)ivot #uicksort scheme and find the average number ofcomparisons and swaps for it. We assume that input data is random permutation of nnumbers from the range 4(..n5.

    3ual$)ivot #uicksortBBBBBBBBBBBBBBBB(. Choose * pivot elements pivot( and pivot* take random-,*. pivot( must be less or e!ual than pivot*, otherwise they are swapped/. Compare each n$*- elements with the pivots and swap it, if necessary, to have the

    partitions' 4 B p( D p( B Q B p* D EB p* 5

    =. Sort recursively left, center and right parts.

    ;rom the algorithm above, the average number of comparisons CFn as a function of thenumber of elements may be represented by the e!uation'

    (- CFn B ( G *Hnn$(-- sumFJiB9KLJn$*K sumFJjBiG(KLJn$(K JCFi G (i G CFJj$i$(K G *j$i$(- G CFJn$j$(K G *n$j$(-K

    R!uation (- means that total number is the sum of the comparison numbers of all casesof partitions into / parts plus number of comparisons for elements from left part one

    comparison-, center and right parts * comparisons-.

    We will show that *Hnn$(-- sumFJiB9KLJn$*K sumFJjBiG(KLJn$(K Ji G *j$i$ (- G*n$j$(-K e!uals to =H/ n$*-. etNs consider the double sum'

    sumFJiB9KLJn$*K sumFJjBiG(KLJn$(K Ji G *j$i$ (- G *n$j$(-K B sumFJiB9KLJn$*KsumFJjBiG(KLJn$(K J*n $

  • 8/10/2019 Dual Pivot Quick Sort

    5/11

    $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$here we use the property' sumFJkB(KJnK JkL*K B nL/H/ G nL*H* G nH8$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

    B *n$*-n$(-L* $ n$(-n$*-H*- $ n$(-n$(-n$*-H* G n$*-L/H/ G n$*-L*H* G n$*-H8 B (H8 (*n$(-L*n$*- $ 8n$(-n$*-L* $ /n$(-L*n$*- G *n$*-L/ G /n$*-L* G n$*-- B (H8 /n$(-n$*-/n$(- $ *n$*-- G n$*-*n$*-L* G /n$*- G(--- B (H8 /n$(-n$*-nG(- G n$*-*nL* $ =n G /-- B(H8 n$*-=nL* $ =n-B =H8 nn$(-n$*-

    Substitute the result into e!uation (-'

    CFn B ( G *Hnn$(-- =H8 nn$(-n$*- G *Hnn$(-- sumFJiB9KLJn$*K

    sumFJjBiG(KLJn$(K JCFi G CFJj$i$(K G CFJn$j$(KK

    or

    *- CFn B ( G =H/n$*- G *Hnn$(-- sumFJiB9KLJn$*K sumFJjBiG(KLJn$(K JCFi GCFJj$i$(K G CFJn$j$(KK

    &he double sum in e!uation *- can be reduced'

    /- CFn B ( G =H/n$*- G *Hnn$(-- sumFJkB9KLJn$*K J/ n$k$(- CFkK

    3enote ( G =H/n$*- by fFn and multiply to nn$(-'

  • 8/10/2019 Dual Pivot Quick Sort

    6/11

    Subtract ?- from @-, we have'

    P- TFJnG(K $ TFn B ;FJnG(K $ ;Fn G 8CFn

    +esolving of ;FJnG(K $ ;Fn gives'

    (9- TFJnG(K $ TFn B * G (9n G 8CFn

    &he function TFn is substituted into e!uation (9-, which yields the following e!uation'

    ((- nG(-nG*-CFJnG*K $*nnG(-CFJnG(K G nn$(- $ 8-CFn B * G (9n

    We will find the function CFn approximated by the e!uation'

    (*- CFn B nlnn-

    &he function CFn is substituted from e!uation (*- into e!uation ((-, which yields thefollowing e!uation'

    (/- nL/G=nL*G@nG

  • 8/10/2019 Dual Pivot Quick Sort

    7/11

    the coefficient e!uals to 9.@. &herefore, the function SFn may be approximated by thee!uation'

    ((- SFn B 9.@nlnn-

    nd as summary' the value of the coefficient '

    dual$pivot classiccomparison' *.9 *.9 swap' 9.@ (.9

    Comparison and summary

    &he new 3ual$)ivot #uicksort algorithm provides the following advantages'

    While sorting primitive objects, it is more efficient to use partitioning of unsortedarray to / parts instead of the usage of the classical approach. &he more the si%e of thearray to be sorted, the more efficiently the new algorithm works in comparison withthe classical #uicksort 4*5 and the #uicksort implemented in 631 8.9 platform 4/5.;or example, we provided the following numerical experiment' * 999 999 randominteger elements were sorted =9 times using the new 3ual$)ivot #uicksort, algorithms4*5 and 4/5, and analy%ed the calculation time. It took (8.=, (@.P, and *9./ seconds

    respectively. &he implementation of the new 3ual$)ivot #uicksort algorithm for

    integers can be easy adjusted for another numeric, string and comparable types. &he suggested 3ual$)ivot #uicksort algorithm also works !uicker than the classical

    schemes on the arranged arrays or the arrays with repeated elements. In these cases ofnonrandom inputs the time metric for the 3ual$)ivot #uicksort algorithm is == against(99 for #uicksort implemented in 631 8.9 platform, where all tests have been run on.

    &he supposed algorithm has additional improvement of the special choice procedure

    for pivot elements P1and P2. We take not the first a[le"t]and last a[ri#ht] elementsof the array, but choose two middle elements from = middle elements. &he describedmodification does not make worse the properties of the 3ual$)ivot #uicksortalgorithm in the case of random source data.

    We can summari%e the features of the suggested algorithm as follows' &ime savings. &he divide and con!uer strategy. Osage of two pivot elements instead of one. Aore effective sorting procedure that can be used in numerical analysis. &he 3ual$)ivot #uicksort algorithm could be recommended in the next 631 releases.

    7

  • 8/10/2019 Dual Pivot Quick Sort

    8/11

    Implementation in 6ava 7 programming language

    /*** @author Vladimir Yaroslavskiy* @version 2009.09.17 m765.817

    */u!li" "lass #ual$ivot%ui"ksort817 &

    u!li" stati" void sort'int() a & sort'a+ 0+ a.len,th-

    u!li" stati" void sort'int() a+ int romnde+ int tonde & ran,ehe"k'a.len,th+ romnde+ tonde- dual$ivot%ui"ksort'a+ romnde+ tonde 3 1-

    rivate stati" void ran,ehe"k'int len,th+ int romnde+ int tonde & i 'romnde 4 tonde &

    thro ne lle,alr,ument"etion'romnde' romnde 4 tonde' tonde - i 'romnde : 0 & thro ne rraynde;ut;

  • 8/10/2019 Dual Pivot Quick Sort

    9/11

    i 'a(mF) 4 a(m5) & = a(mF)- a(mF) = a(m5)- a(m5) = -

    // ivotsG ( : ivot1 H ivot1 := DD := ivot2 H 4 ivot2 ) int ivot1 = a(m2)- int ivot2 = a(mF)-

    !oolean di$ivots = ivot1 I= ivot2- a(m2) = a(let)- a(mF) = a(ri,ht)-

    // "enter art ointers int less = let 1- int ,reat = ri,ht 3 1-

    // sortin, i 'di$ivots & or 'int k = less- k := ,reat- k & = a(k)-

    i ' : ivot1 & a(k) = a(less)- a(less) = -

    else i ' 4 ivot2 & hile 'a(,reat) 4 ivot2 DD k : ,reat & ,reat33- a(k) = a(,reat)- a(,reat33) = - = a(k)-

    i ' : ivot1 & a(k) = a(less)- a(less) = - else & or 'int k = less- k := ,reat- k & = a(k)-

    i ' == ivot1 & "ontinue-

    i ' : ivot1 & a(k) = a(less)- a(less) = - else & hile 'a(,reat) 4 ivot2 DD k : ,reat & ,reat33- a(k) = a(,reat)- a(,reat33) = - = a(k)-

    i ' : ivot1 & a(k) = a(less)- a(less) = -

    9

  • 8/10/2019 Dual Pivot Quick Sort

    10/11

    // sa a(let) = a(less 3 1)- a(less 3 1) = ivot1-

    a(ri,ht) = a(,reat 1)- a(,reat 1) = ivot2-

    // let and ri,ht arts dual$ivot%ui"ksort'a+ let+ less 3 2- dual$ivot%ui"ksort'a+ ,reat 2+ ri,ht-

    // eJual elements i ',reat 3 less 4 len 3 #A>AB DD di$ivots & or 'int k = less- k := ,reat- k & = a(k)-

    i ' == ivot1 & a(k) = a(less)- a(less) = - else i ' == ivot2 & a(k) = a(,reat)- a(,reat33) = - = a(k)-

    i ' == ivot1 & a(k) = a(less)- a(less) = - // "enter art i 'di$ivots & dual$ivot%ui"ksort'a+ less+ ,reat-

    rivate stati" inal int #A>AB = 1E- rivate stati" inal int >?YAB = 17-

    Rxperimental results

    &he a lot of tests have been run, here is the time of executions'

    Server UA'http://spreadsheets.google.com/pub?key=t_EAWUkQ4mD3!b"#$%a&AQ'output=html

    Client UA'http://spreadsheets.google.com/pub?key=td()o$*le+*d,3-U"bc0Q's(-gle=true'g(d=0'output=html

    10

    http://spreadsheets.google.com/pub?key=t_EAWUkQ4mD3BIbOv8Fa-AQ&output=htmlhttp://spreadsheets.google.com/pub?key=tdiMo8xleTxd23nKUObcz0Q&single=true&gid=0&output=htmlhttp://spreadsheets.google.com/pub?key=tdiMo8xleTxd23nKUObcz0Q&single=true&gid=0&output=htmlhttp://spreadsheets.google.com/pub?key=t_EAWUkQ4mD3BIbOv8Fa-AQ&output=htmlhttp://spreadsheets.google.com/pub?key=tdiMo8xleTxd23nKUObcz0Q&single=true&gid=0&output=htmlhttp://spreadsheets.google.com/pub?key=tdiMo8xleTxd23nKUObcz0Q&single=true&gid=0&output=html
  • 8/10/2019 Dual Pivot Quick Sort

    11/11

    +eferences

    1. Classical #uicksort' http://en.wikipedia.org/wiki/Quik!ort

    2. Sample'http://www.ro!eindia.net/"a#a/$eginner!/arra%e&a'p(e!/Quik)ort.!ht'(

    /. 6on . 2entley, A. 3ouglas AcIlroy. Rngineering a Sort ;unction.

    SoftwareV)ractice and Rxperience, Uol.*/ ((-, ).(*