merge sort analysis and its real time applications
TRANSCRIPT
1
Merge Sort Analysis and its Real-Time
Applications GROUP-4BE-3RD YEAR, SEM-5TH
Sarvajanik College of Engineering & Technology
Department of Computer Engineering (Shift-1)
2
140420107014 DUDHWALA FRANCY KALPESH 140420107015 DUMASIA YAZAD HOMYAR 140420107016 GAJJAR KARAN DEVENDRABHAI 140420107018 GOTAWALA VATSAL YAGNESH .
3
Introduction to sortingIntroduction to merge sort algorithm Complexity analysis of merge sort
algorithmReal-time application
CONTENTS : A GLIMPSE OF WHAT IS
TO COME
4Introduction to sorting Sorting refers to arranging a set of data in some logical order. For ex. A telephone directory can be considered as a list where each
record has three fields - name, address and phone number. Being unique, phone number can work as a key to locate any record in
the list. Sorting is among the most basic problems in algorithm design. We are given a sequence of items, each associated with a given key
value. And the problem is to rearrange the items so that they are in an increasing(or decreasing) order by key.
The methods of sorting can be divided into two categories:Internal SortingExternal Sorting
5Internal Sorting
If all the data that is to be sorted can be adjusted at a time in main memory, then internal sorting methods are used
•External SortingWhen the data to be sorted can’t be accommodated in the memory at the same time and some has to be kept in auxiliary memory, then external sorting methods are used.
NOTE: We will only consider External sorting
6Stable and Not Stable Sorting If a sorting algorithm, after sorting the contents, does not change the sequence of
similar content in which they appear, it is called stable sorting.
If a sorting algorithm, after sorting the contents, changes the sequence of similar content in which they appear, it is called unstable sorting.
7Efficiency of Sorting Algorithm The complexity of a sorting algorithm measures the running time of a
function in which n number of items are to be sorted. The choice of sorting method depends on efficiency considerations for
different problems. Tree most important of these considerations are:
The length of time spent by programmer in coding a particular sorting program. Amount of machine time necessary for running the program. The amount of memory necessary for running the program.
8Efficiency of Sorting Algorithm Various sorting methods are analyzed in the cases like – best case,
worst case or average case. Most of the sort methods we consider have requirements that range from
O(n logn) to O(n2). A sort should not be selected only because its sorting time is 0(nlogn);
the relation of the file size n and the other factors affecting the actual sorting time must be considered.
Determining the time requirement of sorting technique is to actually run the program and measure its efficiency.
Once a particular sorting technique is selected the need is to make the program as efficient as possible.
Any improvement in sorting time significantly affect the overall efficiency and saves a great deal of computer time.
9Efficiency of Sorting Algorithm
Space constraints are usually less important than time considerations.
The reason for this can be, as for most sorting programs, the amount of space needed is closer to 0(n) than to 0(n2)
The second reason is that, if more space is required, it can almost always be found in auxiliary storage.
10Introduction to merge sort algorithm Divide-and-conquer, breaks a problem into sub problems that are similar
to the original problem, recursively solves the sub problems, and finally combines the solutions to the sub problems to solve the original problem.
Think of a divide-and-conquer algorithm as having three parts: Divide the problem into a number of sub-problems that are smaller instances of the
same problem. Conquer the sub-problems by solving them recursively. If they are small enough, solve
the sub-problems as base cases. Combine the solutions to the sub-problems into the solution for the original problem. Because we're using divide-and-conquer to sort, we need to decide what our sub
problems are going to be.
Divide-and-conquer algorithms
11Problem
divideSub problem
Sub problem
Conquer Solve sub-problem
Solve sub-problem
Solutionto
Sub-problem
Solutionto
Sub-problem
Solution to problem
Combine
12 Merge sort algorithm
Merge sort is a sorting technique based on divide and conquer technique that was invented by John von Neumann in 1945.
Merge sort work on Two basic principle :• Sorting smaller list is faster than sorting larger list.• Combining two sorted sub lists is faster than
of two unsorted list.
13Working of merge sort
Merge sort works in three stage:• Divide : Merge sort first divides the list into equal halves and
then combines them in a sorted manner.• Conquer : Then sort the sub-lists• Combine : After sorting merge all sub-lists into single list.
1418 26 32 6 43 15 9 1
18 26 32 6 43 15 9 1
18 26 32 6 43 15 9 1
2618 6 32 1543 1 9
18 26 32 6 43 15 9 1
18 26 32 6 43 15 9 1
18 26 32 6 15 43 1 9
6 18 26 32 1 9 15 43
1 6 9 15 18 26 32 43
18 26
18 26
18 26
32
32
6
6
32 6
18 26 32 6
43
43
15
15
43 15
9
9
1
1
9 1
43 15 9 1
18 26 32 6 43 15 9 1
18
26 6 32
6 26 3218
15
43 1 9
1 9 15 43
1 6 9 1518 26 32 43
Original Sequence Sorted SequenceWorking of merge sort
15
43 15 9 1 22 26 19 55 37 43 99 2
18 26 32 6 43 15 9 1 22 26 19 55 37 43 99 2
18 26 32 6 43 15 9 1 22 26 19 55 37 43 99 2
18 26 32 6 43 15 9 1 22 26 19 55 37 43 99 2
18 26 32 6 43 15 9 1 22 26 19 55 37 43 99 2
18 26 32 6
16kn
kn/2 kn/2
kn/4 kn/4 kn/4 kn/4
1 1 1 1 1 1 1 1
kn
kn
kn
kn
Log2 n
Total : Log2 n + kn
17Algorithm for merge sort :
Algorithm MERGE_SORT(A,1,n)
//A is array of size n
if low < high then mid floor ( low + high ) / 2 MERGE_SORT(A , low , mid) MERGE_SORT(A , mid+1 , high) COMBINE(A , low, mid, high)end
Algorithm COMBINE(A , low , mid , high)
L1 mid – low +1 L2 high – midfor i 1 to L1 do LEFT[i] A [ low + i -1 ]endfor j 1 to L2 do RIGHT[ i ] A[ mid + j ]endLEFT [ L1 + 1] ∞RIGHT [ L2 + 1] ∞
i 1 , j 1for k low to high do if LEFT [ i ] RIGHT [ j ] then A[ k ] LEFT [ i ] i i +1 else A [] RIGHT [] j = j + 1 endendend
18Ex: sorting the list using merge sort algorithm.
33 22 44 00 99 88 110 1 42 653
33 22 44 00 99 88 11
1 11
33 22 44 00
33 22 44 00
22 33 00 44
00 22 33 44
99 88 11
8899
88 99
11
11
11 88 99
33 22 44 00 99 88 11
2 6
3
10
5
4 7
9
8
12
17
16
18
13
15
14
p r
19Merge sort works as follow:
MERGE_SORT(A , p , r):if p<r then, q=( r + p)/2
MERGE_SORT(A , p , q) MERGE_SORT(A , q+1 , r) COMBINE(A, p , q , r)
end Step- 0 : MERGE_SORT(A , 0 , 6)Step- 1 : MERGE_SORT(A , 0 , 3) Step-11 : MERGE_SORT(A , 4 , 6)Step- 2 : MERGE_SORT(A , 0 , 1) Step-12 : MERGE_SORT(A , 4 , 5)Step- 3 : MERGE_SORT(A , 0 , 0) Step-13 : MERGE_SORT(A , 4 , 4)Step- 4 : MERGE_SORT(A , 1 , 1) Step-14 : MERGE_SORT(A , 5 , 5)Step- 5 : COMBINE(A , 0 , 0 , 1) Step-15 : COMBINE(A , 4 , 4 , 4 ,
5)Step- 6 : MERGE_SORT(A , 2 , 3) Step-16 : MERGE_SORT(A , 6 , 6)Step-7 : MERGE_SORT(A , 2, 2) Step-17 : COMBINE(A , 4 , 5 , 6)Step-8 : MERGE_SORT(A , 3 , 3) Step-18 : COMBINE(A , 0 , 3 ,6)Step-9 : COMBINE(A , 2 , 2 , 3)Step-10 : COMBINE(A , 0 , 1 , 3)
20Example base on previous method to sorting the list using merge sort algorithm.
21Complexity analysis of merge sort algorithm
Divide : This step computes the middle of the sub array, which takes constant time . Thus, D(n) = θ(1).Conquer : We recursively sole two sub problems, each of size (n/2) , which contributes 2T(n/2) to the running time.Combine: Combine procedure on an n-element sub array takes times θ(n).so C(n) = θ(n).
T(n)=0 ,if n <=1
T(n/2) + T(n/2) + D(n) + C(n) , else
22T(n) = 2 * T(n/2) + θ(n) + θ(1)T(n) = 2 * T(n/2) + nT(n/2) = 2 * ( 2 * T(n/4) + n/2 )T(n) = 2 * ( 2 * T(n/2) + n/2 ) + n
= 22 * T(n/ 22) + 2n....=2k T(n/ 2k)+ kn
Suppose, n = 2k so k = log2 nT(2k) = n* T(n/n) + log2 n * n
=n *T(1) +n* log2 n But T(1)=0
T(n)= O(n* log2 n )
23
Weather list is already sorted , inverse sorted or randomly distributed , all three steps must be performed.Merge sort cannot identify if list is sorted. So numbers of comparisons are same for 3 case.
Worst case of merge sort takes les time compared to insertion sort , selection sort & bubble sort .However , best case of insertion sort (O(n)) beats all three cases of merge sort(O(nlog2n))
Best case Average case Worst caseO(nlog2n) O(nlog2n) O(nlog2n)
24Property of merge sort :
• Not Adaptive : Running time doesn’t change with data pattern.
• Stable/ Unstable : Both implementations are possible .• Not Incremental : Does not sort one by one element in
each pass.• Not online : Need all data to be in memory at the time
of sorting.• Not in place : It need O(n) extra space to sort two sub
list of size(n/2).
25Real-time application The e-commerce applicationHave you ever noticed on any e-commerce website, they have this section of "You might like", they have maintained an array for all the user accounts and then whichever has the least number of inversion with your array of choices, they start recommending what they have bought or they like. I am not going into the time and space complexity details of the algorithm. Obviously there are a lot of ways of doing this and this is one of them.
And some e-commerce website like policybazaar.com , trivago.com etc. use there search engine for collecting data from other website to provide minimum cost of booking hotel room or purchasing product .
26
Thank you