Merge sort
My goal here is to give you a deeper understanding of how merge sort works so that the only thing you have to worry about is the syntax of your preferred language.
Merge sort is a divider-and-conquer algorithm where you divide a problem into smaller units to ease the effort needed to solve the problem. For example, if you are scheduled for a fight with five people all at once next week and recover their titles, you can go to each opponent's house, win them, take the title and present it on the day the fight was scheduled.
The main steps in merge sort
Divide the array into two halves
Recursively sort both halves ( You call the recursive method )
Merge the result ( You call the merge method )
Understanding
For easy understanding, I will be representing the full process with a tree that shows what is happening and the size of the array. Here the processes indicated by the blue ink happen first before the ones in black. Also, the first that is completed is the leftmost part which results in two separate arrays with 3 and 2 respectively.
NOTE: If you want to know how to draw the tree below for other recursive algorithms do let me know in the comment section.
In the image above, we are given an array [3, 2, 5, 7, 4, 1]. Notice the following points in the diagram above.
Recursion is only used to break the array into two parts till the smallest array is obtained while the merge method rearranges ( sorts ) the elements into a new array that is then sent up to be compared with the other half.
The three steps mentioned above will run for each section that the array is divided into and this division of the array will not stop until we have arrays that only contain one element. This implies that the first array is divided and the left half is picked and divided further into two.
So the process continues on the leftmost branch until we have an array that has two elements which at this point on further division, we will have two arrays with only one element. This is when the merging function comes in and combines the two arrays ensuring that the first element comes first and the second next. This is where the actual sorting happens.
Once this is done, the newly sorted array is sent up as the left array. This is when the process in step two is repeated for the right tree until a sorted array is ready and returned to the top.
At this point, we now have two sorted arrays and the merge method is called. Calling the merge method on each pair of sorted arrays continues until the very top of the tree is reached thus giving us a fully sorted array
Full code
The code below is the full implementation in Java. PLEASE try to understand the code as much as you can based on the explanation above before looking at the breakdown. This will help you identify things you thought you understood but you got it the wrong way.
package MergeSort;
import java.util.Arrays;
public class MergeSort {
public static void main (String[] args){
int[] arr = {2, 1, 5, 8, 3, 7};
int[] sortedArr = mergeSort(arr);
System.out.println(Arrays.toString(sortedArr));
}
static int[] mergeSort (int[] arr) {
if(arr.length == 1){
return arr;
}
int mid = arr.length / 2;
int[] left = mergeSort(Arrays.copyOfRange(arr, 0, mid));
int[] right = mergeSort(Arrays.copyOfRange(arr, mid, arr.length));
return merge(left, right);
}
static int[] merge(int[] left, int[] right){
int[] mix = new int[left.length + right.length];
int i = 0;
int j = 0;
int k = 0;
//while none of the array has not been exhausted,
while(i < left.length && j < right.length){
if(left[i] > right[j]){
mix[k] = right[j];
++j;
}else{
mix[k] = left[i];
++i;
}
++k;
}
//if the right array is empty and there are still some elements in the left array.
while(i < left.length){
mix[k] = left[i];
++i;
++k;
}
//if the left array is empty and there are still some elements in the right array.
while(j < right.length){
mix[k] = right[j];
++j;
++k;
}
return mix;
}
NOTE: Recursion is only used to break the array while a while loop is used to sort elements and merge the arrays. Therefore you will not use recursion while merging.
Break down
The first method after the main method is the mergeSort. This is where we keep making recursive calls until we have arrays with only one element.
static int[] mergeSort (int[] arr) { //base case if(arr.length == 1){ return arr; } //divide the array int mid = arr.length / 2; int[] left = mergeSort(Arrays.copyOfRange(arr, 0, mid)); int[] right = mergeSort(Arrays.copyOfRange(arr, mid, arr.length)); return merge(left, right); }
Once the base case is hit, the left array is stored in a variable called
left
and the right array is stored in a variable calledright
. Both are passed into the merge method where the sorting happens.The next thing is the initialization of variables.
int[] mix = new int[left.length + right.length]; int i = 0; int j = 0; int k = 0;
Here an array is initialized that can hold the combination of both arrays. That is, it has the size of the array above it on the tree diagram ( check diagram). Then we have variables
i
to keep track ofleft
array,j
to keep track ofright
array andk
to keep track ofmix
array.The next thing is a while loop that does the actual sorting.
//while none of the array has not been exhausted, while(i < left.length && j < right.length){ if(left[i] > right[j]){ mix[k] = right[j]; ++j; }else{ mix[k] = left[i]; ++i; } ++k; }
The while loop above says "While none of the arrays (
left
orright
) is finished, check if the first element in theleft
array is greater than the first element in theright
array. If so, take the smaller element (the first element inright
array) and set it as the first element in themix
array. otherwise, it means that the first element inleft
array is smaller and thus inserted intomix
array".Notice that in each block, the counters for the array that has just been used are increased by one so that the next time we come around, we will be looking at another element. After that, the mix array counter is increased by one to ensure that we fix the incoming element into a position that has not been used already.
The next thing is two while loops that ensure that once any half of the array is exhausted, the remaining elements in the other half are copied to the
mix
array.//if the right array is empty and there are still // some elements in the left array. while(i < left.length){ mix[k] = left[i]; ++i; ++k; } //if the left array is empty and there are still some //elements in the right array. while(j < right.length){ mix[k] = right[j]; ++j; ++k; }
- Lastly, we return the
mix
array which at this point contains all the elements in sorted order
- Lastly, we return the
return mix;
Big O —> O( NlogN )
The time complexity of the merge sort algorithm is O(nlogn), where n is the number of elements in the array being sorted.
Here's a brief explanation:
Divide Phase:
- The array is recursively divided into halves until each subarray contains a single element. This division takes O(logn) time because it essentially divides the array into 2k subarrays, where k is the number of recursive calls.
Conquer Phase (Merging):
- The merging of the subarrays takes O(n) time in the worst case. In each merging step, every element in the array is processed once.
The recurrence relation for the time complexity of merge sort can be expressed as:
T(n)=2T(2n)+O(n)**
Here the coefficient of T comes from the fact that for a complete round in the process, the will have two different arrays to merged and this merging will involve the full length of the initially provided array.
Solving this recurrence relation using the Master Theorem, we find that the time complexity is O(nlogn).
Merge sort has a consistent O(nlogn) time complexity regardless of the initial ordering of elements, making it a reliable choice for sorting large datasets. This means that the upper bound and the lower bound of an algorithm are the same.
However, it does require additional space for the temporary arrays during the merging phase, so the space complexity is O(n).
If you understood merge sort as explained here please leave a comment and if there are any improvement suggestions, please do let me know. Thank you for your time and don't forget to follow.