- #1
pc2-brazil
- 205
- 3
Good morning,
I implemented the Binary Insertion Sort algorithm in C++, and now I want to analyse its time complexity. This is the Insertion Sort algorithm, but the linear search technique (which is used for locating the position in which to insert a particular integer) is replaced by a binary search technique.
I will try to do a detailed analysis of the number of comparisons made by the algorithm during its runtime. I should point out that this is not a homework assignment: this is for self-study purposes. I am posting it in this forum because I want to know if my reasoning is correct, if there are any inconsistencies or mistakes. This is a very long post, so I will appreciate if you take your time and have enough patience to analyse it carefully. I tried to be as clear as possible, but also to make the analysis succint (in order not to make this post longer than it could be).
The algorithm is the following:
The variable list is a vector of ints with n elements that need to be sorted (in this case, n = 7, and the list has elements arbitrarily chosen).
There is an outer loop through the list (with j ranging from 1 to n - 1). Inside this outer loop, there are two loops: a while loop for the binary search (which serves to search for the position in which to place the jth element) and a for loop (to swap the positions of the integers and place the jth integer in the correct position).
First, I will start by analysing the complexity of the binary search portion, in terms of comparisons made. Then, I will analyse the complexity of the code that swaps the integers.
Binary search portion
In each iteration of the outer loop, the binary search will be applied to a list containing the elements 0 to j. That is, a list containing j + 1 elements.
I will simplify the analysis by splitting it in two cases: 1) the number of elements in the list (j + 1) is a power of 2 and 2) j + 1 is not a power of two.
First case
In the first case, I will express the number of elements as [itex]j+1=2^k[/itex], where k is a positive integer and [itex]k=\log_2 (j+1)[/itex].
In the while loop of the binary search, the number of elements in the list will be succesively divided by 2 until it is equal to 1. Thus, in this case, because the number of elements is [itex]2^k[/itex], there will be k iterations of the while loop.
In each iteration of this loop, two comparisons are made: one to enter the loop (to verify whether left < right) and one inside the loop (to verify if list[j] > list[m]). That gives 2k comparisons for all iterations. But there is still one more comparison in order to leave the loop (when it is verified that left = right). That gives a total of [itex]2k + 1=2\log (j+1)+1[/itex] comparisons.
As one binary search will be performed for each iteration of the outer loop, this number of comparisons will be repeated with j = 1, 2, 3, ..., n-1. Therefore, the total number of comparisons of the binary search will be
[tex]\sum_{j=1}^{n-1}[2\log (j+1)+1]=2\sum_{j=1}^{n-1}\log (j+1)+\sum_{j=1}^{n-1}1=2(\log2+\log3+...+\log n)+n-1=2\log n!+n-1[/tex]
But [itex]2\log n!+n-1[/itex] is [itex]O(n\log n)[/itex]. Therefore, the time complexity of this binary search portion is [itex]O(n\log n)[/itex].
Second case
In the second case, where the number of elements on which to perform the binary search is not a power of two, I will increase the number of elements in the list to [itex]2^{k+1}[/itex], where [itex]k=\lfloor\log_2 (j+1)\rfloor[/itex]), to make it a power of two and thus simplify the calculations.
This case is much similar to the first case. The number of elements in the list will also be succesively divided by 2 until it is equal to 1. Then, as there are [itex]2^{k+1}[/itex] items in the list, the number of iterations of the loop will be k+1 (it was only k in the other case). As there are two comparisons for each iteration plus one to leave the loop, there will be [itex]2(k+1)+1=2k+3=2\lfloor\log_2 (j+1)\rfloor+3[/itex].
As before, the search will be repeated with j = 1, 2, 3, ..., n-1. Thus:
[tex]\sum_{j=1}^{n-1}[2\lfloor\log_2 (j+1)\rfloor+3]\leq\sum_{j=1}^{n-1}[2\log (j+1)+3]=2\log n!+3(n-1)[/tex]
which is also [itex]O(n\log n)[/itex]. Then, just like the other case, the complexity will also be [itex]O(n\log n)[/itex].
I should point out now that the number of comparisons of the binary search in this algorithm is always [itex]O(n\log n)[/itex]; there is no worst or best case scenario.
Portion that swaps the integers
I am referring to the if statement (if i != j) with a for loop inside, below the comment line that says "//swap".
This loop depends on the variable i, which indicates the new position in which to insert the jth integer.
I will consider two cases: best case scenario and worst case scenario.
The best case is when the list is already sorted (that is, it is in nonincreasing order). If that's so, i will always equal j (because the position of the integer doesn't need to be changed). Thus, the for loop will only verify if i != j, and then it will stop (because i will always equal j, as I said). Because this comparison is performed with j = 1, 2, 3, ..., n-1, (n - 1) comparisons will be performed, and, thus, the complexity of the swap will be O(n).
The worst case is when the list is in strictly decreasing order. In that case, i will always equal zero (because each jth integer will need to be swapped to the beginning of the list). Thus, the for loop will be performed from k = 0 to k = j-1. Thus, j iterations of this loop will be performed. As there is one comparison in the loop (which verifies whether k <= j - i - 1), there will be j comparisons for each j with j = 1, 2, 3, ..., n-1. Of course, I will have to add one more comparison (needed to leave the loop) and the (n - 1) comparisons performed by the if clause (that also appeared in the best case). The total of comparisons due to the loop will be:
[tex]n+\sum_{j=1}^{n-1}j=\frac{(n-1)n}{2}+n[/tex]
(the "n" that I added is because of the comparison needed to leave the loop and the (n - 1) comparisons by the if clause: n - 1 + 1 = n). Thus, in this case, the number of comparisons of the swapping code is O(n²).
Conclusion
The complexity of the binary search is O(nlogn).
The swap, on the other hand, has complexity O(n) on the best case and O(n²) on the worst case.
Thus, in the best case of the Binary Insertion Sort as a whole, its total number of comparisons is O(nlogn) (because nlogn of the binary search grows faster than n of the swap), and, in the worst case, its total number of comparisons is O(n²) (because n² of the swap grows faster than nlogn of the binary search).
I should remark that, in the number of comparisons of this algorithm, I didn't include the comparisons performed by the outer loop (the verifications of whether j < n of the outer for loop), but it wouldn't make difference for the big O estimate.
Thank you in advance.
I implemented the Binary Insertion Sort algorithm in C++, and now I want to analyse its time complexity. This is the Insertion Sort algorithm, but the linear search technique (which is used for locating the position in which to insert a particular integer) is replaced by a binary search technique.
I will try to do a detailed analysis of the number of comparisons made by the algorithm during its runtime. I should point out that this is not a homework assignment: this is for self-study purposes. I am posting it in this forum because I want to know if my reasoning is correct, if there are any inconsistencies or mistakes. This is a very long post, so I will appreciate if you take your time and have enough patience to analyse it carefully. I tried to be as clear as possible, but also to make the analysis succint (in order not to make this post longer than it could be).
The algorithm is the following:
Code:
#include <iostream>
#include <cmath>
using namespace std;
int main() {
int list[] = {3, 1, 4, 888, 999, 45, 3};
int n = 7;
int m;
for (int j = 1; j < n; j++) { //list[j] is the jth integer of the list
int i = 0;
int left, right, m, s;
//binary search
left = 0;
right = j;
while (left < right) {
m = floor((left + right)/2);
if (list[j] > list[m]) left = m + 1; else right = m;
}
i = left; //i is the index of the new position of list[j]
s = list[j]; //save list[j]
//swap
if (i != j){
for (int k = 0; k <= j - i - 1; k++){
list[j-k] = list[j-k-1];}
list[i] = s;
}
}
return 0;
}
There is an outer loop through the list (with j ranging from 1 to n - 1). Inside this outer loop, there are two loops: a while loop for the binary search (which serves to search for the position in which to place the jth element) and a for loop (to swap the positions of the integers and place the jth integer in the correct position).
First, I will start by analysing the complexity of the binary search portion, in terms of comparisons made. Then, I will analyse the complexity of the code that swaps the integers.
Binary search portion
In each iteration of the outer loop, the binary search will be applied to a list containing the elements 0 to j. That is, a list containing j + 1 elements.
I will simplify the analysis by splitting it in two cases: 1) the number of elements in the list (j + 1) is a power of 2 and 2) j + 1 is not a power of two.
First case
In the first case, I will express the number of elements as [itex]j+1=2^k[/itex], where k is a positive integer and [itex]k=\log_2 (j+1)[/itex].
In the while loop of the binary search, the number of elements in the list will be succesively divided by 2 until it is equal to 1. Thus, in this case, because the number of elements is [itex]2^k[/itex], there will be k iterations of the while loop.
In each iteration of this loop, two comparisons are made: one to enter the loop (to verify whether left < right) and one inside the loop (to verify if list[j] > list[m]). That gives 2k comparisons for all iterations. But there is still one more comparison in order to leave the loop (when it is verified that left = right). That gives a total of [itex]2k + 1=2\log (j+1)+1[/itex] comparisons.
As one binary search will be performed for each iteration of the outer loop, this number of comparisons will be repeated with j = 1, 2, 3, ..., n-1. Therefore, the total number of comparisons of the binary search will be
[tex]\sum_{j=1}^{n-1}[2\log (j+1)+1]=2\sum_{j=1}^{n-1}\log (j+1)+\sum_{j=1}^{n-1}1=2(\log2+\log3+...+\log n)+n-1=2\log n!+n-1[/tex]
But [itex]2\log n!+n-1[/itex] is [itex]O(n\log n)[/itex]. Therefore, the time complexity of this binary search portion is [itex]O(n\log n)[/itex].
Second case
In the second case, where the number of elements on which to perform the binary search is not a power of two, I will increase the number of elements in the list to [itex]2^{k+1}[/itex], where [itex]k=\lfloor\log_2 (j+1)\rfloor[/itex]), to make it a power of two and thus simplify the calculations.
This case is much similar to the first case. The number of elements in the list will also be succesively divided by 2 until it is equal to 1. Then, as there are [itex]2^{k+1}[/itex] items in the list, the number of iterations of the loop will be k+1 (it was only k in the other case). As there are two comparisons for each iteration plus one to leave the loop, there will be [itex]2(k+1)+1=2k+3=2\lfloor\log_2 (j+1)\rfloor+3[/itex].
As before, the search will be repeated with j = 1, 2, 3, ..., n-1. Thus:
[tex]\sum_{j=1}^{n-1}[2\lfloor\log_2 (j+1)\rfloor+3]\leq\sum_{j=1}^{n-1}[2\log (j+1)+3]=2\log n!+3(n-1)[/tex]
which is also [itex]O(n\log n)[/itex]. Then, just like the other case, the complexity will also be [itex]O(n\log n)[/itex].
I should point out now that the number of comparisons of the binary search in this algorithm is always [itex]O(n\log n)[/itex]; there is no worst or best case scenario.
Portion that swaps the integers
I am referring to the if statement (if i != j) with a for loop inside, below the comment line that says "//swap".
This loop depends on the variable i, which indicates the new position in which to insert the jth integer.
I will consider two cases: best case scenario and worst case scenario.
The best case is when the list is already sorted (that is, it is in nonincreasing order). If that's so, i will always equal j (because the position of the integer doesn't need to be changed). Thus, the for loop will only verify if i != j, and then it will stop (because i will always equal j, as I said). Because this comparison is performed with j = 1, 2, 3, ..., n-1, (n - 1) comparisons will be performed, and, thus, the complexity of the swap will be O(n).
The worst case is when the list is in strictly decreasing order. In that case, i will always equal zero (because each jth integer will need to be swapped to the beginning of the list). Thus, the for loop will be performed from k = 0 to k = j-1. Thus, j iterations of this loop will be performed. As there is one comparison in the loop (which verifies whether k <= j - i - 1), there will be j comparisons for each j with j = 1, 2, 3, ..., n-1. Of course, I will have to add one more comparison (needed to leave the loop) and the (n - 1) comparisons performed by the if clause (that also appeared in the best case). The total of comparisons due to the loop will be:
[tex]n+\sum_{j=1}^{n-1}j=\frac{(n-1)n}{2}+n[/tex]
(the "n" that I added is because of the comparison needed to leave the loop and the (n - 1) comparisons by the if clause: n - 1 + 1 = n). Thus, in this case, the number of comparisons of the swapping code is O(n²).
Conclusion
The complexity of the binary search is O(nlogn).
The swap, on the other hand, has complexity O(n) on the best case and O(n²) on the worst case.
Thus, in the best case of the Binary Insertion Sort as a whole, its total number of comparisons is O(nlogn) (because nlogn of the binary search grows faster than n of the swap), and, in the worst case, its total number of comparisons is O(n²) (because n² of the swap grows faster than nlogn of the binary search).
I should remark that, in the number of comparisons of this algorithm, I didn't include the comparisons performed by the outer loop (the verifications of whether j < n of the outer for loop), but it wouldn't make difference for the big O estimate.
Thank you in advance.