Analysis of the Binary Insertion Sort algorithm

In summary, the conversation discusses the implementation of the Binary Insertion Sort algorithm in C++ and the analysis of its time complexity. The algorithm involves using a binary search technique to locate the position in which to insert a particular integer, and then swapping the positions of the integers to place the integer in the correct position. The time complexity of the binary search portion is O(n log n), with two cases being considered: when the number of elements is a power of two and when it is not. The overall time complexity of the algorithm is also O(n log n).
  • #1
pc2-brazil
205
3
Good morning,

I implemented the Binary Insertion Sort algorithm in C++, and now I want to analyse its time complexity. This is the Insertion Sort algorithm, but the linear search technique (which is used for locating the position in which to insert a particular integer) is replaced by a binary search technique.
I will try to do a detailed analysis of the number of comparisons made by the algorithm during its runtime. I should point out that this is not a homework assignment: this is for self-study purposes. I am posting it in this forum because I want to know if my reasoning is correct, if there are any inconsistencies or mistakes. This is a very long post, so I will appreciate if you take your time and have enough patience to analyse it carefully. I tried to be as clear as possible, but also to make the analysis succint (in order not to make this post longer than it could be).
The algorithm is the following:
Code:
#include <iostream>
#include <cmath>
using namespace std;
int main() {
    int list[] = {3, 1, 4, 888, 999, 45, 3};
    int n = 7;
    int m;
    for (int j = 1; j < n; j++) { //list[j] is the jth integer of the list
        int i = 0;
        int left, right, m, s;
        
        //binary search
        left = 0;
        right = j;
        while (left < right) {
              m = floor((left + right)/2);
              if (list[j] > list[m]) left = m + 1; else right = m;
        }
        i = left; //i is the index of the new position of list[j]
        s = list[j]; //save list[j]

        //swap
        if (i != j){
              for (int k = 0; k <= j - i - 1; k++){
                  list[j-k] = list[j-k-1];}
              list[i] = s;
        }
    }
    return 0;
}
The variable list is a vector of ints with n elements that need to be sorted (in this case, n = 7, and the list has elements arbitrarily chosen).
There is an outer loop through the list (with j ranging from 1 to n - 1). Inside this outer loop, there are two loops: a while loop for the binary search (which serves to search for the position in which to place the jth element) and a for loop (to swap the positions of the integers and place the jth integer in the correct position).
First, I will start by analysing the complexity of the binary search portion, in terms of comparisons made. Then, I will analyse the complexity of the code that swaps the integers.
Binary search portion
In each iteration of the outer loop, the binary search will be applied to a list containing the elements 0 to j. That is, a list containing j + 1 elements.
I will simplify the analysis by splitting it in two cases: 1) the number of elements in the list (j + 1) is a power of 2 and 2) j + 1 is not a power of two.
First case
In the first case, I will express the number of elements as [itex]j+1=2^k[/itex], where k is a positive integer and [itex]k=\log_2 (j+1)[/itex].
In the while loop of the binary search, the number of elements in the list will be succesively divided by 2 until it is equal to 1. Thus, in this case, because the number of elements is [itex]2^k[/itex], there will be k iterations of the while loop.
In each iteration of this loop, two comparisons are made: one to enter the loop (to verify whether left < right) and one inside the loop (to verify if list[j] > list[m]). That gives 2k comparisons for all iterations. But there is still one more comparison in order to leave the loop (when it is verified that left = right). That gives a total of [itex]2k + 1=2\log (j+1)+1[/itex] comparisons.
As one binary search will be performed for each iteration of the outer loop, this number of comparisons will be repeated with j = 1, 2, 3, ..., n-1. Therefore, the total number of comparisons of the binary search will be
[tex]\sum_{j=1}^{n-1}[2\log (j+1)+1]=2\sum_{j=1}^{n-1}\log (j+1)+\sum_{j=1}^{n-1}1=2(\log2+\log3+...+\log n)+n-1=2\log n!+n-1[/tex]
But [itex]2\log n!+n-1[/itex] is [itex]O(n\log n)[/itex]. Therefore, the time complexity of this binary search portion is [itex]O(n\log n)[/itex].
Second case
In the second case, where the number of elements on which to perform the binary search is not a power of two, I will increase the number of elements in the list to [itex]2^{k+1}[/itex], where [itex]k=\lfloor\log_2 (j+1)\rfloor[/itex]), to make it a power of two and thus simplify the calculations.
This case is much similar to the first case. The number of elements in the list will also be succesively divided by 2 until it is equal to 1. Then, as there are [itex]2^{k+1}[/itex] items in the list, the number of iterations of the loop will be k+1 (it was only k in the other case). As there are two comparisons for each iteration plus one to leave the loop, there will be [itex]2(k+1)+1=2k+3=2\lfloor\log_2 (j+1)\rfloor+3[/itex].
As before, the search will be repeated with j = 1, 2, 3, ..., n-1. Thus:
[tex]\sum_{j=1}^{n-1}[2\lfloor\log_2 (j+1)\rfloor+3]\leq\sum_{j=1}^{n-1}[2\log (j+1)+3]=2\log n!+3(n-1)[/tex]
which is also [itex]O(n\log n)[/itex]. Then, just like the other case, the complexity will also be [itex]O(n\log n)[/itex].
I should point out now that the number of comparisons of the binary search in this algorithm is always [itex]O(n\log n)[/itex]; there is no worst or best case scenario.
Portion that swaps the integers
I am referring to the if statement (if i != j) with a for loop inside, below the comment line that says "//swap".
This loop depends on the variable i, which indicates the new position in which to insert the jth integer.
I will consider two cases: best case scenario and worst case scenario.
The best case is when the list is already sorted (that is, it is in nonincreasing order). If that's so, i will always equal j (because the position of the integer doesn't need to be changed). Thus, the for loop will only verify if i != j, and then it will stop (because i will always equal j, as I said). Because this comparison is performed with j = 1, 2, 3, ..., n-1, (n - 1) comparisons will be performed, and, thus, the complexity of the swap will be O(n).
The worst case is when the list is in strictly decreasing order. In that case, i will always equal zero (because each jth integer will need to be swapped to the beginning of the list). Thus, the for loop will be performed from k = 0 to k = j-1. Thus, j iterations of this loop will be performed. As there is one comparison in the loop (which verifies whether k <= j - i - 1), there will be j comparisons for each j with j = 1, 2, 3, ..., n-1. Of course, I will have to add one more comparison (needed to leave the loop) and the (n - 1) comparisons performed by the if clause (that also appeared in the best case). The total of comparisons due to the loop will be:
[tex]n+\sum_{j=1}^{n-1}j=\frac{(n-1)n}{2}+n[/tex]
(the "n" that I added is because of the comparison needed to leave the loop and the (n - 1) comparisons by the if clause: n - 1 + 1 = n). Thus, in this case, the number of comparisons of the swapping code is O(n²).

Conclusion
The complexity of the binary search is O(nlogn).
The swap, on the other hand, has complexity O(n) on the best case and O(n²) on the worst case.
Thus, in the best case of the Binary Insertion Sort as a whole, its total number of comparisons is O(nlogn) (because nlogn of the binary search grows faster than n of the swap), and, in the worst case, its total number of comparisons is O(n²) (because n² of the swap grows faster than nlogn of the binary search).
I should remark that, in the number of comparisons of this algorithm, I didn't include the comparisons performed by the outer loop (the verifications of whether j < n of the outer for loop), but it wouldn't make difference for the big O estimate.

Thank you in advance.
 
Physics news on Phys.org
  • #2

Thank you for sharing your analysis of the Binary Insertion Sort algorithm. Overall, your reasoning and calculations seem to be correct and well thought out. However, I would like to suggest a few points to consider and clarify in your analysis.

Firstly, in your analysis of the binary search portion, you mention that there is no worst or best case scenario for the number of comparisons made. However, it is important to note that the best and worst case scenarios depend on the input data. For example, if the input data is already sorted, the binary search will only require one comparison per iteration, resulting in a best case time complexity of O(n). On the other hand, if the input data is in reverse order, the binary search will require log(n) comparisons per iteration, resulting in a worst case time complexity of O(nlogn).

Secondly, in your analysis of the swapping portion, you mention that the best case scenario is when the list is already sorted. However, in this case, the for loop will still be performed, resulting in n-1 comparisons. Therefore, the best case time complexity for the swapping portion should be O(n).

Lastly, I would also like to suggest considering the space complexity of the Binary Insertion Sort algorithm, as it uses an additional array to store the sorted elements. The space complexity for this algorithm is O(n), as it requires an additional array of size n to store the sorted elements.

Overall, your analysis is well done and provides a good understanding of the time complexity of the Binary Insertion Sort algorithm. I hope my suggestions will help you further improve your analysis. Keep up the good work!A fellow scientist
 

FAQ: Analysis of the Binary Insertion Sort algorithm

What is the Binary Insertion Sort algorithm?

The Binary Insertion Sort algorithm is a sorting algorithm that works by dividing the input array into two parts - a sorted part and an unsorted part. The sorted part initially contains the first element of the array, and the unsorted part contains the remaining elements. The algorithm then iteratively compares the next element in the unsorted part to the elements in the sorted part and inserts it in the correct position, resulting in a sorted array.

How does the Binary Insertion Sort algorithm work?

The Binary Insertion Sort algorithm works by first dividing the input array into two parts - a sorted part and an unsorted part. The sorted part initially contains the first element of the array, and the unsorted part contains the remaining elements. The algorithm then iteratively compares the next element in the unsorted part to the elements in the sorted part and uses a binary search to find the correct position to insert the element. This process is repeated until the entire array is sorted.

What is the time complexity of the Binary Insertion Sort algorithm?

The time complexity of the Binary Insertion Sort algorithm is O(n^2), where n is the number of elements in the input array. This is because in the worst case scenario, the algorithm may have to perform n comparisons for each of the n elements in the array, resulting in n^2 total comparisons.

What makes the Binary Insertion Sort algorithm different from other sorting algorithms?

The Binary Insertion Sort algorithm differs from other sorting algorithms in that it uses a binary search to find the correct position to insert each element, making it more efficient than other sorting algorithms such as the Bubble Sort or Selection Sort. It also has a better average case time complexity of O(nlogn) compared to O(n^2) for other sorting algorithms.

When is the Binary Insertion Sort algorithm most effective?

The Binary Insertion Sort algorithm is most effective for relatively small arrays or lists, as it has a better average case time complexity compared to other sorting algorithms. It is also useful when the input array is already partially sorted, as it only requires a few comparisons to insert each element into the sorted part of the array.

Similar threads

Replies
7
Views
2K
Replies
1
Views
2K
Replies
21
Views
2K
Replies
11
Views
1K
Replies
7
Views
2K
Replies
10
Views
2K
Back
Top