Divide Binary Search Tree at Key k: Algorithm & Analysis

In summary: Insert an item into a binary search tree.*/int insertIntoSearchTree(node* root,int x){if (root == NULL)return -1;if (x<0 || x>MAX)return -2;if (!root->left || !root->right){root->left=NULL;root->right=NULL;}else{node* left=root->left;node* right=root->right;while (left>=0 && right>=0){left=left->right;right=right->left;}if (
  • #1
evinda
Gold Member
MHB
3,836
0
Hello! (Wave)

Given a binary search tree $B$, I want to write an algorithm, that divides $B$ into two new trees $B_1, B_2$, so that the first one contains all the keys of $B$ that are smaller than $k$ and the second one contains all the keys of $B$ that are greater than $k$.
Hint : Execute a search of $k$, "cutting out" the pointers that "hang" at the left side and the right side of the path. After that, merge the trees of the forest, you get.

I tried to find like that, the position, at which $k$ is:

Code:
if (B==k) p=B;
else if (B>k){
      while (B->left>x){
               B=B->left;
      }
      p=B;
else if (B<x){
      while (B->right<x){
               B=B->right;
      }
      p=B;
}

Is it right? Or have I done something wrong? (Thinking)
 
Technology news on Phys.org
  • #2
Or do we have to do it in an other way, according to the hint? (Thinking)
 
  • #3
evinda said:
Hello! (Wave)

Given a binary search tree $B$, I want to write an algorithm, that divides $B$ into two new trees $B_1, B_2$, so that the first one contains all the keys of $B$ that are smaller than $k$ and the second one contains all the keys of $B$ that are greater than $k$.
Hint : Execute a search of $k$, "cutting out" the pointers that "hang" at the left side and the right side of the path. After that, merge the trees of the forest, you get.

I tried to find like that, the position, at which $k$ is:

Code:
if (B==k) p=B;
else if (B>k){
      while (B->left>x){
               B=B->left;
      }
      p=B;
else if (B<x){
      while (B->right<x){
               B=B->right;
      }
      p=B;
}

Is it right? Or have I done something wrong? (Thinking)

Hey! (Smile)

evinda said:
Or do we have to do it in an other way, according to the hint? (Thinking)
I think you are supposed to set up a recursive algorithm.
Something like:
Code:
function algorithm(B, *B1, *B2)
  if B == NULL
    return
  extractLess(B, B->data, B1)
  extractGreater(B, B->data, B2)function extractLess(node, k, *destination)
  if node == NULL
    return
  if node->data < k
    addToTree(node->data, destination)

  extractLess(node->left, k, destination)
  extractLess(node->right, k, destination)

...
(Wasntme)
 
  • #4
I like Serena said:
Hey! (Smile)

I think you are supposed to set up a recursive algorithm.
Something like:
Code:
function algorithm(B, *B1, *B2)
  if B == NULL
    return
  extractLess(B, B->data, B1)
  extractGreater(B, B->data, B2)function extractLess(node, k, *destination)
  if node == NULL
    return
  if node->data < k
    addToTree(node->data, destination)

  extractLess(node->left, k, destination)
  extractLess(node->right, k, destination)

...
(Wasntme)

If we have, for example, this tree $B$:

View attachment 3614

and we want to divide it into two new trees $B1,B2$, so that the first one contains all the keys of $B$ that are smaller than $5$ and the second one contains all the keys of $B$ that are greater than $5$, won't we compare all the values of the nodes with $12$, instead of $5$? (Worried)
Or have I understood it wrong? (Thinking)
 

Attachments

  • treeee.png
    treeee.png
    3 KB · Views: 72
  • #5
evinda said:
If we have, for example, this tree $B$:

and we want to divide it into two new trees $B1,B2$, so that the first one contains all the keys of $B$ that are smaller than $5$ and the second one contains all the keys of $B$ that are greater than $5$, won't we compare all the values of the nodes with $12$, instead of $5$? (Worried)
Or have I understood it wrong? (Thinking)

I misread the problem statement, treating $k$ as the value of the root node, which it doesn't have to be. (Blush)
 
  • #6
I found this to be an interesting exercise. The following somewhat lengthy C program solves the problem. I recommend that you read this, even if you're not familiar with C. There are several binary search tree functions that are applicable in a lot of places.
If anyone can help me, I have noticed empirically, that the heights of the split trees are never more than the height of the original tree. I can't prove this, though.

Code:
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#define  MAX (1000)

typedef struct node_tag {
    int key;
    struct node_tag *left_child,*right_child;
}
node;

int insertIntoSearchTree(node* root,int x);
node* buildRandomTree(int n);
node* getNode(void);
int isSearchTree(node* root);
void splitTree(node* root,int x,node** lessRoot,node** greaterRoot);
node* appendToTree(node* root,node* p);
int countNodes(node* root);
int findMinKey(node* root);
int findMaxKey(node* root);
node* copyTree(node* root);

node* getNode(){
    node *p;
    if ((p=(node *) malloc(sizeof(node)))==NULL) {
        fprintf(stderr,"Memory allocation failure\n");
        exit(1);
    }
    p->key=0;
    p->left_child=p->right_child=NULL;
    return(p);
}

// Just frees all the nodes in the tree pointed to by root.

void destroy(node *root) {
    if (root==NULL) {
        return;
    }
    destroy(root->left_child);
    destroy(root->right_child);
    free(root);
}/* Inserts a new node (as a leaf) with key x into the NON-EMPTY binary search tree with root root.
   Return 1 if successful, but 0 if node with key x already in tree.
*/
int insertIntoSearchTree(node* root,int x) {
    node *p,*parent;
    p=root;
    while (p!=NULL && p->key!=x) {
        parent=p;
        if (x<p->key) {
            p=p->left_child;
        }
        else {
            p=p->right_child;
        }
    }
    if (p!=NULL && p->key==x) {
        return(0);
    }
    p=getNode();
    p->key=x;
    if (x<parent->key) {
        parent->left_child=p;
    }
    else {
        parent->right_child=p;
    }
    return(1);
}

/* Following builds a "random" tree with n nodes.  Found by first generating a random permutation and then
inserting the nodes in turn into an initially empty binary tree.
*/
node* buildRandomTree(int n) {
    node* t;
    int i,j,temp;
    int randomPerm[MAX];
    for (i=0;i<n;randomPerm[i]=2*i,i++);
    for (i=n-1;i>0;i--) {
        j=rand()%i;
        temp=randomPerm[i];
        randomPerm[i]=randomPerm[j];
        randomPerm[j]=temp;
    }
    t=getNode();
    t->key=randomPerm[0];
    for (i=1;i<n;i++) {
        insertIntoSearchTree(t,randomPerm[i]); // all components of randomPerm different
    }
    return(t);
}

/*  Finds the minimum key in tree root; artificially returns INT_MAX for
    an empty tree.  Convenient for comparisons made else where.
*/

int findMinKey(node* root) {
    node* p,*parent;
    if (root==NULL) {
        return(INT_MAX);
    }
    for (parent=root,p=root->left_child;p!=NULL;parent=p,p=p->left_child);
    return(parent->key);
}

// Dual of findMinKey

int findMaxKey(node* root) {
    node* p,*parent;
    if (root==NULL) {
        return(INT_MIN);
    }
    for (parent=root,p=root->right_child;p!=NULL;parent=p,p=p->right_child);
    return(parent->key);
}/* isSearchTree returns true iff the binary tree with root root is a binary search tree.
*/
int isSearchTree(node* root) {
    node* p,*parent;
    int min,max;
    if (root==NULL) {
        return(1);
    }
    int valid=isSearchTree(root->left_child) && isSearchTree(root->right_child);
    if (!valid) {
        return(0);
    }
    max=findMaxKey(root->left_child);
    min=findMinKey(root->right_child);
    return(max<root->key && root->key<min);
}

/* SPECIAL append function does NOT work for arbitrary trees.  Here the search
   tree rooted at p is appended to the search tree rooted at root with the result
   a search tree. The condition on p is that ANY node in p with key k would be inserted
   into root at the same place.  This is called only by splitTree.
*/

node* appendToTree(node* root,node* p) {
    int x;
    node *q,*parent;
    if (root==NULL) {
        return(p);
    }
    if (p==NULL) {
        return(root);
    }
    x=p->key;
    parent=NULL;
    q=root;
    while (q!=NULL) {
        parent=q;
        if (x<q->key) {
            q=q->left_child;
        }
        else {
            q=q->right_child;
        }
    }
    if (x<parent->key) {
        parent->left_child=p;
    }
    else {
        parent->right_child=p;
    }
    return(root);
}

/*  The tree at root is split into two trees lessRoot and greaterRoot with the key
    of any node in lessRoot < x and the key of any node in greaterRoot is >= x.  So if
    x is a key in the original tree root, x is the least key in greaterRoot; however,
    it is still valid if x is not a key in root.  No new nodes are allocated, so the original
    tree is invalid after the split.
*/

void splitTree(node* root,int x,node** lessRoot,node** greaterRoot) {
    node* p,*parent;
    p=root;
    while (p!=NULL && p->key!=x) {
        parent=p;
        if (x<p->key) {
            p=p->left_child;
            parent->left_child=NULL;
            *greaterRoot=appendToTree(*greaterRoot,parent);
        }
        else {
            p=p->right_child;
            parent->right_child=NULL;
            *lessRoot=appendToTree(*lessRoot,parent);
        }
    }
    if (p!=NULL) {
        *lessRoot=appendToTree(*lessRoot,p->left_child);
        p->left_child=NULL;
        *greaterRoot=appendToTree(*greaterRoot,p);
    }
}

// Just counts the nodes

int countNodes(node* root) {
    int n;
    if (root==NULL) {
        return(0);
    }
    n=countNodes(root->left_child);
    n+=countNodes(root->right_child);
    return(n+1);
}

/* Debugging aid to make certain above splitTree works. splitTree was called with pivot
   x and the origingal tree had total nodes.
*/

int checkSplitTree(int x,int total,node* lessRoot,node* greaterRoot) {
    int n,max,min;
    n=countNodes(lessRoot)+countNodes(greaterRoot);
    // first make sure no nodes got lost
    if (n!=total) {
        return(0);
    }
    // now the two split trees better be binary search trees
    if (!isSearchTree(lessRoot) || !isSearchTree(greaterRoot)) {
        return(0);
    }
    // finally, the split worked
    max=findMaxKey(lessRoot);
    min=findMinKey(greaterRoot);
    return(max<x && x<=min);
}

// Just copies the tree into a whole new tree.
node* copyTree(node* root) {
    node* p;
    if (root==NULL) {
        return(NULL);
    }
    p=getNode();
    p->key=root->key;
    p->left_child=copyTree(root->left_child);
    p->right_child=copyTree(root->right_child);
    return(p);
}

// Compute the height of the tree:
int height(node* root) {
    int h1,h2;
    if (root==NULL) {
        return(-1);
    }
    h1=height(root->left_child);
    h2=height(root->right_child);
    return((h1<=h2) ? h2+1 : h1+1);
}int main(){
    node *root,*root1;
    node *rootLess,*rootGreater;
    int i,n=800,valid;
    int ht,htLess,htGreater,h;
    //    root=buildComplete(1,n);
    root1=buildRandomTree(n);
    valid=1;
    ht=height(root1);
    htLess=htGreater=INT_MIN;
    for (i=0;i<2*n;i++) {
        root=copyTree(root1);
        rootLess=NULL;
        rootGreater=NULL;
        splitTree(root,i,&rootLess,&rootGreater);
        if (!checkSplitTree(i,n,rootLess,rootGreater)) {
            printf("Oops, something wrong! at %d\n",i);
            valid=0;
            break;
        }
        h=height(rootLess);
        if (h>htLess) {
            htLess=h;
        }
        h=height(rootGreater);
        if (h>htGreater) {
            htGreater=h;
        }
        destroy(rootLess);
        destroy(rootGreater);
    }
    if (valid) {
        printf("everything ok!\n");
        printf("original ht:%d , max of rootLess: %d, max of rootGreater: %d\n",ht,htLess,htGreater);
    }
    return(0);
}
 
  • #7
Could you describe me what we have to do? (Thinking)
 
  • #8
evinda said:
Could you describe me what we have to do? (Thinking)

The algorithm I suggested is still a good way to go.
We should just pass $k$ to it, and also pass $k$ on to the recursive calls. (Wasntme)
 
  • #9
I like Serena said:
The algorithm I suggested is still a good way to go.
We should just pass $k$ to it, and also pass $k$ on to the recursive calls. (Wasntme)

So, does the function have to be of this form? :confused:

Code:
function algorithm(B, k,B1, B2)
  if B == NULL
    return
  extractLess(B, k,B->data, B1)
  extractGreater(B, k,B->data, B2)

If so, what type of parameters should the functions [m] extractLess [/m] and [m] extractGreater [/m] have? (Thinking)
 
  • #10
As I read the original question and the hint, I assumed that the problem was to split the given tree into two new trees with no new nodes allocated. As I understand ILikeSerena's suggestion, his algorithm creates new trees with new nodes. It also appears that his suggestion is of order n (number of nodes in the original tree), whereas the solution I gave is of order the height of the original tree -- for a reasonably balanced tree, this will result in order ln(n).

Btw, here's some unsolicited advice. I think the study of algorithms is just like mathematics; it's not a spectator sport. Surely you must know some programming language (Java, C++, Python etc.). Implement your proposed solution in your favorite language. When I used to teach algorithms, I almost always expected a working program that implemented a given exercise.
 
  • #11
johng said:
As I read the original question and the hint, I assumed that the problem was to split the given tree into two new trees with no new nodes allocated. As I understand ILikeSerena's suggestion, his algorithm creates new trees with new nodes. It also appears that his suggestion is of order n (number of nodes in the original tree), whereas the solution I gave is of order the height of the original tree -- for a reasonably balanced tree, this will result in order ln(n).

Btw, here's some unsolicited advice. I think the study of algorithms is just like mathematics; it's not a spectator sport. Surely you must know some programming language (Java, C++, Python etc.). Implement your proposed solution in your favorite language. When I used to teach algorithms, I almost always expected a working program that implemented a given exercise.

I want the algorithm to be of order the height of the original tree.. (Nod)

So, in pseudocode should it be like that?

Code:
pointer appendToTree(pointer root,poiter p) {
    int x;
    pointer q,parent;
    if (root==NULL) {
        return(p);
    }
    if (p==NULL) {
        return(root);
    }
    x=p->key;
    parent=NULL;
    q=root;
    while (q!=NULL) {
        parent=q;
        if (x<q->key) {
            q=q->left_child;
        }
        else {
            q=q->right_child;
        }
    }
    if (x<parent->key) {
        parent->left_child=p;
    }
    else {
        parent->right_child=p;
    }
    return(root);
}
void splitTree(pointer root,int x,pointer lessRoot,pointer greaterRoot) {
    pointer p,*parent;
    p=root;
    while (p!=NULL && p->key!=x) {
        parent=p;
        if (x<p->key) {
            p=p->left_child;
            parent->left_child=NULL;
            greaterRoot=appendToTree(greaterRoot,parent);
        }
        else {
            p=p->right_child;
            parent->right_child=NULL;
            lessRoot=appendToTree(lessRoot,parent);
        }
    }
    if (p!=NULL) {
        lessRoot=appendToTree(lessRoot,p->left_child);
        p->left_child=NULL;
        greaterRoot=appendToTree(greaterRoot,p);
    }
}

Or is it wrong? (Thinking)

At the algorithm [m] splitTree [/m], why is [m] p [/m] a pointer, but [m]parent[/m] a pointer to a pointer? (Worried)

Can the type of a function in pseudocode be [m] pointer [/m] ? :confused:
 

FAQ: Divide Binary Search Tree at Key k: Algorithm & Analysis

What is a Binary Search Tree (BST)?

A Binary Search Tree is a data structure used in computer science for efficient storage and retrieval of data. It is a type of binary tree where each node has at most two children, and the values in the left subtree of a node are smaller than the value of the node, while the values in the right subtree are greater. This property allows for efficient searching, insertion, and deletion of data.

What does it mean to divide a Binary Search Tree at a key k?

Dividing a Binary Search Tree at a key k means splitting the tree into two separate trees based on the value of the key k. All nodes with values less than k will be in one tree, while all nodes with values greater than k will be in the other tree. This allows for quicker searching and manipulation of data within a specific range.

How does the algorithm for dividing a Binary Search Tree at key k work?

The algorithm for dividing a Binary Search Tree at key k works by recursively traversing the tree and comparing the values of each node to the key k. If the value is less than k, the node and its left subtree are added to the left tree. If the value is greater than k, the node and its right subtree are added to the right tree. This process continues until all nodes have been checked and added to their respective trees.

What is the time complexity of the algorithm for dividing a Binary Search Tree at key k?

The time complexity of the algorithm for dividing a Binary Search Tree at key k is O(n), where n is the total number of nodes in the tree. This is because the algorithm requires traversing through every node in the tree to determine which tree it belongs to. However, if the tree is balanced, the time complexity can be reduced to O(log n) as the height of a balanced BST is log n.

What are some potential applications of dividing a Binary Search Tree at key k?

Dividing a Binary Search Tree at key k can be useful in various applications such as in databases to efficiently retrieve data within a specific range, in search algorithms to narrow down the search space, and in data analysis to group data based on a particular attribute. It can also be used in image processing to extract specific features from an image, and in finance to analyze stock prices within a certain range.

Similar threads

Replies
1
Views
2K
Replies
3
Views
3K
Replies
4
Views
1K
Replies
1
Views
2K
Replies
2
Views
2K
Back
Top