As we have seen, a binary search tree can wind up looking more like a linked list than a tree, in which case the cost of our common operations deteriorates to O(N) for a tree of N nodes.
The solution is to keep our trees well structured as we build them. The typical approach is to check (and fix) the structure of the tree after each insert operation and each remove operation.
As we will see, it is not necessary to keep the tree perfectly structured - for our purposes it will suffice to keep a tree "nearly" balanced.
We will define a tree to be sufficiently balanced if, for every node in the tree, its left subtree and right subtree differ in height by at most 1.
We will define an AVL tree to be a binary search tree which conforms to that balance rule, i.e. for every node in an AVL tree:

    $|\,\mathrm{height}(\text{left subtree}) - \mathrm{height}(\text{right subtree})\,| \le 1$
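The balance rule can be checked directly with a short recursive routine. The sketch below is illustrative only: the `Node` struct and the functions `height`, `isBalanced` and `balanceExample` are hypothetical names, not part of the avltree class, and an empty subtree is assumed to have height -1 so that a leaf has height 0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// Hypothetical minimal node type, used only for this illustration.
struct Node {
    Node *left = nullptr;
    Node *right = nullptr;
};

// Height of the subtree rooted at n: -1 for an empty subtree, 0 for a leaf.
int height(const Node *n) {
    if (!n) return -1;
    return 1 + std::max(height(n->left), height(n->right));
}

// True if every node in n's subtree has left and right subtrees
// whose heights differ by at most 1.
bool isBalanced(const Node *n) {
    if (!n) return true;
    if (std::abs(height(n->left) - height(n->right)) > 1) return false;
    return isBalanced(n->left) && isBalanced(n->right);
}

// Small self-check: a two-node chain is balanced, a three-node chain is not.
void balanceExample() {
    Node a, b, c;
    a.left = &b;
    assert(isBalanced(&a));
    b.left = &c;
    assert(!isBalanced(&a));
}
```

This version recomputes heights on every call, which is fine for checking a small tree by hand but too slow to run after every insertion. Storing a height in each node avoids the recomputation.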
Whenever we change the structure of the tree through an insert or remove, we will update the heights of any affected nodes, and call routines to check the tree's balance and rearrange nodes as necessary to restore its balance.
For example, in the tree below the newly inserted node is marked with a * and the height of each of the nodes marked with a + would have to be increased by 1 during the insertion process, while the height of the . nodes would remain unchanged:
        +
       / \
      .   +
     /   / \
    .   .   +
           /
          *
If we know that the heights of a node's subtrees are correct, we can very easily update the height of that node as follows:
    if n's left subtree is taller than n's right subtree
        then n's height is one plus the height of its left subtree
    otherwise
        n's height is one plus the height of its right subtree

If the tree is unbalanced, then we know that one of the subtrees is at least two levels taller than the other.
If that's the case, then we could reduce the problem by cutting down the height of the taller side by one and increasing the height of the shorter side by one.
This is achieved by rotating elements of the taller subtree into the shorter subtree. Thus we may need to rotate to the left, or rotate to the right - depending on which of the two subtrees is the taller one.
Consider the two examples below, illustrating simple rotations to the left and right:
Rotation to the left: move the right child (Y) "up" and to the left, while moving the root of the subtree (N) "down" and to the left. Note that if Y had a left child (C), it must be transferred over to become N's right child; this still maintains the binary search tree properties.

       BEFORE                  AFTER
          N                      Y
         / \                    / \
        /   \                  /   \
       X     Y                N     D
      / \   / \              / \
     A   B C   D            X   C
                           / \
                          A   B

Rotation to the right: move the left child (X) "up" and to the right, while moving the root of the subtree (N) "down" and to the right. Note that if X had a right child (B), it must be transferred over to become N's left child; this still maintains the binary search tree properties.

       BEFORE                  AFTER
          N                      X
         / \                    / \
        /   \                  /   \
       X     Y                A     N
      / \   / \                    / \
     A   B C   D                  B   Y
                                     / \
                                    C   D

Our policy after an insertion or removal will be to work from the bottom of the tree (the point of change) upwards, solving our balance problems as close to the source as possible.
Prior to an insert/remove, we know the heights of subtree pairs differ by at most one throughout the tree, so after a single insertion or removal we know the heights differ by at most two.
If the tree has become unbalanced, it is possible that a single rotation will completely solve our balance problem, but there is a case when a second rotation would be necessary:
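    Before rotating          After rotating
     to the right             to the right
          46                       32
         /  \                     /  \
       32    64                  6    46
      /  \                           /  \
     6    40                       40    64
         /                        /
       38                       38

The single rotation leaves the tree unbalanced: the left subtree of 32 now has height 0 while its right subtree has height 2. In such a situation we can solve the problem by performing two rotations instead of one: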
    Rotate to the left in the left subtree
          46                       46
         /  \                     /  \
       32    64                 40    64
      /  \                     /
     6    40                 32
         /                  /  \
       38                  6    38
    then rotate to the right through the root
            46                     40
           /  \                   /  \
         40    64               32    46
        /                      /  \     \
      32                      6    38    64
     /  \
    6    38
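The two-step rotation can be replayed in code. The following is a minimal sketch rather than the avltree class itself: `Node`, `rotateLeft`, `rotateRight` and `doubleRotationExample` are hypothetical names chosen for this illustration, keys are plain integers, and height bookkeeping is left out to keep the pointer manipulation visible.

```cpp
#include <cassert>

// Hypothetical minimal node for this illustration (no height field).
struct Node {
    int key;
    Node *left;
    Node *right;
};

// Rotate to the left through n: n's right child moves up,
// n moves down to the left, and the child's old left subtree
// becomes n's new right subtree.
void rotateLeft(Node *&n) {
    assert(n && n->right);
    Node *child = n->right;
    n->right = child->left;
    child->left = n;
    n = child;
}

// Rotate to the right through n: the mirror image of rotateLeft.
void rotateRight(Node *&n) {
    assert(n && n->left);
    Node *child = n->left;
    n->left = child->right;
    child->right = n;
    n = child;
}

// Rebuilds the example: 46 at the root, 32 and 64 below it,
// 6 and 40 below 32, and 38 as the left child of 40.
// The six nodes are supplied by the caller, so nothing is heap allocated.
// Returns the root after the double rotation.
Node *doubleRotationExample(Node (&s)[6]) {
    s[0] = {46, &s[1], &s[2]};
    s[1] = {32, &s[3], &s[4]};
    s[2] = {64, nullptr, nullptr};
    s[3] = {6, nullptr, nullptr};
    s[4] = {40, &s[5], nullptr};
    s[5] = {38, nullptr, nullptr};
    Node *root = &s[0];
    rotateLeft(root->left);   // step 1: rotate left in the left subtree
    rotateRight(root);        // step 2: rotate right through the root
    return root;
}
```

Taking the pointer by reference (`Node *&n`) lets each rotation update whichever parent pointer led to the subtree, so the caller does not need to patch anything afterwards.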
Here we'll try to examine just how good (or bad) we can expect our AVL tree structures to be.
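Let $T_h$ be the smallest number of nodes we could have in a valid AVL tree of height $h$ (a single leaf has height 0, so $T_0 = 1$ and $T_1 = 2$).
Note that this minimum occurs when one of the root's two subtrees is one level shorter than the other, giving $T_h = 1 + T_{h-1} + T_{h-2}$.
Let's define an extra variable, $F_h = T_h + 1$. Then:

    $F_h = T_h + 1 = (1 + T_{h-1} + T_{h-2}) + 1 = (T_{h-1} + 1) + (T_{h-2} + 1) = F_{h-1} + F_{h-2}$

Thus we can note that $F_{h+2} = F_{h+1} + F_h$, and if we observe that $F_2 = 5$ and $F_3 = 8$ then this precisely defines the Fibonacci sequence!
Thus the smallest number of nodes in a valid AVL tree of height $h$ is given by a Fibonacci number minus 1. With the usual indexing $\mathrm{Fib}(1) = \mathrm{Fib}(2) = 1$ we have $F_h = \mathrm{Fib}(h+3)$, so $T_h = \mathrm{Fib}(h+3) - 1$; the shift in the index does not affect the growth rate.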
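The recurrence is easy to tabulate and compare against the Fibonacci numbers. The sketch below assumes a leaf has height 0 (so the minimum counts start at 1 and 2) and the usual indexing Fib(1) = Fib(2) = 1, under which the match occurs at index h+3. The function names `minNodes`, `fib` and `printMinNodesTable` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>

// Smallest number of nodes in an AVL tree of height h,
// using T(0) = 1, T(1) = 2, T(h) = 1 + T(h-1) + T(h-2).
std::uint64_t minNodes(int h) {
    assert(h >= 0);
    std::uint64_t prev = 1, cur = 2;   // T(0), T(1)
    if (h == 0) return prev;
    for (int i = 2; i <= h; ++i) {
        std::uint64_t next = 1 + cur + prev;
        prev = cur;
        cur = next;
    }
    return cur;
}

// k-th Fibonacci number with fib(1) = fib(2) = 1.
std::uint64_t fib(int k) {
    assert(k >= 1);
    std::uint64_t a = 1, b = 1;
    for (int i = 3; i <= k; ++i) {
        std::uint64_t next = a + b;
        a = b;
        b = next;
    }
    return b;
}

// Print h, T(h), and fib(h+3) - 1 side by side for heights 0..maxHeight.
void printMinNodesTable(int maxHeight) {
    for (int h = 0; h <= maxHeight; ++h)
        std::cout << h << " " << minNodes(h) << " " << fib(h + 3) - 1 << "\n";
}
```

Calling `printMinNodesTable(5)` should show the two right-hand columns agreeing on every row, starting 1, 2, 4, 7.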
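Fortunately, Binet, Euler et al. have worked out a formula for the $h$th Fibonacci number:

    $$\mathrm{Fib}(h) = \frac{(1+\sqrt{5})^h - (1-\sqrt{5})^h}{2^h \sqrt{5}}$$

The optimal height for a tree of $T_h$ nodes is approximately $\log_2(T_h)$, and our actual height is $h$, so if we take $h$ and divide it by $\log_2(T_h)$, where $T_h$ is a Fibonacci number minus 1, then we have the ratio of our worst case AVL tree height to the optimal binary search tree height!
For large $h$ the $(1-\sqrt{5})^h / 2^h$ term vanishes, so the Fibonacci numbers grow like $\varphi^h / \sqrt{5}$ with $\varphi = (1+\sqrt{5})/2$, and the ratio approaches $1/\log_2 \varphi$.
In fact, this boils down to a ratio of approximately 1.44, i.e. our AVL trees are (at worst) of height approximately $1.44 \lg N$.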
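The 1.44 figure can be checked numerically. This is a rough sketch under the same assumptions as the analysis (a leaf has height 0, so the minimum node counts start at 1 and 2); `minNodes`, `heightRatio` and `limitingRatio` are hypothetical names, and doubles are used so that large heights do not overflow.

```cpp
#include <cassert>
#include <cmath>

// Minimum node count T(h) as a double, from T(0) = 1, T(1) = 2,
// T(h) = 1 + T(h-1) + T(h-2).
double minNodes(int h) {
    assert(h >= 0);
    double prev = 1.0, cur = 2.0;   // T(0), T(1)
    if (h == 0) return prev;
    for (int i = 2; i <= h; ++i) {
        double next = 1.0 + cur + prev;
        prev = cur;
        cur = next;
    }
    return cur;
}

// Worst-case height h divided by log2(T(h)), the approximate height of
// an ideally balanced tree holding the same number of nodes.
double heightRatio(int h) {
    assert(h >= 1);   // log2(T(0)) is 0, so h = 0 is excluded
    return h / std::log2(minNodes(h));
}

// Limiting value of the ratio: 1 / log2((1 + sqrt(5)) / 2).
double limitingRatio() {
    return 1.0 / std::log2((1.0 + std::sqrt(5.0)) / 2.0);
}
```

The ratio creeps up towards the limit rather slowly, so small trees sit comfortably below the 1.44 bound.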
#include <cstddef>
#include <iostream>
#include <new>
#include <string>
using namespace std;

class avltree {
   private:
      // each node keeps track of its left and right child,
      //    its key value, and its height
      //    (the number of nodes beneath it on the longest
      //     path to a leaf)
      struct node {
         node *right, *left;
         string key;
         int height;
      };

      // maintain a pointer to the root of the tree
      node *root;

      // private, recursive routines (used by the public methods)

      // search n's subtree for the top node with key matching k
      node *search(string k, node *n);

      // delete all nodes in n's subtree
      void deallocate(node* &n);

      // create and insert a node with the specified key/data
      //    values within the subtree rooted at n
      bool insert(string k, string d, node* &n);

      // in n's subtree, remove the topmost node whose
      //    key matches k
      bool remove(string k, node* &n);

      // print, in sorted order, all the key values
      //    in n's subtree
      void print(node *n);

      // print the pointer structure for the subtree of n
      //    (each node's key & the keys of the nodes it points at)
      void debugprint(node *n);

      // rotate to the left through node n
      void rotate2left(node* &n);

      // rotate to the right through node n
      void rotate2right(node* &n);

      // check if n's subtree is unbalanced, and
      //    perform any necessary rotations to fix it
      void nodecheck(node* &n);

      // find the node with the smallest key in n's subtree
      node *findsmallest(node *n);

      // update the height field of node n, assuming
      //    its children's fields are up to date
      void updateheight(node *n);

   public:
      // create an empty tree
      avltree() { root = NULL; }

      // deallocate the tree
      ~avltree() { deallocate(root); }

      // display the keys in the tree (sorted)
      void display() { print(root); }

      // display the tree pointer structure
      void debug() { debugprint(root); }

      // create and insert a new node in the tree
      bool insert(string k, string d)
      {
         if (insert(k, d, root)) nodecheck(root);
         else return false;
         return true;
      }

      // remove the topmost node with the specified key
      bool remove(string k)
      {
         if (remove(k, root)) nodecheck(root);
         else return false;
         return true;
      }

      // determine if the tree contains any nodes with
      //    the specified key
      bool search(string k)
      {
         node *n = search(k, root);
         if (!n) return false;
         return true;
      }
};

void avltree::updateheight(node *n)
// compute the height of node n, assuming the heights
//    of n's left and right children are correct
// n's height is one greater than the height of the
//    taller of its two children
{
   // make sure n isn't null
   if (!n) return;

   // remember one or both of n's children might be null
   if ((!n->left) && (!n->right)) {
      n->height = 0;
   } else if (!n->left) {
      n->height = n->right->height + 1;
   } else if (!n->right) {
      n->height = n->left->height + 1;
   }

   // general case: both children exist
   else {
      n->height = n->left->height + 1;
      if (n->height <= n->right->height)
         n->height = n->right->height + 1;
   }
}

avltree::node *avltree::search(string k, node *n)
// search the subtree rooted at n,
//    looking for the topmost node whose key matches k
// if a match is found return a pointer to the node,
//    otherwise return null
{
   if (!n) return NULL;
   if (n->key == k) return n;
   else if (n->key > k) return search(k, n->left);
   else return search(k, n->right);
}

void avltree::deallocate(node* &n)
// delete all nodes in the subtree rooted at n,
//    and set n to null
{
   if (!n) return;
   deallocate(n->left);
   deallocate(n->right);
   delete n;
   n = NULL;
}

avltree::node *avltree::findsmallest(node *n)
// in the subtree rooted at n,
//    find the node with the smallest key value
//    (i.e. go as far left as possible)
{
   if (!n) return NULL;
   while (n->left) n = n->left;
   return n;
}

bool avltree::insert(string k, string d, node* &n)
// insert a new node in the binary search tree rooted at n,
//    returning true if successful, false otherwise
//
// after a successful insertion below n,
//    nodecheck is called to determine if the
//    subtree rooted at n is unbalanced,
//    and to perform any reconstruction necessary
//
// note: the node struct shown here holds only a key,
//    so d is passed along but not stored
{
   // if we've found the end of a chain,
   //    insert the node here
   if (!n) {
      // nothrow makes new return null on failure,
      //    so the check below is meaningful
      n = new (nothrow) node;
      if (!n) return false;
      n->key = k;
      n->left = NULL;
      n->right = NULL;
      n->height = 0;
      return true;
   }

   // call the insert routine recursively on either
   //    the left or right subtree,
   //    checking for and performing rotations if it
   //    was successful
   if (n->key > k) {
      if (insert(k, d, n->left)) {
         nodecheck(n->left);
         return true;
      }
   } else {
      if (insert(k, d, n->right)) {
         nodecheck(n->right);
         return true;
      }
   }

   // if we get here then the recursive insert
   //    was unsuccessful
   return false;
}

bool avltree::remove(string k, node* &n)
// if the subtree rooted at n contains a node whose key
//    matches k then remove it from the subtree,
//    then check for any necessary reconstruction of the tree
// return true if an element is successfully removed,
//    or false otherwise
{
   // if n is an empty tree then give up
   if (!n) return false;

   // if the matching node must be somewhere in the left subtree
   //    then make a recursive call and check for any needed rotation
   if (n->key > k) {
      if (remove(k, n->left)) {
         nodecheck(n->left);
         return true;
      }
   }

   // if the matching node must be somewhere in the right subtree
   //    then make a recursive call and check for any needed rotation
   else if (n->key < k) {
      if (remove(k, n->right)) {
         nodecheck(n->right);
         return true;
      }
   }

   // if the current node is the one that must be removed,
   //    base the handling on how many children the node has
   else {
      // remember which node we'll actually delete
      node *victim = n;

      // if the node has no children we can simply delete it
      if ((!n->left) && (!n->right)) {
         delete n;
         n = NULL;
         return true;
      }

      // if the node has just a right child then we can
      //    bypass n (i.e. make the pointer to n point
      //    to its right child instead)
      else if (!n->left) {
         n = n->right;
         delete victim;
         return true;
      }

      // if the node has just a left child then we can
      //    bypass n (i.e. make the pointer to n point
      //    to its left child instead)
      else if (!n->right) {
         n = n->left;
         delete victim;
         return true;
      }

      // if the node has two children then we'll replace the
      //    node with the smallest node from the right subtree
      //    (basically copying the other node's key value
      //     over top of n's)
      // then make a recursive call to remove the duplicate
      //    element from the right subtree,
      //    remembering to check for necessary rotation
      //    once we've altered n's subtrees
      else {
         victim = findsmallest(n->right);
         if (!victim) return false;
         string vkey = victim->key;
         if (!remove(vkey, n->right)) return false;
         n->key = vkey;
         // the recursive call checks the nodes below n->right,
         //    so check n->right itself before checking n
         nodecheck(n->right);
         nodecheck(n);
         return true;
      }
   }
   return false;
}

void avltree::print(node *n)
// display the key contents of the subtree rooted at n,
//    sorted (ascending) by key value
{
   if (!n) return;
   print(n->left);
   cout << n->key << endl;
   print(n->right);
}

void avltree::debugprint(node *n)
// display the contents and structure of the subtree rooted at n,
//    performed via preorder traversal
{
   if (!n) return;
   cout << n->key << " (";
   if (n->left) cout << n->left->key;
   else cout << "NULL";
   cout << "<-left,right->";
   if (n->right) cout << n->right->key;
   else cout << "NULL";
   cout << ")(height:" << n->height << ")" << endl;
   debugprint(n->left);
   debugprint(n->right);
}

void avltree::rotate2left(node* &n)
// rotates n's right child up, and n down to the left
//       BEFORE                  AFTER
//          N                      Y
//         / \                    / \
//        /   \                  /   \
//       X     Y                N     D
//      / \   / \              / \
//     A   B C   D            X   C
//                           / \
//                          A   B
{
   node *tmp = n;           // remember N
   n = n->right;            // make Y the root of the subtree
   tmp->right = n->left;    // make C (Y's old left child) into N's right child
   n->left = tmp;           // make N into Y's left child
   updateheight(tmp);       // N's height has probably changed
   updateheight(n);         // Y's height has probably changed
}

void avltree::rotate2right(node* &n)
// rotates n's left child up, and n down to the right
//       BEFORE                  AFTER
//          N                      X
//         / \                    / \
//        /   \                  /   \
//       X     Y                A     N
//      / \   / \                    / \
//     A   B C   D                  B   Y
//                                     / \
//                                    C   D
{
   node *tmp = n;           // remember N
   n = n->left;             // make X the root of the subtree
   tmp->left = n->right;    // make B (X's old right child) into N's left child
   n->right = tmp;          // make N into X's right child
   updateheight(tmp);       // N's height has probably changed
   updateheight(n);         // X's height has probably changed
}

void avltree::nodecheck(node* &n)
// determine if the subtree rooted at n has become unbalanced
//    (i.e. the height difference between the left and right
//     subtrees of n is more than 1)
// and perform any rotations necessary to reconstruct the tree
//    in a balanced form.
{
   // quit if n is an empty tree
   if (!n) return;

   // update the height field for n
   updateheight(n);

   // store the height of the left and right subtrees
   //    (an empty subtree counts as height -1, one less than
   //     a leaf, so that "no child" and "leaf child" differ)
   int leftheight = -1;
   int rightheight = -1;
   if (n->left) leftheight = n->left->height;
   if (n->right) rightheight = n->right->height;

   // quit if n is balanced (i.e. if the left and right
   //    subtree heights are within 1 of each other)
   if ((leftheight <= (rightheight+1)) && (leftheight >= (rightheight-1)))
      return;

   // handle the cases where the right subtree is taller
   if (rightheight > leftheight) {
      int Rright = -1;
      int Rleft = -1;
      if (n->right->left) Rleft = n->right->left->height;
      if (n->right->right) Rright = n->right->right->height;

      // if Rleft is taller than Rright then we need an
      //    extra rotation
      if (Rleft > Rright) rotate2right(n->right);

      // either way we need to rotate to the left through n
      rotate2left(n);
   }

   // handle the cases where the left subtree is taller
   else {
      int Lright = -1;
      int Lleft = -1;
      if (n->left->left) Lleft = n->left->left->height;
      if (n->left->right) Lright = n->left->right->height;

      // if Lright is taller than Lleft then we need an
      //    extra rotation
      if (Lleft < Lright) rotate2left(n->left);

      // either way we need to rotate to the right through n
      rotate2right(n);
   }
}