In fact, as we have seen, a binary search tree can wind up looking more like a linked list than a tree, in which case the efficiency of our common operations deteriorates to O(N) for a tree of N nodes.
The solution is to keep our trees well structured as we build them. The typical approach is to check (and fix) the structure of the tree after each insert operation and each remove operation.
As we will see, it is not necessary to keep the tree perfectly structured - for our purposes it will suffice to keep a tree "nearly" balanced.
We will define a tree to be sufficiently balanced if, for every node in the tree, its left subtree and right subtree differ in height by at most 1.
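We will define an AVL tree to be a binary search tree which conforms to that balance rule, i.e. for every node in an AVL tree:
   | height(left subtree) - height(right subtree) | <= 1
(counting an empty subtree as having height -1, so that a single leaf node has height 0).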
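As a rough illustration, the balance rule can be checked directly on a tree. The sketch below is not part of the avltree class presented later: the Node type and the function names height and isBalanced are hypothetical, and the heights are recomputed recursively rather than stored in each node, which is simple but slow.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// Hypothetical minimal node type, for illustration only.
struct Node {
    Node *left;
    Node *right;
};

// Height of a subtree: -1 for an empty tree, 0 for a single leaf.
int height(const Node *n) {
    if (!n) return -1;
    return 1 + std::max(height(n->left), height(n->right));
}

// True if, for every node in the subtree, the left and right
// subtree heights differ by at most 1.
bool isBalanced(const Node *n) {
    if (!n) return true;
    int diff = std::abs(height(n->left) - height(n->right));
    return diff <= 1 && isBalanced(n->left) && isBalanced(n->right);
}
```

A chain of three nodes (each with only a right child) fails this check at its top node, since that node's empty left subtree has height -1 while its right subtree has height 1.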
Whenever we change the structure of the tree through an insert or remove, we will update the heights of any affected nodes, and call routines to check the tree's balance and rearrange nodes as necessary to restore its balance.
For example, in the tree below the newly inserted node is marked with a * and the height of each of the nodes marked with a + would have to be increased by 1 during the insertion process, while the height of the . nodes would remain unchanged:
        +
       / \
      .   +
     /   / \
    .   .   +
           /
          *
We can very easily update the height of a given node in the tree if we know that the heights of its subtrees are correct as follows:
if n's left subtree is taller than n's right subtree
then n's height is one plus the height of its left subtree
otherwise
n's height is one plus the height of its right subtree
If the tree is unbalanced, then we know that one of the subtrees is
at least two levels taller than the other.
If that's the case, then we could reduce the problem by cutting down the height of the taller side by one and increasing the height of the shorter side by one.
This is achieved by rotating elements of the taller subtree into the shorter subtree. Thus we may need to rotate to the left, or rotate to the right - depending on which of the two subtrees is the taller one.
Consider the two examples below, illustrating simple rotations to the left and right:
Rotation to the left:
   move the right child (Y) "up" and to the left,
   while moving the root of the subtree (N) "down" and to the left
   note that if Y had a left child (C below) it
      must be transferred over to become N's right child, but
      this still maintains the valid binary search tree properties

        BEFORE                 AFTER
          N                      Y
         / \                    / \
        /   \                  /   \
       X     Y                N     D
      / \   / \              / \
     A   B C   D            X   C
                           / \
                          A   B
Rotation to the right:
   move the left child (X) "up" and to the right,
   while moving the root of the subtree (N) "down" and to the right
   note that if X had a right child (B below) it
      must be transferred over to become N's left child, but
      this still maintains the valid binary search tree properties

        BEFORE                 AFTER
          N                      X
         / \                    / \
        /   \                  /   \
       X     Y                A     N
      / \   / \                    / \
     A   B C   D                  B   Y
                                     / \
                                    C   D
Our policy after an insertion or removal will be to work from the
bottom of the tree (the point of change) upwards - solving our balance
problems as close to the source as possible.
Prior to an insert/remove, we know the heights of subtree pairs differ by at most one throughout the tree, so after a single insertion or removal we know the heights differ by at most two.
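If the tree has become unbalanced, it is possible that a single rotation will completely solve our balance problem, but there is a case where a single rotation merely moves the imbalance to the other side, so a second rotation is necessary. This happens when the extra height of the taller subtree lies on its inner side, as with node 38 in the example below: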
   Before rotating          After rotating
    to the right             to the right
        46                       32
       /  \                     /  \
     32    64                  6    46
    /  \                           /  \
   6    40                       40    64
       /                        /
     38                       38
In such a situation we can solve the problem by performing two
rotations instead of one:
Rotate to the left in the left subtree
        46                       46
       /  \                     /  \
     32    64                 40    64
    /  \                     /
   6    40                 32
       /                  /  \
     38                  6    38
then rotate to the right through the root
        46                       40
       /  \                     /  \
     40    64                 32    46
    /                        /  \     \
  32                        6    38    64
 /  \
6    38
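The two-step fix above can be traced in code. This is a minimal sketch using a hypothetical Node type with integer keys and no height field; the names rotateLeft, rotateRight and rotateLeftThenRight are illustrative and are not the routines of the avltree class given later.

```cpp
#include <cassert>

// Hypothetical minimal node type, for illustration only.
struct Node {
    int key;
    Node *left;
    Node *right;
};

// Rotate left through n: n's right child moves up, n moves down-left.
void rotateLeft(Node *&n) {
    assert(n && n->right);
    Node *up = n->right;
    n->right = up->left;   // up's old left subtree becomes n's right subtree
    up->left = n;
    n = up;                // the parent's pointer now refers to the new subtree root
}

// Rotate right through n: n's left child moves up, n moves down-right.
void rotateRight(Node *&n) {
    assert(n && n->left);
    Node *up = n->left;
    n->left = up->right;   // up's old right subtree becomes n's left subtree
    up->right = n;
    n = up;
}

// Double rotation for the case where n's left subtree is too tall
// and the extra height sits on that subtree's right (inner) side.
void rotateLeftThenRight(Node *&n) {
    rotateLeft(n->left);
    rotateRight(n);
}
```

Applied to the example tree rooted at 46, rotateLeftThenRight leaves 40 at the root with 32 (children 6 and 38) on its left and 46 (right child 64) on its right, matching the final diagram.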
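Here we'll try to examine just how good (or bad) we can expect our AVL tree structures to be.
Let $T_h$ be the smallest number of nodes we could have in a valid AVL tree of height $h$ (where a single node has height 0, so $T_0 = 1$ and $T_1 = 2$).
Note that this minimum occurs when one of the root's two subtrees is one level shorter than the other (heights $h-1$ and $h-2$), each with as few nodes as possible, giving $T_h = 1 + T_{h-1} + T_{h-2}$.
Let's define an extra variable, $F_h = T_h + 1$. Then:
$F_h = T_h + 1 = (1 + T_{h-1} + T_{h-2}) + 1 = (T_{h-1} + 1) + (T_{h-2} + 1) = F_{h-1} + F_{h-2}$
Thus we can note that $F_{h+2} = F_{h+1} + F_h$, and if we observe that $F_2 = 5$ and $F_3 = 8$ then this precisely defines the Fibonacci sequence!
Thus the smallest number of nodes in a valid AVL tree of height $h$ is given by a Fibonacci number minus 1. (With the usual numbering 1, 1, 2, 3, 5, 8, ... the index is shifted by three: $F_h = \mathrm{Fib}(h+3)$, so $T_h = \mathrm{Fib}(h+3) - 1$.)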
Fortunately, Binet, Euler et al. have worked out a formula for the $h$th Fibonacci number:
$$\mathrm{Fib}(h) = \frac{(1+\sqrt{5})^h - (1-\sqrt{5})^h}{2^h \sqrt{5}}$$
The optimal height for a tree of $T_h$ nodes is approximately $\log_2(T_h)$, and our actual height is $h$,
so if we take $h$ and divide it by $\log_2(T_h)$ (recalling that $T_h$ is a Fibonacci number minus 1)
then we have the ratio of our worst case AVL tree to the optimal
binary search tree!
In fact, this boils down to a ratio of approximately 1.44, i.e. our AVL trees are (at worst) of height approximately 1.44 lg(N).
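The recurrence and the ratio can also be checked numerically. The following is a small sketch, not part of the avltree class: the function names minNodes and heightRatio are hypothetical, and a leaf is assumed to have height 0 as in the code that follows.

```cpp
#include <cassert>
#include <cmath>

// Smallest number of nodes in an AVL tree of height h (a leaf has height 0),
// computed from T(h) = 1 + T(h-1) + T(h-2) with T(0) = 1 and T(1) = 2.
// The result fits in a long long for heights up to roughly 85.
long long minNodes(int h) {
    assert(h >= 0);
    long long prev = 1;   // T(0)
    long long cur = 2;    // T(1)
    if (h == 0) return prev;
    for (int i = 2; i <= h; ++i) {
        long long next = 1 + cur + prev;
        prev = cur;
        cur = next;
    }
    return cur;
}

// Ratio of the worst-case AVL height h to log2 of the node count,
// i.e. roughly how much taller than optimal the sparsest AVL tree is.
// Only meaningful for h >= 1 (for h = 0 the logarithm is zero).
double heightRatio(int h) {
    assert(h >= 1);
    return h / std::log2(static_cast<double>(minNodes(h)));
}
```

For small heights the ratio sits well below the limit (it is exactly 1 for h = 1) and it creeps up toward roughly 1.44 as h grows.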
#include <cstddef>
#include <string>
#include <iostream>
using namespace std;

class avltree {
   private:
      // each node keeps track of its left and right child,
      //    its key value, and its height
      //    (the number of nodes beneath it on the longest
      //     path to a leaf)
      struct node {
         node *right, *left;
         string key;
         int height;
      };

      // maintain a pointer to the root of the tree
      node *root;

      // private, recursive routines (used by the public methods)

      // search n's subtree for the top node with key matching k,
      node *search(string k, node *n);

      // delete all nodes in n's subtree
      void deallocate(node* &n);

      // create and insert a node with the specified key/data
      //    values within the subtree rooted at n
      bool insert(string k, string d, node* &n);

      // in n's subtree, remove the topmost node whose
      //    key matches k
      bool remove(string k, node* &n);

      // print, in sorted order, all the key values
      //    in n's subtree
      void print(node *n);

      // print the pointer structure for the subtree of n
      //    (each node's key & the keys of the nodes it points at)
      void debugprint(node *n);

      // rotate to the left through node n
      void rotate2left(node* &n);

      // rotate to the right through node n
      void rotate2right(node* &n);

      // check if n's subtree is unbalanced, and
      //    perform any necessary rotations to fix it
      void nodecheck(node* &n);

      // find the node with the smallest key in n's subtree
      node *findsmallest(node *n);

      // update the height field of node n, assuming
      //    its children's fields are up to date
      void updateheight(node *n);

   public:
      // create an empty tree
      avltree() { root = NULL; }

      // deallocate the tree
      ~avltree() { deallocate(root); }

      // display the keys in the tree (sorted)
      void display() { print(root); }

      // display the tree pointer structure
      void debug() { debugprint(root); }

      // create and insert a new node in the tree
      bool insert(string k, string d) {
         if (insert(k, d, root)) nodecheck(root);
         else return false;
         return true;
      }

      // remove the topmost node with the specified key
      bool remove(string k) {
         if (remove(k, root)) nodecheck(root);
         else return false;
         return true;
      }

      // determine if the tree contains any nodes with
      //    the specified key
      bool search(string k) {
         node *n = search(k, root);
         if (!n) return false;
         return true;
      }
};

void avltree::updateheight(node *n)
// compute the height of node n, assuming the heights
//    of n's left and right children are correct
// n's height is one greater than the height of the
//    taller of its two children
{
   // make sure n isn't null
   if (!n) return;

   // remember one or both of n's children might be null
   if ((!n->left) && (!n->right)) {
      n->height = 0;
   } else if (!n->left) {
      n->height = n->right->height + 1;
   } else if (!n->right) {
      n->height = n->left->height + 1;
   }
   // general case: both children exist,
   else {
      n->height = n->left->height + 1;
      if (n->height <= n->right->height)
         n->height = n->right->height + 1;
   }
}

avltree::node *avltree::search(string k, node *n)
// search the subtree rooted at n,
//    looking for the topmost node whose key matches k
// if a match is found return a pointer to the node,
//    otherwise return null
{
   if (!n) return NULL;
   if (n->key == k) return n;
   else if (n->key > k) return search(k, n->left);
   else return search(k, n->right);
}

void avltree::deallocate(node* &n)
// delete all nodes in the subtree rooted at n,
//    and set n to null
{
   if (!n) return;
   deallocate(n->left);
   deallocate(n->right);
   delete n;
   n = NULL;
}

avltree::node *avltree::findsmallest(node *n)
// in the subtree rooted at n,
//    find the node with the smallest key value
//    (i.e. go as far left as possible)
{
   if (!n) return NULL;
   while (n->left) n = n->left;
   return n;
}

bool avltree::insert(string k, string d, node* &n)
// insert a new node in the binary search tree rooted at n,
//    returning true if successful, false otherwise
//
// after a successful insertion below n,
//    nodecheck is called to determine if the
//    subtree rooted at n is unbalanced,
//    and to perform any reconstruction necessary
//
// note: the node struct in this listing holds only a key,
//    so the data value d is passed along but not stored
{
   // if we've found the end of a chain,
   //    insert the node here
   if (!n) {
      n = new node;
      if (!n) return false;
      n->key = k;
      n->left = NULL;
      n->right = NULL;
      n->height = 0;
      return true;
   }

   // call the insert routine recursively on either
   //    the left or right subtree,
   // checking for and performing rotations if it
   //    was successful
   if (n->key > k) {
      if (insert(k, d, n->left)) {
         nodecheck(n->left);
         return true;
      }
   } else {
      if (insert(k, d, n->right)) {
         nodecheck(n->right);
         return true;
      }
   }

   // if we get here then the recursive insert
   //    was unsuccessful
   return false;
}

bool avltree::remove(string k, node* &n)
// if the subtree rooted at n contains a node whose key
//    matches k then remove it from the subtree,
//    then check for any necessary reconstruction of the tree
// return true if an element is successfully removed,
//    or false otherwise
{
   // if n is an empty tree then give up
   if (!n) return false;

   // if the matching node must be somewhere in the left subtree
   //    then make a recursive call and check for any needed rotation
   if (n->key > k) {
      if (remove(k, n->left)) {
         nodecheck(n->left);
         return true;
      }
   }

   // if the matching node must be somewhere in the right subtree
   //    then make a recursive call and check for any needed rotation
   else if (n->key < k) {
      if (remove(k, n->right)) {
         nodecheck(n->right);
         return true;
      }
   }

   // if the current node is the one that must be removed,
   //    base the handling on how many children the node has
   else {
      // remember which node we'll actually delete
      node *victim = n;

      // if the node has no children we can simply delete it
      if ((!n->left) && (!n->right)) {
         delete n;
         n = NULL;
         return true;
      }

      // if the node has just a right child then we can
      //    bypass n (i.e. make the pointer to n point
      //    to its right child instead)
      else if (!n->left) {
         n = n->right;
         delete victim;
         return true;
      }

      // if the node has just a left child then we can
      //    bypass n (i.e. make the pointer to n point
      //    to its left child instead)
      else if (!n->right) {
         n = n->left;
         delete victim;
         return true;
      }

      // if the node has two children then we'll replace the
      //    node with the smallest node from the right subtree
      //    (basically copying the other node's key value
      //     over top of n's)
      // then make a recursive call to remove the duplicate
      //    element from the right subtree,
      // remembering to check for necessary rotation
      //    once we've altered n's subtrees
      else {
         victim = findsmallest(n->right);
         if (!victim) return false;
         string vkey = victim->key;
         if (!remove(vkey, n->right)) return false;
         n->key = vkey;
         // the recursive call only rebalanced below n->right,
         //    so check n->right itself before checking n
         nodecheck(n->right);
         nodecheck(n);
         return true;
      }
   }
   return false;
}

void avltree::print(node *n)
// display the key contents of the subtree rooted at n,
//    sorted (ascending) by key value
{
   if (!n) return;
   print(n->left);
   cout << n->key << endl;
   print(n->right);
}

void avltree::debugprint(node *n)
// display the contents and structure of the subtree rooted at n,
//    performed via preorder traversal
{
   if (!n) return;
   cout << n->key << " (";
   if (n->left) cout << n->left->key;
   else cout << "NULL";
   cout << "<-left,right->";
   if (n->right) cout << n->right->key;
   else cout << "NULL";
   cout << ")(height:" << n->height << ")" << endl;
   debugprint(n->left);
   debugprint(n->right);
}

void avltree::rotate2left(node* &n)
// rotates n's right child up, and n down to the left
//        BEFORE                 AFTER
//          N                      Y
//         / \                    / \
//        /   \                  /   \
//       X     Y                N     D
//      / \   / \              / \
//     A   B C   D            X   C
//                           / \
//                          A   B
{
   // nothing to rotate if n or its right child is missing
   if (!n || !n->right) return;

   node *tmp = n;          // remember N
   n = n->right;           // make Y the root of the subtree
   tmp->right = n->left;   // make C (Y's old left child) into N's right child
   n->left = tmp;          // make N into Y's left child
   updateheight(tmp);      // N's height has probably changed
   updateheight(n);        // Y's height has probably changed
}

void avltree::rotate2right(node* &n)
// rotates n's left child up, and n down to the right
//        BEFORE                 AFTER
//          N                      X
//         / \                    / \
//        /   \                  /   \
//       X     Y                A     N
//      / \   / \                    / \
//     A   B C   D                  B   Y
//                                     / \
//                                    C   D
{
   // nothing to rotate if n or its left child is missing
   if (!n || !n->left) return;

   node *tmp = n;          // remember N
   n = n->left;            // make X the root of the subtree
   tmp->left = n->right;   // make B (X's old right child) into N's left child
   n->right = tmp;         // make N into X's right child
   updateheight(tmp);      // N's height has probably changed
   updateheight(n);        // X's height has probably changed
}

void avltree::nodecheck(node* &n)
// determine if the subtree rooted at n has become unbalanced
//    (i.e. the height difference between the left and right
//     subtrees of n is more than 1)
// and perform any rotations necessary to reconstruct the tree
//    in a balanced form.
{
   // quit if n is an empty tree
   if (!n) return;

   // update the height field for n
   updateheight(n);

   // store the height of the left and right subtrees,
   //    treating an empty subtree as height -1 so that it
   //    is distinguishable from a single leaf (height 0)
   int leftheight = -1;
   int rightheight = -1;
   if (n->left) leftheight = n->left->height;
   if (n->right) rightheight = n->right->height;

   // quit if n is balanced (i.e. if the left and right
   //    subtree heights are within 1 of each other)
   if ((leftheight <= (rightheight+1)) &&
       (leftheight >= (rightheight-1))) return;

   // handle the cases where the right subtree is taller
   if (rightheight > leftheight) {
      int Rright = -1;
      int Rleft = -1;
      if (n->right->left) Rleft = n->right->left->height;
      if (n->right->right) Rright = n->right->right->height;

      // if Rleft is taller than Rright then we need an
      //    extra rotation
      if (Rleft > Rright) rotate2right(n->right);

      // either way we need to rotate to the left through n
      rotate2left(n);
   }

   // handle the cases where the left subtree is taller
   else {
      int Lright = -1;
      int Lleft = -1;
      if (n->left->left) Lleft = n->left->left->height;
      if (n->left->right) Lright = n->left->right->height;

      // if Lright is taller than Lleft then we need an
      //    extra rotation
      if (Lleft < Lright) rotate2left(n->left);

      // either way we need to rotate to the right through n
      rotate2right(n);
   }
}