In a heap, the largest value is always in the root of the tree, and every node contains the largest value in its own subtree.
Furthermore, the nodes in the tree are always filled from the top down, and within a level they are filled from left to right. Thus the tree is always as full as possible.
For example, the following are all valid heaps:
      9       17         36               40
     /       /  \       /  \             /  \
    8      12    4     9    20         40    40
                                      /  \
                                    10    20

Note that smaller values can go into either subtree, and that duplicates are allowed.
The following are some examples of invalid heaps:
        8          32             4       6
       / \        /  \           /       / \
      6   9      6    10        3       4   5
     / \             /         /           / \
    1   7           2         1           2   3
Heaps are not terribly effective for searching data, since we don't know which subtree to search for a specific value (just that the values in both subtrees will be less than or equal to the current node).
However, heaps are very effective for sorting a specific collection of data, or for accessing data in sorted order.
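As a rough illustration of sorting with a heap, the sketch below leans on the standard library's std::make_heap and std::sort_heap rather than the heap class developed in these notes. The function name heapsorted is made up for the example.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// hypothetical helper (not part of the course code):
// returns a copy of the data in ascending order
std::vector<std::string> heapsorted(std::vector<std::string> data)
{
    // rearrange the values so that they form a max-heap
    std::make_heap(data.begin(), data.end());

    // repeatedly move the largest remaining value to the end of the
    // range, which leaves the whole range in ascending order
    std::sort_heap(data.begin(), data.end());

    return data;
}
```

Calling heapsorted on the strings "pear", "apple", "fig" should give them back in the order "apple", "fig", "pear".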
The typical (public) heap operations are insert (which puts a new value into the heap, while maintaining a valid heap structure) and remove (which removes the root element from the heap - i.e. the largest value in the heap - and then rearranges the heap to maintain its structure).
For instance:
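A small sketch of these two operations, using the standard library's std::priority_queue as a stand-in for a heap (it behaves as a max-heap by default). The function name insertthenremove is an assumption made for this example only.

```cpp
#include <queue>
#include <vector>

// hypothetical demo: insert every value, then remove until the heap
// is empty, recording the order in which the values come out
std::vector<int> insertthenremove(const std::vector<int> &values)
{
    std::priority_queue<int> pq;      // largest value is always on top

    for (int v : values) pq.push(v);  // insert

    std::vector<int> removed;
    while (!pq.empty()) {
        removed.push_back(pq.top());  // look at the root (largest value)
        pq.pop();                     // remove it
    }
    return removed;
}
```

Inserting 18, 21, 9, 20, 10 in that order and then removing everything yields 21, 20, 18, 10, 9: the values come back largest first, regardless of insertion order.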
Of course, the trick is to efficiently maintain a valid heap structure when performing inserts and removes.
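When inserting, we put the new value in the next free position (the leftmost empty spot on the bottom level), then keep swapping it with its parent for as long as it is larger than its parent. For example, inserting 27: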
         21             21             21             27
        /  \           /  \           /  \           /  \
      18    20  =>   18    20  =>   18    27  =>   18    21
     /  \           /  \  /        /  \  /        /  \  /
    9   10         9   10 27      9   10 20      9   10 20
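The insertion shown above can be sketched in a few lines if we assume the heap is stored level by level in a std::vector (the array layout described later in these notes), so that the parent of position pos is at (pos - 1) / 2. The name heapinsert is hypothetical.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// hypothetical helper: insert value into a max-heap stored in a vector
// (root at index 0, levels stored top-down, left-to-right)
void heapinsert(std::vector<int> &h, int value)
{
    // the next free spot is simply the end of the vector
    h.push_back(value);
    std::size_t pos = h.size() - 1;

    // move the new value up while it is larger than its parent
    while (pos > 0) {
        std::size_t parent = (pos - 1) / 2;
        if (h[pos] <= h[parent]) break;   // heap property holds again
        std::swap(h[pos], h[parent]);
        pos = parent;
    }
}
```

Starting from the vector {21, 18, 20, 9, 10} and inserting 27 gives {27, 18, 21, 9, 10, 20}, matching the last tree in the diagram.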
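When removing, we copy the value out of the root, move the last value in the heap (the rightmost value on the bottom level) into the root, then keep swapping that value with the larger of its children for as long as one of them is larger than it.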
For example, suppose we do a remove from the following tree:
         32             9             16             16
        /  \           /  \           /  \           /  \
      16    10  =>   16    10  =>    9    10  =>   11    10
     /  \           /              /              /
    11   9         11             11             9

Since each of these operations just traverses one path down the tree, and since the tree is as compact as possible, we can be sure that our insert and remove operations are always O(lg(N)) for a heap of N nodes.
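A matching sketch of the remove operation, again assuming the heap is stored level by level in a std::vector with the children of position pos at 2*pos + 1 and 2*pos + 2. The name heapremove is hypothetical, and the function assumes the heap is not empty.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// hypothetical helper: remove and return the largest value from a
// non-empty max-heap stored in a vector (root at index 0)
int heapremove(std::vector<int> &h)
{
    int largest = h[0];
    h[0] = h.back();   // move the last value into the root
    h.pop_back();      // the last position is no longer in use

    // move the value down while one of its children is larger
    std::size_t pos = 0;
    while (true) {
        std::size_t left = 2 * pos + 1;
        std::size_t right = 2 * pos + 2;
        std::size_t biggest = pos;
        if (left < h.size() && h[left] > h[biggest]) biggest = left;
        if (right < h.size() && h[right] > h[biggest]) biggest = right;
        if (biggest == pos) break;       // neither child is larger
        std::swap(h[pos], h[biggest]);   // swap with the larger child
        pos = biggest;
    }
    return largest;
}
```

Starting from {32, 16, 10, 11, 9}, one call returns 32 and leaves {16, 11, 10, 9}, matching the last tree in the diagram. The loop visits at most one node per level, which is where the O(lg(N)) bound comes from.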
Array-based implementations
Because the tree is maintained as compactly as possible, and because we always fill the tree levels left-to-right, we can make effective use of an array-based implementation for heaps (as long as we know of an acceptable upper bound on the maximum size of the heap).
This implementation assumes a size for the heap can be determined when we create the heap (e.g. we specify that the heap can hold up to N elements).
The data is then stored in a dynamically-allocated array, and we keep track of both the amount of space allocated for the array and the number of elements currently stored in the heap.
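Since the heap is a form of binary tree stored level by level (with the root in position 0), we can use the following rules for accessing parents/children: the parent of the node in position i is in position (i - 1) / 2 (using integer division), its left child is in position 2*i + 1, and its right child is in position 2*i + 2.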
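As a quick sketch of this index arithmetic, the helpers below assume the root is stored in position 0 (the same convention the full implementation at the end of these notes uses). All four function names are invented for the example.

```cpp
#include <cstddef>
#include <vector>

// index arithmetic for a heap stored in an array with the root at 0
// (parentof must only be called with pos > 0)
std::size_t parentof(std::size_t pos) { return (pos - 1) / 2; }
std::size_t leftof(std::size_t pos)   { return 2 * pos + 1; }
std::size_t rightof(std::size_t pos)  { return 2 * pos + 2; }

// hypothetical checker: true if no value is larger than its parent
bool isvalidheap(const std::vector<int> &h)
{
    for (std::size_t pos = 1; pos < h.size(); pos++) {
        if (h[pos] > h[parentof(pos)]) return false;
    }
    return true;
}
```

Note that the checker only has to test the ordering: in the array layout the "filled top-down, left-to-right" shape is automatic, because the elements always occupy positions 0 through N-1 with no gaps.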
Pointer-based implementations
In the labs you'll be implementing a pointer-based heap, rather than the array-based implementation discussed here. Aside from the usual pointer-based tree issues (setting up the node structs, using private recursive methods, etc.) there is one important issue to consider: how to efficiently find the next/last free spot in the tree.
In the array version, if the heap currently has N items in it then we know the last one is in array position N-1 and the next available space is in array position N.
In a pointer-based version, we have to find the correct pointer by following links down from the root.
Fortunately, if we know the current size of the heap we can compute the chain of nodes we need to follow to find the correct insertion/removal point.
Let's assume that in our heap class we add a currentsize data field, initializing it to 0 in the constructor and incrementing or decrementing it as necessary when doing inserts and removes.
If we want to insert then we need to compute the path to position currentsize, and if we want to remove then we need to compute the path to position currentsize-1.
In essence, our algorithm will work backwards from the target node:
    #include <iostream>
    using namespace std;

    void showpath(int N)
    {
        while (N > 0) {
            // if N is even then it's a right child,
            // otherwise it's a left child
            if ((N % 2) == 1) cout << N << " is the left child of node ";
            else cout << N << " is the right child of node ";

            // compute which node is N's parent,
            // and move up to that level
            N = (N - 1) / 2;
            cout << N << endl;
        }
    }
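Since showpath works from the target back up to the root, a pointer-based heap needs the same steps in the opposite order. One possible sketch is to record the steps as a string of 'L' and 'R' characters and then follow them from the root. The node struct and the names pathto and findnode are assumptions for illustration, not the layout required in the labs.

```cpp
#include <string>

// hypothetical node layout for a pointer-based heap
struct node {
    std::string data;
    node *left;
    node *right;
};

// directions from the root to position N:
// 'L' means go to the left child, 'R' means go to the right child
std::string pathto(int N)
{
    std::string path;
    while (N > 0) {
        // odd positions are left children, even ones are right children;
        // we are working bottom-up, so each step goes on the front
        path.insert(path.begin(), (N % 2 == 1) ? 'L' : 'R');
        N = (N - 1) / 2;   // move up to the parent
    }
    return path;
}

// follow the path down from the root;
// returns nullptr if the tree has no node in position N
node *findnode(node *root, int N)
{
    node *cur = root;
    for (char step : pathto(N)) {
        if (!cur) return nullptr;
        cur = (step == 'L') ? cur->left : cur->right;
    }
    return cur;
}
```

For example, pathto(5) is "RL": position 5 is the left child of position 2, which is the right child of the root. For an insert we would typically stop one step early, at the parent of position currentsize, and attach the new node on the side given by the final step.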
A full example of an array-based heap implementation is given below.
    #include <iostream>
    #include <new>
    #include <string>
    using namespace std;

    // if the user doesn't specify how large a heap they want
    // we'll use this value
    const int DEFAULT_HEAPSIZE = 1024;

    class heap {
       private:
          // store the data,
          // the number of items the heap can hold,
          // and the number of items it currently holds
          string *hp;
          int maxsize;
          int cursize;

          // helper function to maintain heap properties after
          // an insert operation
          bool moveup(int pos);

          // helper function to maintain heap properties after
          // a remove operation
          bool movedown(int pos);

       public:
          // heap constructor and destructor
          // (the data was allocated with new[], so release it with delete[])
          heap(int sz = DEFAULT_HEAPSIZE);
          ~heap() { delete[] hp; }

          // insert one element or many
          bool insert(string s);
          int insert(string src[], int sz);

          // remove one element (the largest) or many
          bool remove(string &s);
          int remove(string dest[], int num);

          // print the elements, top-down/left-to-right
          void print();

          // look up the number of items currently in the heap
          int getsize() { return cursize; }
    };

    // attempt to insert sz elements into the heap
    // returns the number of elements successfully inserted
    int heap::insert(string src[], int sz)
    {
       int count = 0;
       for (int i = 0; i < sz; i++) {
          if (insert(src[i])) count++;
       }
       return count;
    }

    // attempt to remove num elements from the heap
    // (the next largest element each time)
    // each successful remove stores the element in
    // the passed dest array in the next available position
    // returns the number of elements successfully removed
    int heap::remove(string dest[], int num)
    {
       int count = 0;
       for (int i = 0; i < num; i++) {
          string s;
          if (remove(s)) dest[count++] = s;
       }
       return count;
    }

    // print the heap contents,
    // top-down, left-to-right
    void heap::print()
    {
       // error checking
       if ((!hp) || (maxsize < 1)) return;

       // go through the heap top-down, left-to-right,
       // i.e. just step through the array!
       for (int i = 0; i < cursize; i++) {
          cout << "(" << i << "): ";
          cout << hp[i] << endl;
       }
    }

    // allocate a heap of the specified size
    // (if the size is invalid, or there is insufficient memory
    //  then set the maximum heap size to 0)
    heap::heap(int sz)
    {
       // initially there are no elements stored in the heap
       cursize = 0;

       // try to allocate a heap of the specified size
       // (the nothrow form returns NULL on failure
       //  instead of throwing an exception)
       if (sz > 0) hp = new (nothrow) string[sz];

       // if the allocation doesn't work then treat it
       // as a heap that can't hold anything!
       else hp = NULL;
       if (!hp) maxsize = 0;
       else maxsize = sz;
    }

    // if there is sufficient space,
    // insert string s into the next heap position
    // then call moveup to restore the heap properties
    // return true if successful,
    //        false otherwise
    bool heap::insert(string s)
    {
       // error checking
       if ((cursize >= maxsize) || (!hp)) return false;

       // put the new element in the next available heap position
       // and increase the count of heap elements by one
       hp[cursize++] = s;

       // call moveup to push the new value
       // as far up the heap as it needs to go
       // to restore the heap properties
       if (!moveup(cursize - 1)) {
          cout << "ERROR: moveup failed!" << endl;
          cursize--;
          return false;
       }
       else return true;
    }

    // if the heap isn't empty,
    // copy the root value into parameter s
    // move the "last" heap element into the root
    // and call movedown to restore the heap properties
    // return true if successful,
    //        false otherwise
    bool heap::remove(string &s)
    {
       // error checking
       if ((cursize < 1) || (!hp)) return false;

       // copy the (largest) value out of the root
       s = hp[0];

       // move the "last" value in the heap into the root
       hp[0] = hp[cursize-1];

       // remove the last value from the heap
       cursize--;

       // if that was the only value then the heap is now empty
       // and there is nothing left to rearrange
       // (movedown rejects any position in an empty heap)
       if (cursize == 0) return true;

       // call movedown to push the moved value
       // as far down the heap as it needs to go
       // to restore the heap properties
       if (!movedown(0)) {
          cout << "ERROR: movedown failed!" << endl;
          cursize++;
          return false;
       }
       else return true;
    }

    // while the value in the heap at the specified position
    // is greater than the value of its parent,
    // swap the two of them
    // return true if successful,
    //        false otherwise
    bool heap::moveup(int pos)
    {
       // error checking
       if ((pos < 0) || (pos >= cursize) || (!hp)) return false;

       // keep pushing a value up the heap until
       // either we reach the root position (0)
       // or we hit larger values
       while (pos > 0) {
          // compute the position of the parent of pos
          int parent = (pos - 1) / 2;

          // if the current value is no larger than
          // its parent's value then we're done
          if (hp[pos] <= hp[parent]) return true;

          // otherwise we need to swap the two values
          // and continue up from the parent
          else {
             string tmp = hp[pos];
             hp[pos] = hp[parent];
             hp[parent] = tmp;
             pos = parent;
          }
       }

       // we hit position 0
       return true;
    }

    // while the value in the heap at the specified position
    // is less than the value of one of its children,
    // swap it with the larger of its two children
    // return true if successful,
    //        false otherwise
    bool heap::movedown(int pos)
    {
       // error checking
       if ((pos < 0) || (pos >= cursize) || (!hp)) return false;

       // keep going down the heap until we've hit a leaf
       // or the value has reached a valid position
       while (pos < cursize) {
          // compute the position of the left and right children
          int left = 2 * pos + 1;
          int right = 2 * pos + 2;

          // if we find a child with a larger value than
          // in the current heap position,
          // we'll store its string in target
          // and its position in replacement
          string target = hp[pos];
          int replacement = -1;

          // check to see if the left child (if there is one)
          // contains a larger value than the one in the
          // current position in the heap
          if ((left < cursize) && (hp[left] > hp[pos])) {
             target = hp[left];
             replacement = left;
          }

          // check to see if the right child (if there is one)
          // has an even larger value
          if ((right < cursize) && (hp[right] > target)) {
             target = hp[right];
             replacement = right;
          }

          // if we found a replacement for the current node
          // then swap them and continue moving down
          // otherwise we're done
          if (replacement >= 0) {
             string tmp = hp[pos];
             hp[pos] = hp[replacement];
             hp[replacement] = tmp;
             pos = replacement;
          }
          else return true;
       }

       // this should never be reached
       return true;
    }