Sorting by access frequency
The motivation for this form of list organization is that most searchable data sets contain some items that are accessed frequently, and many items that are accessed rarely.
If we move the frequently-accessed items to the front of the list then the most common searches will occur quickly, and only the "rare" searches will take a long time.
For this method, we include a counter with each data item to track how often it has been accessed. Every time an item is accessed we increment its counter and move the item forward in the list, passing any items that have been accessed less frequently.
New items, which have never been accessed, are inserted at the back of the list.
Note that the search operation is still worst case O(N).
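The counting method can be sketched in a few lines. This is only an illustration, assuming a std::list in place of hand-built nodes; the names Entry, freq_insert and freq_access are hypothetical and chosen just for this sketch.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

// Hypothetical record type: a key plus its access counter.
struct Entry {
    std::string key;
    int count;
};

// New items have never been accessed, so they go to the back with count 0.
void freq_insert(std::list<Entry> &items, const std::string &key) {
    items.push_back(Entry{key, 0});
}

// Search for key; on a hit, bump its counter and move it forward past
// every item whose counter is now smaller. Returns false on a miss.
bool freq_access(std::list<Entry> &items, const std::string &key) {
    for (auto it = items.begin(); it != items.end(); ++it) {
        if (it->key != key) continue;
        ++it->count;
        // Walk toward the front while the predecessor has a smaller count.
        auto dest = it;
        while (dest != items.begin() && std::prev(dest)->count < it->count)
            --dest;
        // Whatever now precedes dest has been accessed at least as often.
        assert(dest == items.begin() || std::prev(dest)->count >= it->count);
        items.splice(dest, items, it);  // O(1) relink; no-op if dest == it
        return true;
    }
    return false;
}
```

Starting from a, b, c (all never accessed), one access to c followed by two accesses to b leaves the list in the order b, c, a. The search loop still visits every node when the key is last or absent, which is the O(N) worst case noted above.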
Sorting by most-recently-accessed
The motivation for this form of list organization is that the items that have been most recently accessed are the ones that are currently being used, and are most likely to be accessed again in the near future.
For this method, each time an item is accessed it is immediately moved to the front of the list.
New items are treated as being "of current interest", and thus are inserted at the front of the list.
This technique performs worst when elements are requested in round-robin fashion, since each item is then requested just when it has sunk to the back of the list, and every search scans the whole list.
Note that the search for the node is worst case O(N), while moving the found node to the front of a doubly linked list only relinks a few pointers and is O(1).
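The round-robin worst case is easy to check by counting comparisons. The sketch below is an illustration only, assuming a std::list of string keys; mtf_access and round_robin_cost are hypothetical helper names, not part of any library.

```cpp
#include <cassert>
#include <list>
#include <string>

// Search items for key, move the match to the front, and return the
// number of comparisons made (0 if the key is not present).
int mtf_access(std::list<std::string> &items, const std::string &key) {
    int comparisons = 0;
    for (auto it = items.begin(); it != items.end(); ++it) {
        ++comparisons;
        if (*it == key) {
            // splice relinks the node in O(1); nothing is copied
            items.splice(items.begin(), items, it);
            return comparisons;
        }
    }
    return 0;
}

// Insert n keys at the front, then request them round-robin for the given
// number of full passes, returning the total number of comparisons.
int round_robin_cost(int n, int rounds) {
    assert(n > 0 && rounds > 0);
    std::list<std::string> items;
    for (int i = 0; i < n; ++i) items.push_front(std::to_string(i));
    int total = 0;
    for (int r = 0; r < rounds; ++r)
        for (int i = 0; i < n; ++i)
            total += mtf_access(items, std::to_string(i));
    return total;
}
```

With this request order every access finds its key in the last position, so round_robin_cost(n, rounds) comes to n * n * rounds comparisons. Repeated requests for a single key, by contrast, cost one comparison each after the first.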
Caches
Another widely used technique to speed up searches is to maintain two separate lists: a small list (the cache) of the items we expect to be requested soon, and the regular list.
When a request comes in we check the small list first. If we don't find the item there then we do a search of the regular list, putting the found item into the cache (possibly displacing something that used to be in the cache).
This works well if we have an effective way of picking what belongs in the cache.
The most common caching techniques are most frequently used, which keeps the items with the highest access counts in the cache, and most recently used, which keeps the items that were accessed most recently.
Which technique we decide to use will be based on what we know (or expect) about the way users will access the data in our lists. Is there a group of elements which are accessed far more than the rest (in which case choose most frequently used), or do accesses to elements tend to occur in clusters (in which case choose most recently used)?
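The two-list lookup can be sketched as follows. This is a rough illustration under stated assumptions, not a definitive design: the cache holds copies of items from the regular list, it uses the most-recently-used policy, and when it is full it displaces the entry that was used least recently. The names CachedList, Item and from_cache are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>

using Item = std::pair<std::string, std::string>;  // (key, data)

// Hypothetical two-list store: a small cache that is searched first,
// backed by the regular list.
struct CachedList {
    std::list<Item> cache;    // small list, most recently used at the front
    std::list<Item> regular;  // regular list
    std::size_t capacity;     // maximum number of cached items

    explicit CachedList(std::size_t cap) : capacity(cap) { assert(cap > 0); }

    void insert(const std::string &k, const std::string &d) {
        regular.emplace_back(k, d);
    }

    // Returns true and fills d on success; from_cache reports whether
    // the item was found in the cache or in the regular list.
    bool search(const std::string &k, std::string &d, bool &from_cache) {
        for (auto it = cache.begin(); it != cache.end(); ++it) {
            if (it->first == k) {
                d = it->second;
                cache.splice(cache.begin(), cache, it);  // keep MRU order
                from_cache = true;
                return true;
            }
        }
        from_cache = false;
        for (const Item &item : regular) {
            if (item.first == k) {
                d = item.second;
                // assumption: a full cache displaces its oldest entry
                if (cache.size() == capacity) cache.pop_back();
                cache.push_front(item);
                return true;
            }
        }
        return false;
    }
};
```

A cache hit costs at most capacity comparisons, while a miss pays for the cache scan plus the regular search, so the scheme only helps when most requests land in the cache.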
// the solist (self organizing list) class allows us to organize lists
// in two different methods
//    (1) bycount: list elements are sorted based on the number of times
//                 the element has been the target of a successful search
//    (2) tofront: each time a list element is found by a search
//                 that element is advanced to the front of the list

#include <iostream>
#include <new>
#include <string>
using namespace std;

class solist {
   private:
      // list nodes track the nodes before and after them,
      // the number of times they have been accessed,
      // and their key/data fields
      struct sonode {
         sonode *next;
         sonode *prev;
         string key;
         string data;
         int numrefs;
      } *front, *back;  // pointers to the ends of the list

   public:
      // users can specify which of the two organization methods the lists follow
      enum updatetype { bycount, tofront };

      // base constructor and destructor
      solist();
      ~solist();

      // inserts are done at the front of the list (for tofront method)
      // or the back of the list (for bycount method)
      bool insert(string k, string d, updatetype t = bycount);

      // updatenode rearranges the list after a node has been accessed
      void updatenode(sonode *n, updatetype t);

      // search finds the first matching string, copies the data field,
      // and (if successful) calls the update routine
      bool search(string k, string &d, updatetype t = tofront);

      // remove finds, removes, and deletes the first matching string
      // (after copying the data field)
      bool remove(string k, string &d);

      // display shows the entire list contents, in order
      bool display();
};

solist::solist()
{
   // initialize an empty list
   front = back = NULL;
}

solist::~solist()
{
   // deallocate each list node in turn
   sonode *current = front;
   while (current) {
      sonode *victim = current;
      current = current->next;
      delete victim;
   }
}

bool solist::insert(string k, string d, updatetype t)
{
   // reject duplicate entries (scan directly rather than calling search,
   // which would reorganize the list and bump a reference count)
   for (sonode *n = front; n; n = n->next)
      if (n->key == k) return false;

   // create the new node, quit if unable to allocate
   // (plain new throws on failure, so use the nothrow form)
   sonode *current = new (nothrow) sonode;
   if (!current) return false;

   // initialize the new node, which has not been accessed yet
   current->next = current->prev = NULL;
   current->key = k;
   current->data = d;
   current->numrefs = 0;

   // if the list is empty then make this the sole element
   if (!front) front = back = current;

   // if using the organize-by-count method
   // insert the new node at the back of the list
   else if (t == bycount) {
      current->prev = back;
      back->next = current;
      back = current;
   }

   // otherwise insert the new node at the front of the list
   else {
      current->next = front;
      front->prev = current;
      front = current;
   }

   // insertion successful
   return true;
}

bool solist::search(string k, string &d, updatetype t)
{
   // search for the first matching node
   sonode *current = front;
   while (current) {
      if (current->key == k) {
         d = current->data;
         // reorganize the list if a match was found
         updatenode(current, t);
         return true;
      }
      current = current->next;
   }
   return false;
}

bool solist::remove(string k, string &d)
{
   // search for the first matching node
   sonode *current = front;
   while (current) {
      if (current->key == k) {
         // find the nodes around the victim
         sonode *prev = current->prev;
         sonode *next = current->next;

         // reset the neighbours' pointers to bypass the victim,
         // updating front/back when the victim is at an end of the list
         // (this also empties the list if the victim is the only element)
         if (prev) prev->next = next;
         else front = next;
         if (next) next->prev = prev;
         else back = prev;

         // extract the data and deallocate the victim
         d = current->data;
         delete current;
         return true;
      }
      current = current->next;
   }
   return false;
}

bool solist::display()
{
   // display each node's key and data in turn
   sonode *current = front;
   while (current) {
      cout << current->key << ":" << current->data << " (numrefs: ";
      cout << current->numrefs << ")" << endl;
      current = current->next;
   }
   return true;
}

void solist::updatenode(sonode *n, updatetype t)
{
   // if the node is empty bail out
   if (!n) return;

   // update the node's reference count
   n->numrefs++;

   // if the node is already at the front
   // then don't bother trying to move it
   if (front == n) return;

   // keep track of the node's neighbours
   // in case we need to move it
   sonode *prev = n->prev;
   sonode *next = n->next;

   // if the list is organized by reference count then advance
   // the node to bypass everything with a smaller count
   if (t == bycount) {
      sonode *target = n;

      // find out how far n can advance
      while ((target->prev) && (target->prev->numrefs < n->numrefs))
         target = target->prev;

      // if n is already as far forward as it needs to be then quit
      if (target == n) return;

      // chop n out of its current position
      if (prev) prev->next = next;
      else front = next;
      if (next) next->prev = prev;
      else back = prev;

      // put n in front of target
      prev = target->prev;
      if (prev) prev->next = n;
      else front = n;
      n->next = target;
      n->prev = prev;
      target->prev = n;
   }

   // otherwise move the node to the front of the list
   // (prev is non-NULL here because n is not the front node)
   else {
      prev->next = next;
      if (next) next->prev = prev;
      else back = prev;
      n->prev = NULL;
      n->next = front;
      front->prev = n;
      front = n;
   }
}