CSCI 260 Fall 2010: Notes

CSCI 260 Fall 2010: Graph searches and traversals

As with any of our data structures, one of our most common operations is to explore the entire structure - either attempting to visit every node or seeking one specific node.

In the case of graphs, there are two common types of search or traversal: depth-first, and breadth-first.

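In a breadth-first traversal we start at a chosen node and first visit all nodes at distance 1 (i.e. all those adjacent to the start node).

We then move on to all nodes at distance 2 (i.e. all those that can be reached by following two edges from the start node, but not fewer).

We continue this process until every node that can be reached from the start node has been visited.

In a depth-first traversal we instead follow one path as far as it goes before backing up. To do so, we recursively explore everything reachable through one adjacent vertex before we move on to the next adjacent vertex. (In breadth-first we processed all the adjacent vertices before moving on to their descendants.)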

For example, suppose we have the following graph:

      A ------ B
      | \
      |  \
      |   \
      C---D----E

If we are starting from node A, then in a depth-first traversal we might:

explore C and its descendants
- explore D and its descendants
  - explore E and its descendants
explore B and its descendants
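
The ordering above can be reproduced with a short recursive routine. The following is only a sketch under assumptions that are not part of the notes: nodes are numbered A=0 through E=4, the graph is stored as adjacency lists, and the names explore and recursive_dfs_from_A are invented for the illustration. A's neighbours are deliberately listed as C, D, B so that C is tried first, as in the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Process node n, then fully explore each unvisited neighbour
// before moving on to the next neighbour.
void explore(const std::vector<std::vector<int>>& adj, int n,
             std::vector<bool>& visited, std::string& order) {
    assert(n >= 0 && n < (int)adj.size());
    visited[n] = true;
    order += char('A' + n);          // record the node's letter
    for (int next : adj[n]) {
        if (!visited[next]) explore(adj, next, visited, order);
    }
}

// Build the example graph (edges A-B, A-C, A-D, C-D, D-E) and
// return the letters of the nodes in the order they were explored.
std::string recursive_dfs_from_A() {
    std::vector<std::vector<int>> adj = {
        {2, 3, 1},  // A: C, D, B (C listed first on purpose)
        {0},        // B: A
        {0, 3},     // C: A, D
        {0, 2, 4},  // D: A, C, E
        {3}         // E: D
    };
    std::vector<bool> visited(adj.size(), false);
    std::string order;
    explore(adj, 0, visited, order);
    return order;
}
```

Calling recursive_dfs_from_A() yields the string "ACDEB": C, then D, then E are explored before the traversal backs up to A and moves on to B.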

If we are starting from node A, then in a breadth-first traversal we might:

explore nodes adjacent to A, i.e.
- explore B
- explore C
- explore D
explore nodes adjacent to the nodes we've already seen, i.e. those which can be reached in one step from B, C, or D and haven't been explored yet:
- explore E

For both search types, we want to avoid searching the same nodes repeatedly, but we can run into them along a variety of paths. To avoid problems, we'll keep track of whether or not we've seen a particular node and whether or not we've processed a particular node (e.g. put "visited" and "processed" flags in the node's data structure).

Depth-first search with a stack

One algorithm for conducting a depth-first search is as follows:

initialize all nodes' flags to false
maintain a stack of nodes that we're still exploring
pick the start node (e.g. A above) and set its visited flag to true and push it onto the stack
while the stack is not empty, do:
- take the top node out of the stack
- do any processing we want with the node (e.g. print the contents) and set its processed flag to true
- for each node adjacent to it:
  if its visited field is false then set its visited field to true and push it onto the stack

Note this assumes we have a stack ADT that allows us to maintain a stack of graph nodes.
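
As a rough illustration of the steps above, here is a small self-contained sketch run on the five-node example graph. The Node struct, connect(), dfs_order(), and dfs_example() are hypothetical names made up for this sketch (the course code later in these notes uses a different node layout), and "processing" a node simply records its name.

```cpp
#include <cassert>
#include <stack>
#include <string>
#include <vector>

// Hypothetical node type: a name, the two traversal flags,
// and pointers to the adjacent nodes.
struct Node {
    std::string name;
    bool visited = false;
    bool processed = false;
    std::vector<Node*> adjacent;
};

// Add an undirected edge between two nodes.
void connect(Node& a, Node& b) {
    a.adjacent.push_back(&b);
    b.adjacent.push_back(&a);
}

// Stack-based traversal from start; returns node names in the
// order in which the nodes were processed.
std::vector<std::string> dfs_order(Node* start) {
    assert(start != nullptr);
    std::vector<std::string> order;
    std::stack<Node*> s;
    start->visited = true;
    s.push(start);
    while (!s.empty()) {
        Node* n = s.top();           // look at the top node...
        s.pop();                     // ...then remove it
        order.push_back(n->name);    // "process" the node
        n->processed = true;
        for (Node* next : n->adjacent) {
            if (!next->visited) {
                next->visited = true;
                s.push(next);
            }
        }
    }
    return order;
}

// Build the example graph (edges A-B, A-C, A-D, C-D, D-E)
// and traverse it starting from A.
std::vector<std::string> dfs_example() {
    std::vector<Node> g(5);
    const char* names[] = {"A", "B", "C", "D", "E"};
    for (int i = 0; i < 5; i++) g[i].name = names[i];
    connect(g[0], g[1]);  // A-B
    connect(g[0], g[2]);  // A-C
    connect(g[0], g[3]);  // A-D
    connect(g[2], g[3]);  // C-D
    connect(g[3], g[4]);  // D-E
    return dfs_order(&g[0]);
}
```

With the edges added in this order, dfs_example() returns A, D, E, C, B. D comes out first because it was the last of A's neighbours pushed onto the stack; a different edge order would give a different, equally valid, ordering.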

Breadth-first search with a queue

One algorithm for conducting a breadth-first search is as follows:

initialize all nodes' flags to false
maintain a queue of nodes that we're still exploring
pick the start node (e.g. A above) and set its visited flag to true and put it into the queue
while the queue is not empty, do:
- take the front node out of the queue
- do any processing we want with the node (e.g. print the contents) and set its processed flag to true
- for each node adjacent to it:
  if its visited field is false then set its visited field to true and put it into the queue

Note this assumes we have a queue ADT that allows us to maintain a queue of graph nodes.
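
The queue-based steps might be sketched as follows. This is an illustration only, under assumptions not in the notes: nodes are plain integers (A=0, B=1, C=2, D=3, E=4), the graph is stored as adjacency lists, std::queue is used in place of a hand-built queue ADT, and the processed flag is left out because only the visited flag affects the ordering. The names bfs_order and example_graph are invented for the sketch.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Breadth-first traversal of a graph whose nodes are numbered
// 0..n-1; returns the node numbers in the order processed.
std::vector<int> bfs_order(const std::vector<std::vector<int>>& adj, int start) {
    assert(start >= 0 && start < (int)adj.size());
    std::vector<bool> visited(adj.size(), false);
    std::vector<int> order;
    std::queue<int> q;
    visited[start] = true;
    q.push(start);
    while (!q.empty()) {
        int n = q.front();       // take the front node out of the queue
        q.pop();
        order.push_back(n);      // "process" the node
        for (int next : adj[n]) {
            if (!visited[next]) {
                visited[next] = true;
                q.push(next);
            }
        }
    }
    return order;
}

// The example graph with A=0, B=1, C=2, D=3, E=4.
std::vector<std::vector<int>> example_graph() {
    return {
        {1, 2, 3},  // A: B, C, D
        {0},        // B: A
        {0, 3},     // C: A, D
        {0, 2, 4},  // D: A, C, E
        {3}         // E: D
    };
}
```

Calling bfs_order(example_graph(), 0) gives 0, 1, 2, 3, 4: A first, then its neighbours B, C, and D, and finally E, which is two edges away from A.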

Implementation notes

We can use the C++ standard template library for our stacks and queues (rather than building our own from scratch).

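For a double-ended queue, include the deque library, which comes with methods including push_back(), front(), pop_front(), and size().

For a stack, include the stack library, which comes with methods including push(), top(), pop(), and size(). Note that pop() removes the top item without returning it, so we call top() first to get the item.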

Since we'll want the stack or queue contents to be node pointers, the actual declarations look like:
stack<node*> varname;
deque<node*> varname;
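
To see the two containers side by side, the sketch below pushes the same three node pointers into each and records the order in which they come back out. The pared-down node struct and the two function names are hypothetical, made up for this illustration.

```cpp
#include <cassert>
#include <deque>
#include <stack>
#include <string>

// Hypothetical pared-down node, just enough for the demonstration.
struct node {
    std::string nodename;
};

// Push a, b, c onto a stack, then remove everything;
// returns the names joined in removal order.
std::string stack_removal_order(node* a, node* b, node* c) {
    assert(a && b && c);
    std::stack<node*> s;
    s.push(a);
    s.push(b);
    s.push(c);
    std::string out;
    while (s.size() > 0) {
        out += s.top()->nodename;  // top() reads the item without removing it
        s.pop();                   // pop() removes it and returns nothing
    }
    return out;
}

// Put a, b, c at the back of a deque, then remove from the front;
// returns the names joined in removal order.
std::string queue_removal_order(node* a, node* b, node* c) {
    assert(a && b && c);
    std::deque<node*> q;
    q.push_back(a);
    q.push_back(b);
    q.push_back(c);
    std::string out;
    while (q.size() > 0) {
        out += q.front()->nodename;  // front() reads the oldest item
        q.pop_front();               // pop_front() removes it
    }
    return out;
}
```

For nodes named A, B, and C, the stack version returns "CBA" (last in, first out) and the deque version returns "ABC" (first in, first out), which is exactly the difference that separates the depth-first and breadth-first algorithms above.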

The algorithms above also assume that the entire graph is connected, which is not necessarily the case. For example, suppose the graph looks like this:

     A               B
    / \              |
   C---D             E

One solution is to take the algorithms above and embed them in a do-while loop that keeps processing until every node in the graph has been marked as "processed".
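
A compact way to see the idea is to label every node with the number of the component it belongs to. The sketch below is an illustration under assumptions not in the notes: it uses a for loop over the node numbers rather than a do-while, integer-numbered nodes with adjacency lists, and a single label per node (0 meaning "not visited yet") in place of the two flags. The names label_components and disconnected_example are invented for the sketch.

```cpp
#include <cassert>
#include <stack>
#include <vector>

// Give each node the number (1, 2, ...) of the component it is in.
std::vector<int> label_components(const std::vector<std::vector<int>>& adj) {
    std::vector<int> label(adj.size(), 0);  // 0 means "not visited yet"
    int component = 0;
    for (int id = 0; id < (int)adj.size(); id++) {
        if (label[id] != 0) continue;       // already reached from an earlier start
        component++;                        // id starts a new component
        std::stack<int> s;
        label[id] = component;
        s.push(id);
        while (!s.empty()) {
            int n = s.top();
            s.pop();
            for (int next : adj[n]) {
                assert(next >= 0 && next < (int)adj.size());
                if (label[next] == 0) {
                    label[next] = component;
                    s.push(next);
                }
            }
        }
    }
    return label;
}

// The disconnected example graph with A=0, B=1, C=2, D=3, E=4.
std::vector<std::vector<int>> disconnected_example() {
    return {
        {2, 3},  // A: C, D
        {4},     // B: E
        {0, 3},  // C: A, D
        {0, 2},  // D: A, C
        {1}      // E: B
    };
}
```

For the graph above, label_components(disconnected_example()) returns 1, 2, 1, 1, 2: A, C, and D form the first component, while B and E form the second.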

An example implementation for each is shown below:

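/************** IMPLEMENTATION EXAMPLE *********************/

#include <deque>
#include <iostream>
#include <stack>
#include <string>
using namespace std;

// Note: VERTICES, Graph[], Edges[][] and print_node() are used below
//    but are assumed to be declared elsewhere in the program
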
// the contents of an individual node in the graph
// struct node {
//    int nodeid;      // the node's position in Graph[] 
//    string nodename; // text data for the application
//    float nodedata;  // numeric data for the application
//    bool visited, processed; // flags used for graph traversals
// };

// call print_node on each node in the graph,
//    using a depth-first ordering
void depth_first()
{
   // set all nodes' visited and processed flags to false
   for (int i = 0; i < VERTICES; i++) {
       if (Graph[i] != NULL) {
          Graph[i]->visited = false;
          Graph[i]->processed = false;
       }
   }

   // maintain a stack of nodes to be processed
   stack<node*> s;

   // id will track the starting node for separate
   //    graph components, in case the graph isn't
   //    fully connected
   int id = 0;
   int component = 1;

   do { // find and process the next graph component, if any

      // find the next vertex that hasn't been processed
      while ((id < VERTICES) && (Graph[id] == NULL)) id++;
      if (id >= VERTICES) return;
      if (Graph[id]->processed == true) { id++; continue; }

      // otherwise push our new starting point for exploration
      //    (after setting its visited field to true)
      cout << endl << "Processing graph component " << (component++) << endl;
      Graph[id]->visited = true;
      s.push(Graph[id]);
   
      // while the stack isn't empty
      //    take the next node off the stack
      //       print it and set its processed flag
      //    for each adjacent node
      //       if it hasn't been visited yet
      //          set its visited flag to true
      //          and push it on the stack
      while (s.size() > 0) {
         node *n = s.top();
         s.pop();
         if (!n) continue;
         int nid = n->nodeid;
         print_node(nid);
         n->processed = true;
         for (int j = 0; j < VERTICES; j++) {
             if ((Edges[nid][j] != 0) && (Graph[j] != NULL)) {
                if (Graph[j]->visited == false) {
                    Graph[j]->visited = true;
                    s.push(Graph[j]);
                }
             }
         }
      }
      
   } while (id < VERTICES);
}

// call print_node on each node in the graph,
//    using a breadth-first ordering
void breadth_first()
{
   // set all nodes' visited and processed flags to false
   for (int i = 0; i < VERTICES; i++) {
       if (Graph[i] != NULL) {
          Graph[i]->visited = false;
          Graph[i]->processed = false;
       }
   }

   // maintain a queue of nodes to be processed
   deque<node*> q;

   // id will track the starting node for separate
   //    graph components, in case the graph isn't
   //    fully connected
   int id = 0;
   int component = 1;

   do { // find and process the next graph component, if any

      // find the next vertex that hasn't been processed
      while ((id < VERTICES) && (Graph[id] == NULL)) id++;
      if (id >= VERTICES) return;
      if (Graph[id]->processed == true) { id++; continue; }

      // otherwise push our new starting point for exploration
      //    (after setting its visited field to true)
      cout << endl << "Processing graph component " << (component++) << endl;
      Graph[id]->visited = true;
      q.push_back(Graph[id]);
   
      // while the queue isn't empty
      //    take the next node out of the queue
      //       print it and set its processed flag
      //    for each adjacent node
      //       if it hasn't been visited yet
      //          set its visited flag to true
      //          and put it into the queue
      while (q.size() > 0) {
         node *n = q.front();
         q.pop_front();
         if (!n) continue;
         int nid = n->nodeid;
         print_node(nid);
         n->processed = true;
         for (int j = 0; j < VERTICES; j++) {
             if ((Edges[nid][j] != 0) && (Graph[j] != NULL)) {
                if (Graph[j]->visited == false) {
                    Graph[j]->visited = true;
                    q.push_back(Graph[j]);
                }
             }
         }
      }
   } while (id < VERTICES);
}