CSCI 260: lab exercise 1

This lab is worth 5% of your total grade, and is due before labs/lectures on Sept. 16th.
Late assignments will be penalized 20% per day late or part thereof.

There are two parts to the exercise, part 1 is a written analysis of the complexity of different algorithms, while part 2 is an experimental analysis of three different sorting programs.


Part 1: written analysis

There are four pairs of algorithms listed below.

For each pair, provided the tightest "big-O" complexity analysis you can for both algorithms under the criteria specified and answer any additional questions listed.

  1. Computing m-choose-n
    Conduct the analysis of this pair of algorithms using m and n as the problem size, and counting multiplications and divisions as the elementary operations.
    // m choose n version A
    result = 1
    for i = 1 to m
        result = result * i
    for i = 1 to n
        result = result / i
    for i = 1 to m-n
        result = result / i
    return result
    
    // m choose n version B
    a = max(n, m-n)
    b = min(n, m-n)
    result  1
    for i = 1 to b
       result = result * (a + i) / b
    return result
    

    Additional question: the intermediate values computed by the first algorithm can be much larger than those computed by the second. Identify a bound on how much larger.

  2. Computing fibonacci numbers
    Conduct the analysis of this pair of algorithms using n as the problem size, and counting recursive calls as the elementary operations.
    // fib(n) version A
    if (n < 0) return 0
    if (n < 2) return 1
    return fib(n-1) + fib(n-2)
    
    // fib(n, v=1, p=0) version B
    // (i.e. an initial call to f(n) is mapped to f(n, 1, 0))
    if (n <= 0) return p 
    if (n == 1) return v
    return fib(n-1,v+p,v)
    

  3. Computing greatest common divisors
    Conduct the analysis of this pair of algorithms using a and b as the problem size, and counting modulos as the elementary operations.
    // gcd(a, b) version A
    s = min(a, b)
    gcd = 1
    for c = 2 to s
       if (a mod c) and (b mod c) are both 0
          gcd = c
    return c
    
    // gcd(a, b) version B
    L = max(a,b)
    s = min(a, b)
    while (s > 0)
       t = s
       s = L mod s
       L = s
    return L
    

  4. Generating primes
    Conduct the analysis of this pair of algorithms using n as the problem size, and counting modulos as the elementary operations.
    // genprimes(n, arr[])  (generate all primes <= n, store in arr)
    numprimes = 0
    for c = 2 to n
        isprime = true
        for f = 2 to c
            if (c mod f) is 0
               isprime = false
        if isprime
            arr[numprimes] = c
            numprimes++
    return numprimes
    
    // genprimes(n, arr[])  (generate all primes <= n, store in arr)
    arr[0] = 2
    arr[1] = 3
    numprimes = 2
    c = 5
    while c <= n
       isprime = true
       i = 0
       f = arr[i]
       while (f*f < c) and isprime
          if (c mod f) is 0
             isprime = false
          else
             i++
             f = arr[i]
       if isprime
          arr[numprimes] = c
          numprimes++
       c += 2
    return numprimes
    
    Possibly useful factoid: the number of primes less than n approaches n/ln(n) as n approaches infinity.


Part 2: experimental analysis

The file sort.o contains implementations of three sorting functions: sort1, sort2, sort3. (Prototypes for the functions can be found in sort.h)

The program labex1.C lets the user pick and array size, makes fills 3 copies of an array of randomly generated values, then calls and times each of the three sorting routines on that array. A sample run of the program is shown below
501> labex1
Enter 0 to use random data
   or 1 to use sorted data
or anything else to use reverse-sorted data
0
Please specify the number of values to sort: 50000
Do you wish to time sort1? (y or n)
y
Approximate CPU time used by sort1 20.67 seconds
Do you wish to time sort2? (y or n)
y
Approximate CPU time used by sort2 0.02 seconds
Do you wish to time sort3? (y or n)
y
Approximate CPU time used by sort3 0.01 seconds
502> 

The three sorting algorithms are shown in the table below, sortA, sortB, sortC.
// sort A: bubblesort(arr, size) 
for (int p = 1; p < size; p++) {
    bool sorted = true
    for (int i = 1; i < size; i++) {
        if (arr[i] < arr[i-1]) {
           swap(arr[i], arr[i-1])
           sorted = false
        }
    }
    if (sorted) break
}
// sort B: combsort(arr, size)
long gap = size
bool swapped = false
while ((gap > 1) || swapped) {
   if (gap > 1) gap = (4*gap)/5
   swapped = false
   for (int i = 0; (i+gap) < size; i++) {
       if (arr[i] > arr[i+gap]) {
          swap(arr[i], arr[i+gap])
          swapped = true
       }
   }
}
// sort C: quicksort(arr, 0, size-1)
quicksort(arr[], lower, upper)
   if (upper > lower) {
      int mid = partition(arr, lower, upper)
      quicksort(arr, lower, mid-1)
      quicksort(arr, mid+1, upper)
   }

int partition(arr[], int lower, int upper)
   if (upper <= lower) return lower
   int pos, midpt
   long pivotVal = arr[lower]
   midpt = lower
   swap(arr[lower], arr[upper])
   for (pos = lower; pos < upper; pos++) {
       if (arr[pos] < pivotVal) {
          swap(arr[pos], arr[midpt])
          midpt++
       }
   }
   swap(arr[midpt], arr[upper])
   return midpt

Your task for the assignment is:

  1. Make a table of the running times of the three sorting functions on a variety of different array sizes and data arrangements (sorted, reverse-sorted, and random).

    Based on the data, estimate the growth rate (O(???)) for each of the three functions for each of the three data arrangements, and use that to decide which of the three functions are best/worst in each of the three scenarios (sorted, reversed, random) for large data sets. Clearly justify your methods and decisions.

  2. Decide which of the the three sort functions (sort1, sort2, sort3) implements which of the three algorithms (sortA, sortB, sortC) and, again, clearly justify your decisions.

To compile and run the code:
  • Copy the three files to your account: sort.o, sort.h, and labex1.C
  • Compile using the command
    g++ sort.o labex1.C -o labex1
  • Supply the array size each time you run labex1, e.g.
    ./labex1 2000


Submission

Turn in the written assignment, including all the analyses from part 1 and for part 2 be sure to include your table of recorded times/array sizes/data arrangements as well as your decisions and justifications.