CSCI 260: lab exercise 1

This lab is worth 5% of your total grade, and is due before labs/lectures on Sept. 16th.
Late assignments will be penalized 20% per day late or part thereof.

There are two parts to the exercise, part 1 is a written analysis of the complexity of different algorithms, while part 2 is an experimental analysis of three different sorting programs.

Part 1: written analysis

There are four pairs of algorithms listed below.

For each pair, provided the tightest "big-O" complexity analysis you can for both algorithms under the criteria specified and answer any additional questions listed.

Computing m-choose-n
Conduct the analysis of this pair of algorithms using m and n as the problem size, and counting multiplications and divisions as the elementary operations.

// m choose n version A
result = 1
for i = 1 to m
    result = result * i
for i = 1 to n
    result = result / i
for i = 1 to m-n
    result = result / i
return result

// m choose n version B
a = max(n, m-n)
b = min(n, m-n)
result  1
for i = 1 to b
   result = result * (a + i) / b
return result

Additional question: the intermediate values computed by the first algorithm can be much larger than those computed by the second. Identify a bound on how much larger.

Computing fibonacci numbers
Conduct the analysis of this pair of algorithms using n as the problem size, and counting recursive calls as the elementary operations.

// fib(n) version A
if (n < 0) return 0
if (n < 2) return 1
return fib(n-1) + fib(n-2)

// fib(n, v=1, p=0) version B
// (i.e. an initial call to f(n) is mapped to f(n, 1, 0))
if (n <= 0) return p 
if (n == 1) return v
return fib(n-1,v+p,v)

Computing greatest common divisors
Conduct the analysis of this pair of algorithms using a and b as the problem size, and counting modulos as the elementary operations.

// gcd(a, b) version A
s = min(a, b)
gcd = 1
for c = 2 to s
   if (a mod c) and (b mod c) are both 0
      gcd = c
return c

// gcd(a, b) version B
L = max(a,b)
s = min(a, b)
while (s > 0)
   t = s
   s = L mod s
   L = s
return L

Generating primes
Conduct the analysis of this pair of algorithms using n as the problem size, and counting modulos as the elementary operations.

// genprimes(n, arr[])  (generate all primes <= n, store in arr)
numprimes = 0
for c = 2 to n
    isprime = true
    for f = 2 to c
        if (c mod f) is 0
           isprime = false
    if isprime
        arr[numprimes] = c
        numprimes++
return numprimes

// genprimes(n, arr[])  (generate all primes <= n, store in arr)
arr[0] = 2
arr[1] = 3
numprimes = 2
c = 5
while c <= n
   isprime = true
   i = 0
   f = arr[i]
   while (f*f < c) and isprime
      if (c mod f) is 0
         isprime = false
      else
         i++
         f = arr[i]
   if isprime
      arr[numprimes] = c
      numprimes++
   c += 2
return numprimes

Possibly useful factoid: the number of primes less than n approaches n/ln(n) as n approaches infinity.

Part 2: experimental analysis

The file sort.o contains implementations of three sorting functions: sort1, sort2, sort3. (Prototypes for the functions can be found in sort.h)

The program labex1.C lets the user pick and array size, makes fills 3 copies of an array of randomly generated values, then calls and times each of the three sorting routines on that array. A sample run of the program is shown below
501> labex1 Enter 0 to use random data or 1 to use sorted data or anything else to use reverse-sorted data 0 Please specify the number of values to sort: 50000 Do you wish to time sort1? (y or n) y Approximate CPU time used by sort1 20.67 seconds Do you wish to time sort2? (y or n) y Approximate CPU time used by sort2 0.02 seconds Do you wish to time sort3? (y or n) y Approximate CPU time used by sort3 0.01 seconds 502>

The three sorting algorithms are shown in the table below, sortA, sortB, sortC.
// sort A: bubblesort(arr, size) for (int p = 1; p < size; p++) { bool sorted = true for (int i = 1; i < size; i++) { if (arr[i] < arr[i-1]) { swap(arr[i], arr[i-1]) sorted = false } } if (sorted) break }
// sort B: combsort(arr, size) long gap = size bool swapped = false while ((gap > 1) || swapped) { if (gap > 1) gap = (4*gap)/5 swapped = false for (int i = 0; (i+gap) < size; i++) { if (arr[i] > arr[i+gap]) { swap(arr[i], arr[i+gap]) swapped = true } } }
// sort C: quicksort(arr, 0, size-1) quicksort(arr[], lower, upper) if (upper > lower) { int mid = partition(arr, lower, upper) quicksort(arr, lower, mid-1) quicksort(arr, mid+1, upper) } int partition(arr[], int lower, int upper) if (upper <= lower) return lower int pos, midpt long pivotVal = arr[lower] midpt = lower swap(arr[lower], arr[upper]) for (pos = lower; pos < upper; pos++) { if (arr[pos] < pivotVal) { swap(arr[pos], arr[midpt]) midpt++ } } swap(arr[midpt], arr[upper]) return midpt

Your task for the assignment is:

Make a table of the running times of the three sorting functions on a variety of different array sizes and data arrangements (sorted, reverse-sorted, and random).
Based on the data, estimate the growth rate (O(???)) for each of the three functions for each of the three data arrangements, and use that to decide which of the three functions are best/worst in each of the three scenarios (sorted, reversed, random) for large data sets. Clearly justify your methods and decisions.
Decide which of the the three sort functions (sort1, sort2, sort3) implements which of the three algorithms (sortA, sortB, sortC) and, again, clearly justify your decisions.

To compile and run the code:

Copy the three files to your account: sort.o, sort.h, and labex1.C
Compile using the command
g++ sort.o labex1.C -o labex1
Supply the array size each time you run labex1, e.g.
./labex1 2000

Submission

Turn in the written assignment, including all the analyses from part 1 and for part 2 be sure to include your table of recorded times/array sizes/data arrangements as well as your decisions and justifications.