CSC 265 -- Exercise 2

DUE: midnight on Thursday 10 October 1996


PROBLEM SUMMARY

We have developed a Perl program, html2ascii.pl, that translates an HTML file to ASCII. Only some of the HTML tags are supported. Your task is to enhance html2ascii.pl to handle several more tags.

PROBLEM SPECIFICATION

  1. Copy the directory tree
    	~csc265/Exercises/Ex2
    
    This directory contains the source code for the existing translator (four Perl files), three test files, and the expected output for each test file when it is processed by the Perl program.
  2. Test the existing translator to see that it works.
    	html2ascii.pl < test0 > test0.act
    	cmp test0.act test0.exp
    
    There should be no differences between test0.act and test0.exp.
  3. The Perl code supports all of the HTML tags specified below except for <ul>, </ul>, <br>, and <hr>. Modify the code to support these four tags.

    You must change only the file TagRoutines.pl and you must change that file only as necessary.

HTML TAG SPECIFICATIONS

<blockquote>
outputs two linefeeds and increases indent level by one.
</blockquote>
outputs two linefeeds and decreases indent level by one.
<ol>
outputs two linefeeds, increases indent level by one, and sets label to 1.
</ol>
outputs two linefeeds and decreases indent level by one.
<ul>
outputs two linefeeds, increases indent level by one, and sets label to '*'.
</ul>
like </ol>
<li>
outputs one linefeed, reduces indent by one level, outputs the label, and increases the indent by one level. If the label (as set by the <ol> or <ul> tag above) is numeric, it is printed followed by a period and at least one space. Otherwise, the label is printed followed by at least one space. The number of spaces is the smallest that satisfies these requirements, except that the number of characters printed by <li> must not be less than the amount of indentation generated for one indent level. Finally, the label is incremented if it is numeric.
<h1>, </h1>, <h2>, </h2> ... </h6>
outputs two linefeeds.
<title>, </title>
outputs two linefeeds.
<br>, <p>
outputs one linefeed.
<pre>
outputs one linefeed, then copies subsequent lines with no indentation and no adjustment to line lengths until the </pre> tag is seen.
</pre>
outputs one linefeed.
<hr>
outputs one linefeed, outputs an unindented full line of dashes, and then one linefeed.
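
The spacing rule for <li> is the least obvious part of the specification above, so here is a small worked sketch of it as a shell function. It is only an illustration of the rule, not the code used in TagRoutines.pl. The function name li_label and the indent width of 4 characters are assumptions made for the example; the real width is whatever the existing Perl code uses. The sketch also assumes that "characters printed by <li>" counts the label, the period, and the trailing spaces, but not the linefeed. Incrementing the label is not shown.

```shell
# li_label LABEL WIDTH  (hypothetical helper, for illustration only)
# Prints LABEL, plus "." if LABEL is numeric, padded with trailing spaces.
# The result always ends in at least one space and is never shorter than
# WIDTH, the assumed number of characters in one indent level.
li_label() {
    label=$1
    width=$2
    case $label in
        ''|*[!0-9]*) text=$label ;;     # non-numeric label such as "*"
        *)           text="$label." ;;  # numeric label gets a period
    esac
    min=$(( ${#text} + 1 ))             # room for at least one space
    if [ "$min" -lt "$width" ]; then
        min=$width                      # pad up to one indent level
    fi
    printf "%-${min}s" "$text"          # left-justify, pad with spaces
}
```

With an assumed width of 4, the labels 1 and * are each padded to four characters, 10 prints as "10. " (exactly four characters), and 100 prints as "100. " (five characters, because one space must always follow the period).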

GRADING CRITERIA

Your mark will be based on the following three criteria:
  1. Does the modified program work correctly? At a minimum, the output from test0, test1, and test2 must match test0.exp, test1.exp, and test2.exp, respectively.
  2. Are your changes to the program elegant and minimal?
  3. Have you followed the instructions for submitting the assignment?
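
The first criterion can be checked for all three tests at once by repeating the commands from step 2 of the problem specification in a loop. The following is a minimal sketch, assuming it is run from the Ex2 directory and that html2ascii.pl can be invoked exactly as shown in step 2. The function name run_tests and the TRANSLATOR variable are introduced here for illustration and are not part of the supplied files.

```shell
# run_tests (hypothetical helper): run the translator on each test file
# and compare the actual output (.act) with the expected output (.exp).
# TRANSLATOR may be set to another command; it defaults to html2ascii.pl.
run_tests() {
    status=0
    for t in test0 test1 test2; do
        "${TRANSLATOR:-html2ascii.pl}" < "$t" > "$t.act"
        if cmp -s "$t.act" "$t.exp"; then
            echo "$t: ok"
        else
            echo "$t: DIFFERS"
            status=1
        fi
    done
    return $status              # 0 only if every test matched
}
```

After defining the function, type run_tests in the Ex2 directory. Any test reported as DIFFERS can then be examined with diff on its .act and .exp files.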


SUBMISSION

Your solution must be sent in a single e-mail message to csc265@gulf. When you are ready to submit, carefully follow the instructions below:
  1. Make sure that the TagRoutines.pl file correctly identifies your team and its members in a comment beginning on the second line of the file.
  2. Change your working directory to the Ex2 directory -- which holds your new version of the TagRoutines.pl file.
  3. Execute the command:
    	mail -s 'Exercise 2 submission' csc265 < TagRoutines.pl
    
    This command will e-mail the file to user csc265.
Do not send multiple messages. All e-mail submissions after the first one will be ignored.