yarrgh token types, comments, whitespace, file extensions
By convention, yarrgh files generally use the .yar extension.
Comments begin with a # and continue to the end of the line (a newline character;
Windows-style line endings are not supported), and must be stripped out by the tokenizer.
Extra whitespace between tokens is ignored.
The supported token types are:
- keywords: program open close def integer real set read write repeat until if else
- (infix) math operators: + - * / ^ // %
- (infix) comparison operators: = != < <= > >=
- parentheses: ( )
- the semicolon: ;
- integer literals: one or more digits
- real literals: one or more digits, optionally followed by a decimal point and one or more further digits
(e.g. valid numeric literals include 99, 00123.4500, and 1.234, but a number cannot begin
or end with a decimal point)
- identifiers: start with an alphabetic character (upper or lowercase) and can be followed
by any number of digits and/or upper or lowercase alphabetic characters (e.g. aAB933eAx2 would be
a valid identifier)
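The token rules above can be sketched as a small regex-based tokenizer in Python. This is only an illustration, not the required lab implementation: the names tokenize, TOKEN_SPEC, and KEYWORDS are made up for this example, and it assumes that a digits-only literal is reported as an integer while a literal with a decimal point is reported as a real.

```python
import re

KEYWORDS = {
    "program", "open", "close", "def", "integer", "real", "set",
    "read", "write", "repeat", "until", "if", "else",
}

# Order matters: real is tried before integer, and longer operators
# (//, !=, <=, >=) are tried before their one-character prefixes.
TOKEN_SPEC = [
    ("comment",    r"#[^\n]*"),              # '#' up to the end of the line
    ("whitespace", r"\s+"),
    ("real",       r"\d+\.\d+"),             # assumption: a '.' makes it a real
    ("integer",    r"\d+"),
    ("identifier", r"[A-Za-z][A-Za-z0-9]*"),
    ("compop",     r"!=|<=|>=|=|<|>"),
    ("mathop",     r"//|[+\-*/^%]"),
    ("paren",      r"[()]"),
    ("semicolon",  r";"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (kind, text) pairs; comments and whitespace are dropped."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind in ("comment", "whitespace"):
            continue
        if kind == "identifier" and text in KEYWORDS:
            kind = "keyword"  # keywords look like identifiers, so reclassify them
        tokens.append((kind, text))
    return tokens
```

With this sketch, a literal such as 1. or .5 is rejected, because the decimal point on its own matches none of the patterns.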
yarrgh grammar rules
For lab4, a yarrgh program has the following form:
- a program: consists of the keyword program followed by a block
- a block: consists of the keyword open followed by one or more statements followed by the keyword close
- a statement: can be a variable declaration, an input statement, an output statement, an assignment
statement, or a repeat loop
- variable declarations : consist of the keyword def followed by an identifier then
a type specifier (keywords integer or real) then a semicolon
- input statements : consist of the keyword read followed by an identifier followed by a semicolon
- output statements : consist of the keyword write followed by an expression followed by a semicolon
- assignment statements : consist of the keyword set followed by an identifier
followed by an expression followed by a semicolon
- repeat loops : have the form:
repeat block until condition semicolon
- conditions: have the form
( expression comparisonoperator expression )
- if/else statements: have the form:
if condition block else block
- comparison operators: are any of the following
= != < <= > >=
- math operators: are any of the following
+ - * / // % ^
where // is integer division, % is modulo, and ^ is exponent/power.
Subexpressions can be parenthesized, e.g. (x + y) * (10 + a),
and the operators follow the usual precedence and associativity rules
- expressions: use the infix math operators (above), applied to
variables, literals, and (sub)expressions
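The grammar rules above map fairly directly onto a recursive-descent recognizer, with one method per rule. The Python sketch below is one possible shape, not the required lab solution. The names Parser and accepts are hypothetical. It assumes that if/else counts as a statement, that + and - bind loosest, then * / // %, then ^ (taken to be right-associative), and that comments have already been stripped. Its simple token splitter also silently skips characters it does not recognise.

```python
import re

# Longer operators are listed before their one-character prefixes.
TOKEN_RE = re.compile(
    r"\d+\.\d+|\d+|[A-Za-z][A-Za-z0-9]*|//|!=|<=|>=|[-+*/^%=<>();]")
KEYWORDS = {"program", "open", "close", "def", "integer", "real", "set",
            "read", "write", "repeat", "until", "if", "else"}
COMPARISONS = {"=", "!=", "<", "<=", ">", ">="}


class Parser:
    def __init__(self, source):
        self.toks = TOKEN_RE.findall(source)
        self.i = 0

    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else None

    def expect(self, text):
        if self.peek() != text:
            raise SyntaxError(f"expected {text!r}, found {self.peek()!r}")
        self.i += 1

    def identifier(self):
        tok = self.peek()
        if tok is None or not tok[0].isalpha() or tok in KEYWORDS:
            raise SyntaxError(f"expected identifier, found {tok!r}")
        self.i += 1

    def program(self):                 # program: 'program' block
        self.expect("program")
        self.block()
        if self.peek() is not None:
            raise SyntaxError("unexpected tokens after program")

    def block(self):                   # block: 'open' statement+ 'close'
        self.expect("open")
        self.statement()
        while self.peek() != "close":
            self.statement()
        self.expect("close")

    def statement(self):
        tok = self.peek()
        self.i += 1                    # consume the leading keyword
        if tok == "def":
            self.identifier()
            if self.peek() not in ("integer", "real"):
                raise SyntaxError("expected type specifier")
            self.i += 1
            self.expect(";")
        elif tok == "read":
            self.identifier()
            self.expect(";")
        elif tok == "write":
            self.expression()
            self.expect(";")
        elif tok == "set":
            self.identifier()
            self.expression()
            self.expect(";")
        elif tok == "repeat":
            self.block()
            self.expect("until")
            self.condition()
            self.expect(";")
        elif tok == "if":              # assumption: if/else is a statement
            self.condition()
            self.block()
            self.expect("else")
            self.block()
        else:
            raise SyntaxError(f"unexpected token {tok!r} at start of statement")

    def condition(self):               # '(' expression compop expression ')'
        self.expect("(")
        self.expression()
        if self.peek() not in COMPARISONS:
            raise SyntaxError("expected comparison operator")
        self.i += 1
        self.expression()
        self.expect(")")

    def expression(self):              # lowest precedence: + -
        self.term()
        while self.peek() in ("+", "-"):
            self.i += 1
            self.term()

    def term(self):                    # * / // %
        self.power()
        while self.peek() in ("*", "/", "//", "%"):
            self.i += 1
            self.power()

    def power(self):                   # ^ handled by right recursion
        self.atom()
        if self.peek() == "^":
            self.i += 1
            self.power()

    def atom(self):                    # literal, identifier, or ( expression )
        tok = self.peek()
        if tok == "(":
            self.i += 1
            self.expression()
            self.expect(")")
        elif tok is not None and tok[0].isdigit():
            self.i += 1
        else:
            self.identifier()


def accepts(source):
    """Return True if source matches the grammar sketched above."""
    try:
        Parser(source).program()
        return True
    except SyntaxError:
        return False
```

For instance, accepts("program open def n integer; read n; write n * 2; close") holds under these assumptions, while a program with an empty block or a missing semicolon is rejected.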
Declaration, type, and scope rules
Note that we won't be carrying out type and declaration checking in lab4, but they
will be a part of lab5.
- Variables must be declared before they can be used.
- All variables are locally scoped, and accessible anywhere after their point
of declaration within the current block.
- Integer and real values can be compared, and integers can freely be assigned
to reals, but warnings are generated when a real is assigned to an integer.
(This will include passing a real to an integer parameter once we introduce functions
in lab5.)
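As a rough illustration of the declaration and type rules above, the Python sketch below tracks the variables declared in a single block and reports a warning when a real is assigned to an integer. The class name BlockScope and its methods are hypothetical, and the handling of nested blocks is deliberately left out because the rules above do not spell it out.

```python
class BlockScope:
    """Tracks the declared type of each variable in one block (illustrative only)."""

    def __init__(self):
        self.types = {}

    def declare(self, name, type_name):
        if type_name not in ("integer", "real"):
            raise ValueError(f"unknown type {type_name!r}")
        self.types[name] = type_name

    def lookup(self, name):
        # Variables must be declared before they can be used.
        if name not in self.types:
            raise NameError(f"variable {name!r} used before declaration")
        return self.types[name]

    def check_assignment(self, name, value_type):
        """Return a list of warning strings (empty if the assignment is fine)."""
        target_type = self.lookup(name)
        if target_type == "integer" and value_type == "real":
            return [f"warning: real value assigned to integer variable {name!r}"]
        return []  # integer to real, or matching types, is allowed silently


scope = BlockScope()
scope.declare("count", "integer")
scope.declare("total", "real")
print(scope.check_assignment("total", "integer"))  # no warnings expected
print(scope.check_assignment("count", "real"))     # one warning expected
```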