CSC 173, Fall 2002

Assignment 4: Parsing


New due date!!!! Friday, Nov 15, 11:59pm.

Continuing in the vein of Assignment 3, you are to build a more sophisticated program to format Java programs as HTML. As before, keywords should appear in bold face black, literals (numbers, character and string constants, and the words true, false, and null) should appear in bold face green, and comments should appear in black italics. In addition

Initial declarations of identifiers include class names, local variable names, formal parameter names (including parameters to catch blocks), field names, method names, and constructor names. For example:

      static class foo { ...
  
      const int pi = 3.1416;
  
      my_type my_method(int a, my_class b) ...
  

Note that identifiers should remain normal face black in all contexts other than their definition. In the example above, my_type and my_class are not printed in red.

Indentation and spacing rules are described in more detail on a SEPARATE PAGE.

I am providing source code for a significant fraction of the project; your task is to extend this source to produce a working program. (For those who prefer, the source code is available as a gzip-ed tar file.) This source code includes a solution to Assignment 3. Rather than focus on the scanner, however, you will spend most of your time in file parser.c, which contains a recursive descent parser for Java.

As in Assignment 3, I have simplified the code somewhat by "covering" the language--accepting constructs that are not really valid Java--and you may continue in this vein. Your program should work correctly when given valid Java as input.

On companion pages you can find the LL(1) grammar on which the current code is based, together with the full grammar you are to implement. In both cases you can also find the FIRST sets for all non-terminals in the grammar. These should help you to understand the case labels in the current code, and to create the labels in the new code you will write:

The official grammar from the Java reference manual is also available on-line if you feel the need to double-check something. Note that this grammar is designed for "bottom-up" parsing in the style of yacc and bison; it won't work directly for recursive descent.

Note that you are required to retain the recursive structure of the parser; ad-hoc approaches are not allowed for this assignment.

DUE DATE:

NO EXTENSIONS.

Extra Credit Suggestions

For extra credit (to be used for grade adjustment at the end of the semester); you might consider the following options: