Monday September 23. 11:59 pm
NO EXTENSIONS.
You are to design and build a simple database system in Java. Your system must support databases with a variety of schemes. We will make several simplifying assumptions to keep the project managable:
You should begin by creating a Java class (ADT) to represent a
relation. Your relation
class should have two
constructors. The first will create an empty relation with a given
number of columns. The second will take a second argument that
specifies a file name from which initial data for the relation should
be read. You may assume that tuples in the file are separated by
newline characters and that fields (attributes) within a tuple are
separated by tabs. You should deal gracefully with invalid file
contents (e.g. a line with too few or too many attributes), printing
an appropriate error message and producing a well-defined result.
Internally, you may represent a relation as an array of Java strings.
In addition to the constructors, your ADT should provide the following methods:
Several of these methods will be easier to implement if your
relation
class supports the standard
Enumeration
or
Iterator
interface.
For the purposes of delete and lookup methods, a predicate is a triple consisting of
<=
,
<
,
>=
,
>
,
==
,
!=
) — to be performed lexicographically on strings
To allow a user to manipulate the database, you will need a command language and a way to refer to relations. You should maintain, internally, a mapping from single-character relation names (lower case letters will suffice) to the relation (if any) named by that character. This mapping can be a simple array. Your system should then read a sequence of commands from standard input, one command per line. Valid commands are as follows:
c a 4 filenameThis command creates a new relation a with 4 fields, and reads the data for the relation from the (optional) file specified.
a b "foo" "3" "hi, mom"This command adds a tuple into the existing relation b. The number of additional arguments after the b must equal the number of attributes in relation b.
o aThis command will print relation a created above.
p a 2 3 4 bThis command will project relation a onto 2 fields (fields numbers 3 and 4 in relation a) producing relation b.
j a b cThis command joins relations a and b on the final field of a and the initial field of b, producing relation c. If a has
n
fields and b has m
, c will have
n+m-1
.
s a 1 == 2 bThis command selects those tuples from relation a where field 1 == field 2, and produces a new relation b containing those tuples.
s a 1 != "foo" bThis command selects those tuples from relation a where field 1 != "foo", and produces a new relation b containing those tuples.
The operators allowed in the select command can be any of the comparison operators supported by your implementation of predicates.
d a 1 == 3This command deletes those tuples in relation a where field 1 == field 3.
As with the select command, the second operand of the predicate may be a string, and all of the comparison operations must be supported.
u a b cThis command creates relation c, giving it all tuples found in a or b. Relations a and b must have the same number of fields.
i a b cThis command creates relation c, giving it all tuples found in both a and b. Relations a and b must have the same number of fields.
x aThis command deletes relation a, making it unavailable for future operations. You should be certain that the Java garbage collector will remove the data (in other words, remove all references to this Relation).
Your database need not necessarily be super efficient (but see the extra credit suggestions below). You may, if you wish, store the tuples of a relation in a list, and peruse the entire list when necessary to implement relational operations.
In the interests of compact maintainable code, you are encouraged to make maximum use of the Java standard libraries, including the collections framework.
You should of course detect and handle invalid commands. You should print a helpful message, ignore the command, and continue.
If you are not familiar with the Java StringTokenizer class, take a look. It might be useful for parsing your command line or other strings.
You are encouraged to work on the CSUG systems (especially exploiting remote access) to become familiar with the environment.
That said, you may work on any platform you like, but (1) your
final code must compile and run correctly using the
javac
compiler and java
virtual machine on
the csug
Linux machines, and (2) you must hand in your
work using the appropriate turnin
script on those machines
(watch the newsgroup for details).
javac
is Sun's version 1.4.0 compiler for Linux.
The jikes
compiler from IBM is also available for the
adventurous, but again, your code must compile and run with
javac. (Note to the adventurous: to use jikes
you
must set your JIKESPATH environment variable to
".:/usr/staff/lib/java/latest/jre/lib/rt.jar"
.)
Extensive documentation for javac
is available on-line
in HTML. On the csug
machines, the root file is
/usr/staff/lib/java/latest/docs/index.html.
The TAs will shortly be creating test data that you can use to exercise your code. Watch the newsgroup.
They will also be posting instructions on
turnin
script
For extra credit, you might consider the following options:
If you do any extra credit, remember to document it. If you do not, you will likely not get credit for the work. You should also provide test data, and a description of how to use the test data to show that your extensions do what you claim.