Rollout: The SpamBGon Suite
The second and final stage of the project is the rollout of the
completed project. The deliverables for this stage are:
- BSFTrain.java and BSFTest.java
- The two
primary programs of the SpamBGon suite.
- Other Java source files
- Any other supporting code files
necessary to compile, load, and use the BSFTrain and
BSFTest programs. Note: if these programs depend on
external library code other than the Java JDK or the
gnu.getopt suite, the submission tarball MUST either include
the library whole or provide easy and explicit instructions on how and
where to access such libraries. This documentation MUST be provided
in the README.TXT file. The designer is responsible for
ensuring that all copyright and distribution conditions are adhered
to.
- README.TXT
- This file MUST describe how to compile,
configure, and install the SpamBGon suite. It MUST also list
any dependencies on additional software support libraries.
- Internal documentation
- The handin MUST also include the full,
compiled JavaDoc documentation for all Java source files in the
submission tarball. This documentation MUST include full descriptions
of every public or protected method, field, sub-class, enclosed class,
or constructor employed by the code. This documentation hierarchy
MUST be included in a sub-directory named documentation/
within the submission tarball package.
- User documentation
- The handin submission MUST include complete
user-level documentation for the SpamBGon suite. This
documentation MUST include instructions on how to use both
BSFTrain and BSFTest including the functionality of
all command-line options. The documentation MUST also describe the
function and use of any additional programs included in the
submission. User documentation MUST include information on the
expected inputs and outputs of all programs, how to read and interpret
the output, and information on all status and error messages that the
programs could produce. This documentation MUST also include at least
one example of how to run each program and how to interpret the
output. This document MUST be named USERDOC.extension, but
it MAY be a plain text, HTML, PDF, or PostScript document (with the
appropriate extension). It MUST NOT be a Microsoft Word or
other nonportable format document.
- Performance documentation
- The handin submission MUST include a
document describing the performance of the SpamBGon suite,
including its ability to differentiate SPAM from NORMAL email under
different amounts of TRAINING data and under different tokenizers
(including a small range of reasonable parameters for each
parameterized tokenizer). This document MUST also include the
designer's assessment of which tokenizer is superior and why or, if
different tokenizers are superior under different conditions, what
conditions are important to the success of each. The designer MAY
choose any tests that she or he desires to establish the performance
of her/his SpamBGon suite, but MUST describe all tests and
why they lead to the stated conclusions about performance. Finally,
this document MUST include the designer's assessment of how to improve
the performance of the system (e.g., what other kind of tokenizer
might be helpful, how to change the probability equations to improve
accuracy, etc.). This document MUST be named
PERFORMANCE.extension, but it MAY be a plain text, HTML, PDF,
or PostScript document (with the appropriate extension). It
MUST NOT be a Microsoft Word or other nonportable format document.
- Test cases
- The submission tarball MUST include a subdirectory
named tests/ that includes all of the test data used to
demonstrate the performance of the SpamBGon suite.
- CVS log file(s)
- For each Java source file, the submission
tarball MUST include a corresponding .log file containing the
CVS log for that source file.
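As one possible way to produce the numbers that the performance
documentation asks for, the sketch below tallies classifier decisions
for a single test run and derives accuracy and error rates from them.
The class name PerformanceTally and all of its methods and fields are
hypothetical and are not part of the required SpamBGon interface; the
counts used in main are made-up placeholders, not measured results. The
comments also illustrate the JavaDoc coverage required for public and
protected members.

```java
/**
 * Tallies classification outcomes for one test run.
 * Hypothetical helper for illustration; not part of the required suite.
 */
class PerformanceTally {
    /** Number of SPAM messages that were labeled SPAM. */
    protected int spamAsSpam;
    /** Number of SPAM messages that were labeled NORMAL. */
    protected int spamAsNormal;
    /** Number of NORMAL messages that were labeled SPAM. */
    protected int normalAsSpam;
    /** Number of NORMAL messages that were labeled NORMAL. */
    protected int normalAsNormal;

    /** Creates an empty tally with all counts at zero. */
    public PerformanceTally() { }

    /**
     * Records the outcome for one classified message.
     *
     * @param actuallySpam  true if the message's true label is SPAM
     * @param predictedSpam true if the classifier labeled it SPAM
     */
    public void record(boolean actuallySpam, boolean predictedSpam) {
        if (actuallySpam && predictedSpam) {
            spamAsSpam++;
        } else if (actuallySpam) {
            spamAsNormal++;
        } else if (predictedSpam) {
            normalAsSpam++;
        } else {
            normalAsNormal++;
        }
    }

    /** @return the total number of messages recorded so far */
    public int total() {
        return spamAsSpam + spamAsNormal + normalAsSpam + normalAsNormal;
    }

    /** @return fraction of messages labeled correctly, or NaN if none recorded */
    public double accuracy() {
        int n = total();
        return n == 0 ? Double.NaN : (spamAsSpam + normalAsNormal) / (double) n;
    }

    /** @return fraction of NORMAL messages labeled SPAM, or NaN if no NORMAL seen */
    public double falsePositiveRate() {
        int normal = normalAsSpam + normalAsNormal;
        return normal == 0 ? Double.NaN : normalAsSpam / (double) normal;
    }

    /** @return fraction of SPAM messages labeled NORMAL, or NaN if no SPAM seen */
    public double falseNegativeRate() {
        int spam = spamAsSpam + spamAsNormal;
        return spam == 0 ? Double.NaN : spamAsNormal / (double) spam;
    }

    /**
     * Demonstrates the tally on placeholder outcomes (not real results).
     *
     * @param args unused
     */
    public static void main(String[] args) {
        PerformanceTally tally = new PerformanceTally();
        tally.record(true, true);    // SPAM caught
        tally.record(true, false);   // SPAM missed
        tally.record(false, false);  // NORMAL passed
        tally.record(false, true);   // NORMAL wrongly flagged
        System.out.println("accuracy            = " + tally.accuracy());
        System.out.println("false positive rate = " + tally.falsePositiveRate());
        System.out.println("false negative rate = " + tally.falseNegativeRate());
    }
}
```

Repeating such a tally for each tokenizer and each amount of TRAINING
data gives one row per configuration for the PERFORMANCE document.
Reporting the false positive rate separately from overall accuracy is a
design choice made here on the assumption that flagging NORMAL mail as
SPAM is the more costly error; the assignment itself leaves the choice
of tests to the designer.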
At the programmer's option, this submission MAY also include:
- BUGS.TXT
- This file documents any known outstanding
bugs, missing features, performance problems, or failures to meet
specifications of your submission. Note that the penalty for such
problems will be smaller if they're fully documented here than if the
instructors discover them independently.
Note that if the MondoHashTable code was not fully functional
for Milestone 1, a revised version MAY be submitted in this handin.
If MondoHashTable has been revised for this version, this
submission tarball MUST include the necessary supporting documentation
described under Milestone 1, as well as notes describing the added
functionality/improvements between Milestone 1 and this handin.
Terran Lane
2004-01-26