Quantitative Requirements

This section describes the performance and behavior requirements for the Moogle software suite.

  1. All of the elements of the MSpider Safety Requirements (Section 4.2) are included here, by reference.
  2. All programs MUST NOT crash, core dump, dump a stack trace, or throw an exception on any input.
  3. In the case of a RECOVERABLE ERROR, a program MUST issue a warning statement and continue processing. The program MAY choose to issue the warning statement to standard error, to a log file, or to a user interface element. If the warning is issued to a log file, the log file name and location MUST be a user-specifiable parameter to the program (either by a command-line option or via a configuration menu).
  4. In the case of an UNRECOVERABLE ERROR, a program MUST issue an error statement and terminate with a non-zero exit status. The program MAY use different exit codes to indicate different error conditions, but such codes MUST be documented in the user manual. The error message MUST be written to the same destination as warning messages (from RECOVERABLE ERRORs).
  5. In the case of any ERROR, a program MUST NOT delete, corrupt, or damage existing web database files or any other ``stateful'' files employed by the program suite.
  6. The MondoHashMap.java module MUST NOT use or reference the Hashtable, HashMap, AbstractMap, HashSet, or TreeSet classes, or any of their subclasses.
  7. For (substantially) reduced credit, the Moogle suite MAY use the HashMap class in place of MondoHashMap. Note that this allowance exists only as an aid in case the designer has difficulty getting MondoHashMap to work properly; for full credit the entire Moogle suite MUST employ MondoHashMap and MUST NOT employ or refer to any of the classes listed in the previous item.
  8. The entire program suite MUST NOT employ or refer to the StreamTokenizer class or any element of java.util.regex, including indirect references to it via, e.g., String.split().
  9. The programs MAY provide additional output for debugging purposes, but such output MUST be disabled by default. Any program MAY provide a command-line switch or a user-interface configuration utility to enable debugging support when desired.
  10. The Moogle suite MAY use the gnu.getopt.Getopt and gnu.getopt.LongOpt classes to assist in handling command-line options.
  11. The designer MAY ask permission of the instructor or the TA to use any classes outside the JDK that have not already been mentioned. The final programs MUST NOT use any class outside the JDK that has not been explicitly allowed.
  12. The Moogle suite MAY assume that all valid user input is standard ASCII text in the range (char)0-(char)127, inclusive. If a program encounters a character outside this range, it MAY treat it as a RECOVERABLE or UNRECOVERABLE ERROR or silently ignore it. If such characters are treated as RECOVERABLE or ignored, they MUST NOT disrupt the otherwise normal functioning of the program.
  13. All programs MUST NOT assume that all user input is validly structured search statements. If a program encounters syntactically erroneous input (e.g., punctuation, or multiple ``AND''s or ``OR''s in sequence) it MAY produce a RECOVERABLE or UNRECOVERABLE ERROR, but it MUST NOT crash, corrupt the web database file, etc.
  14. All programs MUST NOT assume that all web references point to syntactically valid HTML pages. The designer MAY use the tools in java.net and javax.swing.text.html to help determine which pages are both HTML and syntactically valid. All non-HTML content MAY be ignored by Moogle, though the designer MAY choose to handle it in some reasonable way. All syntactically invalid HTML content MAY also be silently ignored, though the designer MAY choose to attempt to recover from a failed parse and continue to analyze the page.

  15. The MSpider engine MUST require no more than amortized $O(n)$ time to analyze a single web page of size $n$ WORDs. It MUST also require no more than $O(k)$ HTTP queries to retrieve a web hierarchy of $k$ documents.
  16. The Moogle client MUST run in $O(m\cdot n)$ time for a query of $m$ words, each of which is associated with $n$ web documents.
  17. The MondoHashMap MUST support get(), put(), remove(), size(), and isEmpty() in amortized $O(1)$ time. The table MAY support key/value iteration in time proportional to the capacity of the table. For extra credit, it MAY support key/value iteration in time proportional to the number of keys/values (respectively). To receive the extra credit, the designer MUST demonstrate this convincingly in the performance documentation.
  18. The MondoHashMap MUST NOT consume more than $O(n\cdot s)$ memory for $n$ distinct keys, where $s$ represents the combined size of a key/value pair.
  19. The MondoHashMap MUST support the keySet() and values() operations with only $O(1)$ space above that required by the hashtable itself. Specifically, these operations MUST NOT replicate the underlying hashtable, nor duplicate any keys or values.
  20. All user documentation MUST be grammatically correct and include correct spelling and usage. (You will be graded, in part, on the quality of your writing.)
  21. The designer MUST document any areas in which her or his software suite does not meet this specification. WARNING! The grade penalty will be higher if the instructors discover an undocumented program shortcoming or bug than if it is documented up front.
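
As an illustration of requirement 8, the following is a minimal sketch of splitting text into words by hand, without StreamTokenizer, java.util.regex, or String.split(). The class name WordSplitter and its methods are hypothetical, and the sketch assumes that a word is a run of ASCII letters and digits and that words are folded to lower case; this section does not define WORD, so adjust to the definition given elsewhere in the specification. Every other character, including any character outside the ASCII range, simply ends the current word.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits text into words without regex or StreamTokenizer. */
final class WordSplitter {
    private WordSplitter() {}

    /** Assumption: a word character is an ASCII letter or digit. */
    static boolean isWordChar(char c) {
        return (c >= 'a' && c <= 'z')
            || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9');
    }

    /** Single left-to-right pass; each character is examined exactly once. */
    static List<String> splitWords(String text) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isWordChar(c)) {
                // Assumption: words are case-insensitive, so fold to lower case.
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // Any non-word character closes the word in progress.
                words.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            words.add(current.toString());
        }
        return words;
    }
}
```

Because the loop visits each character once and appends to a StringBuilder, the pass is linear in the length of the input, which is in the spirit of the time bound in requirement 15.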

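Requirement 19 can be met by returning a view of the table rather than a copy. The sketch below shows one possible shape, under loud assumptions: TinyTable is a hypothetical, fixed-capacity, linear-probing table (no resizing, no remove(), non-null keys only) and is not a full MondoHashMap. It also assumes that java.util.AbstractSet is permitted; AbstractSet is not among the classes named in requirement 6, but the designer should confirm this with the instructor.

```java
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Set;

/** Hypothetical fixed-capacity open-addressing table; not a full MondoHashMap. */
final class TinyTable<K, V> {
    private final Object[] keys;
    private final Object[] vals;
    private int size;

    TinyTable(int capacity) {
        keys = new Object[capacity];
        vals = new Object[capacity];
    }

    /** Linear probing; assumes non-null keys and that the table never fills. */
    void put(K key, V value) {
        int i = (key.hashCode() & 0x7fffffff) % keys.length;
        while (keys[i] != null && !keys[i].equals(key)) {
            i = (i + 1) % keys.length;
        }
        if (keys[i] == null) {
            keys[i] = key;
            size++;
        }
        vals[i] = value;
    }

    /** Returns a live view: no keys are copied; each iterator holds one cursor. */
    Set<K> keySet() {
        return new AbstractSet<K>() {
            @Override public int size() {
                return TinyTable.this.size;
            }

            @Override public Iterator<K> iterator() {
                return new Iterator<K>() {
                    private int cursor = advance(0);

                    // Skips empty slots, starting at index 'from'.
                    private int advance(int from) {
                        while (from < keys.length && keys[from] == null) {
                            from++;
                        }
                        return from;
                    }

                    @Override public boolean hasNext() {
                        return cursor < keys.length;
                    }

                    @SuppressWarnings("unchecked")
                    @Override public K next() {
                        if (!hasNext()) {
                            throw new NoSuchElementException();
                        }
                        K k = (K) keys[cursor];
                        cursor = advance(cursor + 1);
                        return k;
                    }
                };
            }
        };
    }
}
```

The view reads the table's own arrays, so entries added after keySet() is called are visible through it. Iteration here walks every slot and therefore takes time proportional to the capacity, which is the baseline allowed by requirement 17; the extra-credit bound would need a different layout, such as linking occupied slots together.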
Terran Lane 2005-02-14