Common Options

Both client programs MUST support the following command-line options:

-m modelFileName
Load/save SPAM/NORMAL statistical models from/to the specified filename. If BSFTrain saves all statistical models to a single file, the filename should be modelFileName.stat; if it uses separate files for SPAM and NORMAL statistics, their respective file names should be modelFileName.sstat and modelFileName.nstat.
-k tokenizerName
This selects the type of tokenizer to be used in the current analysis. The program MAY also require additional command-line arguments specific to each tokenizer (e.g., the NGramTokenizer requires that the parameter n be set). Failure to provide a necessary tokenizer-specific option MAY be treated as a RECOVERABLE or UNRECOVERABLE ERROR.
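
As an illustration only, the two options and the file-naming rule could be handled along the following lines. This is a minimal sketch in Python, not a required implementation: the flag name -n for the NGramTokenizer parameter, the function names, and the choice to raise ValueError are all assumptions made for the example. The specification leaves it to each program whether a missing tokenizer-specific option is RECOVERABLE or UNRECOVERABLE, so the sketch simply reports the problem to its caller.

```python
import argparse

# Tokenizers that need an extra parameter. Only the NGramTokenizer
# requirement comes from the specification; the set itself is illustrative.
TOKENIZERS_NEEDING_N = {"NGramTokenizer"}


def build_parser():
    """Build a parser for the common options -m and -k."""
    parser = argparse.ArgumentParser(description="Sketch of the common options")
    parser.add_argument("-m", dest="model_file_name",
                        help="base name of the saved statistics file(s)")
    parser.add_argument("-k", dest="tokenizer_name",
                        help="type of tokenizer to use")
    # ASSUMPTION: "-n" is a hypothetical flag name; the specification only
    # says that the NGramTokenizer needs the parameter n to be set.
    parser.add_argument("-n", dest="ngram_size", type=int, default=None,
                        help="n for the NGramTokenizer (assumed flag)")
    return parser


def model_paths(model_file_name, single_file=True):
    """Return the statistics file name(s) derived from the -m argument."""
    if single_file:
        return [model_file_name + ".stat"]
    # Separate files: SPAM statistics first, then NORMAL statistics.
    return [model_file_name + ".sstat", model_file_name + ".nstat"]


def parse_common_options(argv):
    """Parse argv and check the tokenizer-specific requirement."""
    args = build_parser().parse_args(argv)
    if args.tokenizer_name in TOKENIZERS_NEEDING_N and args.ngram_size is None:
        # The caller decides whether this is RECOVERABLE or UNRECOVERABLE.
        raise ValueError(args.tokenizer_name + " requires the parameter n")
    return args
```

For example, parse_common_options(["-m", "mail", "-k", "NGramTokenizer", "-n", "3"]) succeeds, and model_paths("mail", single_file=False) yields mail.sstat and mail.nstat.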

If a program detects a discrepancy between the type of tokenizer specified by -k and the type of tokenizer previously used to construct the saved statistics tables (if any), it MAY choose to interpret this as an UNRECOVERABLE or (if possible) a RECOVERABLE ERROR. It MUST NOT crash or produce incorrect results because of this condition. It MUST NOT destroy, corrupt, or overwrite previously existing saved statistics file(s).
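
One way to meet these requirements is to check the saved tokenizer type before any file is opened for writing. The sketch below is in Python and rests on an assumption the specification does not make: that the statistics file records the tokenizer name on its first line. The names TokenizerMismatchError, check_tokenizer, and save_statistics are hypothetical, as is the tokenizer name used in the usage note.

```python
import os
import tempfile


class TokenizerMismatchError(Exception):
    """Raised when -k disagrees with the tokenizer stored in the saved file."""


def read_saved_tokenizer(path):
    """Return the tokenizer name stored in the file, or None if no file exists."""
    if not os.path.exists(path):
        return None
    # ASSUMPTION: the first line of the statistics file holds the tokenizer name.
    with open(path, "r", encoding="utf-8") as f:
        return f.readline().strip()


def check_tokenizer(path, requested):
    """Raise TokenizerMismatchError if the saved tokenizer differs from requested."""
    saved = read_saved_tokenizer(path)
    if saved is not None and saved != requested:
        raise TokenizerMismatchError(
            "saved statistics use " + saved + ", but -k requested " + requested)


def save_statistics(path, tokenizer_name, body):
    """Write the statistics only after the tokenizer check has passed."""
    # Check first, so a mismatch never touches the existing file.
    check_tokenizer(path, tokenizer_name)
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file and rename it into place, so a failure
    # part-way through cannot leave a corrupted statistics file behind.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(tokenizer_name + "\n")
        f.write(body)
    os.replace(tmp_path, path)
```

With this layout, a call to save_statistics with a different tokenizer name (say, a hypothetical WhitespaceTokenizer) raises TokenizerMismatchError and leaves the existing file byte-for-byte unchanged; the caller can then report the condition as a RECOVERABLE or UNRECOVERABLE ERROR.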



Terran Lane 2004-01-26