For grading purposes the Moogle client program MUST provide a query trace file mechanism. This mechanism will write the query, parse tree, and query result to a file for later manual and automatic analysis. This information MUST be recorded for every query issued by the user. This mechanism is in addition to the UI display of results described previously. The query trace file is meant primarily for machine rather than human consumption, so the file format is designed to be simple to read and write programmatically rather than easy for a person to read.
The Moogle client program MAY allow the user to select a query trace file or MAY use a fixed file name. If a fixed file name is used, it MUST be ``./[yourname].moogQuery.dat'', where [yourname] stands for your last name followed by your first initial. E.g., the course TA's program might produce a file named ./brownj.moogQuery.dat.
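The fixed-name rule can be sketched as a small helper. This is only an illustration: the function name trace_file_name is hypothetical, and lower-casing the name is an assumption inferred from the brownj example rather than something the rule states.

```python
def trace_file_name(last_name: str, first_name: str) -> str:
    """Build the fixed query trace file name: last name + first initial."""
    # Lower-casing is an assumption taken from the ./brownj.moogQuery.dat example.
    your_name = (last_name + first_name[:1]).lower()
    return f"./{your_name}.moogQuery.dat"


if __name__ == "__main__":
    print(trace_file_name("Brown", "J"))
```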
While the designer is free to choose many aspects of the UI, the Moogle client MUST produce the query trace file in exactly the following format.
The format of the query trace file is:
    ===== BEGIN QUERY =====
    [query info]
    ===== END QUERY =====

One ``QUERY'' entry MUST be present for each query issued by the user.
Each QUERY entry is formatted as follows:
    === BEGIN QUERY TEXT ===
    [query text string, as typed in by user]
    === END QUERY TEXT ===
    === BEGIN QUERY PARSE ===
    [query parse]
    === END QUERY PARSE ===
    === QUERY RESULTS ===
    [query results]
    === END QUERY RESULTS ===
The QUERY TEXT STRING field MUST report the text string exactly as typed by the user. No whitespace, punctuation, etc. should be trimmed or added.
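A minimal sketch of writing one QUERY entry follows. It assumes each delimiter sits on its own line and that entries are appended to the trace file in the order the queries are issued; both function names are hypothetical. The delimiter strings are copied from the format above, including the results opener, which carries no BEGIN. The linked example file remains the authority on the exact line layout.

```python
def format_query_entry(query_text, parse_text, result_lines):
    """Return one complete QUERY entry as a newline-terminated string."""
    lines = [
        "===== BEGIN QUERY =====",
        "=== BEGIN QUERY TEXT ===",
        query_text,  # written verbatim: no strip(), no added punctuation
        "=== END QUERY TEXT ===",
        "=== BEGIN QUERY PARSE ===",
        parse_text,
        "=== END QUERY PARSE ===",
        "=== QUERY RESULTS ===",
        *result_lines,  # one already-formatted "url score" string per line
        "=== END QUERY RESULTS ===",
        "===== END QUERY =====",
    ]
    return "\n".join(lines) + "\n"


def append_query_entry(path, query_text, parse_text, result_lines):
    """Append one entry so that every query issued ends up in the file."""
    with open(path, "a", encoding="utf-8") as trace:
        trace.write(format_query_entry(query_text, parse_text, result_lines))
```

Passing the user's string straight through, without calling strip() on it, is what keeps the QUERY TEXT field identical to what was typed.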
The QUERY PARSE MUST be formatted according to the following rules:
"word" with no additional whitespace. The WORD string MUST be printed in lower-case.
(AND [first query term] [second query term]) with no whitespace beyond the single spaces separating AND and the first query term and between the first and second query term. Note that the query terms themselves may be WORDs or may be additional parenthesized phrases.
(OR [first query term] [second query term]) with no whitespace beyond the single spaces separating OR and the first query term and between the first and second query term. Note that the query terms themselves may be WORDs or may be additional parenthesized phrases.
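The three rules above lend themselves to a short recursive formatter. The sketch below assumes a hypothetical parse-tree representation (a plain string for a WORD, a 3-tuple such as ("AND", first, second) for an operator) and reads the WORD rule as printing the word inside double quotes; a real client's parse tree classes will differ.

```python
def format_parse(node):
    """Render a parse tree in the QUERY PARSE format."""
    if isinstance(node, str):
        # WORD: lower-cased, wrapped in double quotes, no extra whitespace.
        return f'"{node.lower()}"'
    operator, first, second = node
    if operator not in ("AND", "OR"):
        raise ValueError(f"unknown operator: {operator!r}")
    # Exactly one space after the operator and one between the two terms.
    return f"({operator} {format_parse(first)} {format_parse(second)})"


if __name__ == "__main__":
    tree = ("AND", "Cats", ("OR", "dogs", "Birds"))
    print(format_parse(tree))  # prints (AND "cats" (OR "dogs" "birds"))
```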
The QUERY RESULTS field MUST report each URL returned in response to the query, sorted by TF/IDF score, together with the score itself. One URL/score pair MUST be reported per line. Each line is formatted as follows:
[url string] [TF/IDF score, to 2 digits of precision]
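One possible way to produce the result lines is sketched below. Several details are assumptions rather than statements from this specification: results are sorted with the highest score first, ``2 digits of precision'' is read as two digits after the decimal point, and the URL and score are separated by a single space. The function name format_results is hypothetical.

```python
def format_results(scored_urls):
    """Turn (url, score) pairs into one 'url score' string per result."""
    # Assumption: best match first, i.e. descending TF/IDF score.
    ranked = sorted(scored_urls, key=lambda pair: pair[1], reverse=True)
    # Assumption: two digits after the decimal point, single-space separator.
    return [f"{url} {score:.2f}" for url, score in ranked]


if __name__ == "__main__":
    hits = [("http://example.com/a", 0.5), ("http://example.com/b", 1.25)]
    for line in format_results(hits):
        print(line)
```

Comparing the output against the linked example file is the safest way to confirm the sort direction and number format.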
An example of such a file is available at http://www.cs.unm.edu/~terran/classes/cs351-s05/projects/p1/query_trace_example.dat
Terran Lane 2005-02-14