((C)) Key concepts of parsing and executing the SiteRiter SDL language

-----------
((C.0)) Table of contents

(C.1) SiteRiter SDL parsing
(C.2) Symbols and the symbol table
(C.3) Page generation and symbol expansion
(C.4) SiteRiter web server standard hit processing
(C.5) ROBOTS.TXT processing
(C.6) Reload processing
(C.7) Secondary start symbols
(C.8) Examples
(C.9) Error handling

-----------
((C.1)) SiteRiter SDL parsing

((C.1.1)) SiteRiter SDL parsing occurs by repeatedly calling the lexer, which sequentially reads chars from a Reader and produces a sequence of tokens that are processed according to the SDL grammar rules and semantics. As grammar components are recognized, internal state (C.2) is accumulated to represent the parsed input.

((C.1.1.1)) In this spec, when describing the SiteRiter SDL parsing, all references to a 'character' or 'single character' or 'char' in contexts such as (C.1.1) are to be interpreted as referring specifically to a single 16-bit Java code unit -- a single Unicode code point in the basic multilingual plane, represented in the UTF-16 encoding.

((C.1.2)) For maximum reusability, the input to lexical analysis is a java.io.Reader, nothing more specific than that.

((C.1.3)) As far as possible the basic SiteRiter lexing and parsing methods must not assume their input is coming from or going to any specific place, and instead they must be told, by their caller, where to read from.

-----------
((C.2)) Symbols and the symbol table

((C.2.1)) At a high level, the SDL parsing process can be viewed as recognizing a series of rules. Each rule has a name, a choice, and potentially some 'modifiers' such as a 'selector' or a 'start tag'.

((C.2.2)) Every instance of a class that implements SDLParser must be able to load any valid SiteRiter grammar, and then be able to produce, on demand (see (C.3) for details), pages that are syntactically valid according to the loaded grammar.
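To make the 'char' convention of (C.1.1.1) and the Reader-only input of (C.1.2) and (C.1.3) concrete, here is a minimal Java sketch. It is an illustration only: the class name CodeUnitCounter and the method countChars are hypothetical and not part of the SiteRiter API. It shows that the caller decides where the input comes from, and that a Reader hands back one 16-bit code unit per read() call, so a code point outside the basic multilingual plane arrives as two chars.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical helper, for illustration only; not part of the SiteRiter API. */
class CodeUnitCounter {

    /** Reads the Reader to its end and returns how many chars (UTF-16 code units) it delivered. */
    static int countChars(Reader in) {
        try {
            int count = 0;
            // Reader.read() returns one 16-bit code unit per call, or -1 at end of input.
            while (in.read() != -1) {
                count++;
            }
            return count;
        } catch (IOException e) {
            // Wrapped only to keep this sketch's signature simple.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The caller chooses the source; a StringReader stands in for a Rules File here.
        System.out.println(countChars(new StringReader("s = \"hi\";")));
        // U+1D11E lies outside the BMP, so it is delivered as a surrogate pair: two chars.
        System.out.println(countChars(new StringReader("\uD834\uDD1E")));
    }
}
```

Because the method accepts any Reader, the same code works unchanged on a FileReader, a StringReader in a unit test, or a stream wrapped in an InputStreamReader.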
((C.2.3)) To support high-performance web server processing, the loaded grammar information must be stored in a form that can be accessed rapidly. A common way to do that is to store the information associated with a single rule in an object called a 'symbol', and to store all the symbol objects associated with a given grammar in a 'symbol table'. Required properties of SiteRiter symbols and symbol tables include:

((C.2.3.1)) The symbol table can find the symbol defined for any given NAME -- if such a rule exists -- in approximately amortized constant time, in the number of defined symbols.

((C.2.3.2)) Similarly, the symbol table can store an association between a NAME and a symbol in approximately amortized constant time, in the number of defined symbols.

((C.2.3.3)) A symbol provides access to any of its contained information -- including to any sequence within a choice, and any token within a sequence -- in constant time, independent of the number of choices or tokens.

((C.2.3.4)) The symbol table can be 'cleared' as needed. Clearing a symbol table returns it to a state in which it contains no symbols.

((C.2.3.5)) Any particular symbol instance must be stored in only one symbol table, and any symbol table instance must be specific to a particular instance of a class implementing SDLParser. In particular, it must be possible (e.g., for testing) to have multiple distinct grammars loaded into the symbol tables of different SDLParser-implementing-class instances, and have no interference between those symbol tables or grammars.

((C.2.4)) The information accessible via a symbol, in constant time, includes the following:

((C.2.4.1)) The name of the symbol, and whether it is flagged as a secondary start symbol (C.7).

((C.2.4.2)) The choice set defined for the symbol, from which any of the contained sequences can be accessed (in constant time), and then from a contained sequence any of its sequence tokens can be accessed (in constant time).
((C.2.4.3)) The selector associated with the rule, if any. See (C.3.4.2.1).

-----------
((C.3)) Page generation and symbol expansion

((C.3.1)) It is the ability to combine 'choices' with 'sequences' in SDL programs that leads to SiteRiter's prodigious ability to generate giant web sites from a small Rules File. By placing rules-with-choices in sequences with rules-with-choices, the number of legal strings in the grammar grows explosively.

((C.3.1.1)) For example, a grammar like this:

  foo = "a";

can only generate one possible page, containing a single 'a'.

((C.3.1.2)) A grammar like this:

  foo = "a" | "b" | "c";

can generate three possible pages: 'a', 'b', or 'c'.

((C.3.1.3)) But by combining choice and sequence, even a tiny, two-rule grammar like this:

  foo = bar bar bar;
  bar = "a" | "b" | "c";

can generate _twenty-seven_ (3 x 3 x 3) possible pages:

  aaa aab aac baa bab bac caa cab cac
  aba abb abc bba bbb bbc cba cbb cbc
  aca acb acc bca bcb bcc cca ccb ccc

((C.3.1.4)) Not to mention that a grammar like this, containing a mere three rules:

  a = b b b b;
  b = c c c c;
  c = "1"|"2"|"3"|"4";

can generate over four billion unique pages (four choices at each of sixteen positions, so 4^16 = 4,294,967,296), which will not be listed here.

((C.3.2)) With such explosive power packed into an SDL grammar, you might think it would be difficult for SiteRiter to decide _which_ of all the possible legal pages to generate for any given URL. But, not so!

((C.3.2.1)) Intensive proprietary research and focus group studies by our crack 'independent' research teams have shown that what matters most for branding, marketing, and commercial success -- _all_ that really matters, in fact -- is maximizing the 'diversified impressions delivered' (the 'DID') -- in other words, presenting your critical marketing concepts in many many different contexts -- which is precisely what SiteRiter can do.
((C.3.2.2)) What is actually _worse than useless_, our exclusive unquestionable studies show, is trying to make any particular context 'be logical' or 'make sense', despite what you might hear from an embittered dwindling minority of overpriced, 20th century, fuddy-duddy academic intelligentsia elitist hack snobs that call themselves 'copy writers' or 'creative talent' or similar fluff.

((C.3.2.3)) To the contrary, our 100% valid studies show that attempting to put concepts in a 'sensible order' using 'thinking' and 'reason' is in fact a _major risk factor_ impairing brand impression success, because it produces FAR TOO FEW diversified contexts. Instead of using 'thought', it would be far better, if only it could be done, to generate web pages completely AT RANDOM.

((C.3.2.4)) But: Is it really possible to achieve such ideal randomness in a practical way? Hiring sufficient staff to roll dice or shuffle cards during web page generation would be both costly and slow. And worse, those 'random number generator' employees wouldn't necessarily generate the _same page_ in response to the _same URL_! User bookmarks would become useless!

((C.3.3)) SiteRiter solves all those problems. SiteRiter uses an advanced random number generator (RNG) that's completely computerized, so you save big money by firing all those untrustworthy gamblers. But it's really a special kind of RNG, called a 'pseudo' random number generator (PRNG), which means that you can generate the same page from the same URL over and over again, whenever it is needed. Bookmark away!

((C.3.3.1)) The PRNG produces a series of numbers that _seem_ random, but really aren't. During page generation, whenever a choice is expanded (that doesn't involve a selector), an apparently random number is obtained from the PRNG to decide which sequence to expand within the choice.

((C.3.3.2)) What determines the series of numbers produced by a PRNG is its 'seed'.
Same seed, same series of numbers; different seed, different series of numbers.

((C.3.3.3)) The SiteRiter page generation mechanism derives the PRNG seed from the 'key' supplied to the page generation software, using a 'String hashCode' mechanism to produce a number from an arbitrary string. Same key in, same page out; different key in, different page out (almost surely, assuming enough choice in the Rules File). Using a key that depends on the URL visited by the user, the mechanism is complete.

((C.3.3.4)) Overall, page generation involves three phases -- work done setting up for the call to SDLParser.makePage, work done during that call, and work done after it. The set-up work involves developing the arguments for the call:

((C.3.3.4.1)) Given a requested URL, the key is formed by removing a prefix consisting of the protocol ('http:'), the hostname (e.g., 'localhost'), and the optional port number specification (e.g., ':8000'). The rest of the URL, beginning with a leading '/', becomes the key supplied as the first argument to the implementation of SDLParser.makePage.

((C.3.3.4.2)) An empty selector mapping is created, which becomes the second argument supplied to the implementation of SDLParser.makePage.

((C.3.3.5)) During the call to SDLParser.makePage, the following occurs:

((C.3.3.5.1)) Initialize a PRNG with a seed given by the hashCode of the key.

((C.3.3.5.2)) Determine the start symbol for the page.
By default, the page start symbol is the name of the first rule in the Rules File, but that is overridden when all of the following hold:

((C.3.3.5.2.1)) If the supplied key begins with "/ss/", AND

((C.3.3.5.2.2)) That leading "/ss/" is followed by a non-empty string of chars that does not include a '/', AND

((C.3.3.5.2.3)) That non-empty string is then followed by a '/' followed by zero or more other chars, AND

((C.3.3.5.2.4)) That non-empty string is the name of a symbol that was (1) defined in the Rules File, and (2) flagged as a secondary start symbol using the '[ COLON ]' prefix as shown in (G.3.1.3).

((C.3.3.5.2.5)) If ALL of (C.3.3.5.2.1)-(C.3.3.5.2.4) are true, then the 'non-empty string' of (C.3.3.5.2.2) and (C.3.3.5.2.3) becomes the page start symbol; otherwise, the default start symbol is used.

((C.3.3.5.3)) The selected start symbol is then expanded, using the random number generator initialized in (C.3.3.5.1), and the selector map supplied in (C.3.3.4.2), with the output stored in a String, and returned as the result of the SDLParser.makePage call. See (C.3.4) for details of the expansion process.

((C.3.3.6)) After the call to SDLParser.makePage has returned, the SiteRiter web server code returns the page to the user, with an appropriate HTTP header.

((C.3.4)) Details of symbol expansion

((C.3.4.1)) Expanding a symbol involves, first, choosing one of the sequences it contains, and then second, expanding each sequence token of the chosen sequence, in turn. If there are no sequences in a choice, expanding a symbol produces no output; the rest of this description assumes there is at least one sequence.

((C.3.4.2)) Choosing the sequence

((C.3.4.2.1)) There are two possible processes -- 'independent' or 'selected' -- for choosing which sequence of a symbol to expand. The selected method (C.3.4.2.1.2) is used when the symbol was defined with an optional selector clause (the '[ COLON NAME ]' portion of (G.3.1.2)), and the independent method (C.3.4.2.1.1) is used otherwise.
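The makePage steps of (C.3.3.5), together with the expansion rules of (C.3.4), can be sketched in Java roughly as follows. This is a minimal sketch under stated assumptions, not the reference implementation: java.util.Random is assumed as the PRNG, and the names PageSketch, Symbol, define, and startSymbolFor are hypothetical, invented for illustration. Tokens are assumed to be stored as raw strings with their quote chars still attached, and parsing is omitted entirely.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical sketch of page generation; names and data layout are assumptions. */
class PageSketch {

    /** One parsed rule. Tokens keep their quotes, e.g. "\"a\"" for a DLITERAL or "bar" for a NAME. */
    static final class Symbol {
        final String name;
        final boolean secondaryStart;
        final String selector;                 // null when the rule has no selector
        final List<List<String>> sequences;    // the choice: a list of sequences of tokens

        Symbol(String name, boolean secondaryStart, String selector, List<List<String>> sequences) {
            this.name = name;
            this.secondaryStart = secondaryStart;
            this.selector = selector;
            this.sequences = sequences;
        }
    }

    private final Map<String, Symbol> table = new HashMap<>();  // hash lookup by NAME
    private String defaultStart;                                // name of the first rule seen

    void define(Symbol s) {
        if (defaultStart == null) defaultStart = s.name;
        table.put(s.name, s);                  // a later definition replaces an earlier one
    }

    /** Applies the "/ss/NAME/..." test; falls back to the default start symbol. */
    String startSymbolFor(String key) {
        if (key.startsWith("/ss/")) {
            int slash = key.indexOf('/', 4);
            if (slash > 4) {                   // non-empty name, followed by a '/'
                Symbol s = table.get(key.substring(4, slash));
                if (s != null && s.secondaryStart) return s.name;
            }
        }
        return defaultStart;
    }

    String makePage(String key, Map<String, Integer> selectors) {
        Random prng = new Random(key.hashCode());   // same key, same seed, same page
        StringBuilder out = new StringBuilder();
        Symbol start = table.get(startSymbolFor(key));
        if (start != null) expand(start, prng, selectors, out);
        return out.toString();
    }

    private void expand(Symbol sym, Random prng, Map<String, Integer> selectors, StringBuilder out) {
        int n = sym.sequences.size();
        if (n == 0) return;                    // empty choice: no output
        int index;
        if (sym.selector == null) {
            index = prng.nextInt(n);           // independent method: 0..n-1
        } else {
            Integer v = selectors.get(sym.selector);
            if (v == null) {                   // first use of this selector: draw 0..MAX_VALUE-1
                v = prng.nextInt(Integer.MAX_VALUE);
                selectors.put(sym.selector, v);
            }
            index = Math.floorMod(v, n);       // selected method: value modulo choice size
        }
        for (String token : sym.sequences.get(index)) {
            char first = token.charAt(0);
            if (first == '"' || first == '\'') {
                out.append(token, 1, token.length() - 1);   // literal: strip the quotes
            } else {
                Symbol next = table.get(token);
                if (next == null) out.append(token).append('?');   // undefined NAME
                else expand(next, prng, selectors, out);
            }
        }
    }
}
```

Keeping the selector map as a caller-supplied argument, rather than a field, means every makePage call starts from whatever mapping the caller provides, which matches the empty map created in (C.3.3.4.2). The sketch makes no attempt to guard against grammars that recurse without end.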
((C.3.4.2.1.1)) Independent sequence choosing method

((C.3.4.2.1.1.1)) A random number is drawn from the PRNG in the range of 0..(number of sequences-1), and that sequence is chosen (0 chooses the first sequence, 1 chooses the second, etc.)

((C.3.4.2.1.2)) Selected sequence choosing method

((C.3.4.2.1.2.1)) When the selected sequence choosing method is used, the symbol being expanded has a 'selector' name defined for it. (That selector symbol is given by the NAME in the optional '[ COLON NAME ]' portion of the grammar rule (G.3.1.2) that defined the symbol.)

((C.3.4.2.1.2.2)) The selector name of the symbol is looked up in the selector map. If it is not found, then:

((C.3.4.2.1.2.2.1)) A random number in the range of 0..Integer.MAX_VALUE-1 is drawn from the PRNG, and that number is stored in the selector map as the value of the selector name. Then processing proceeds as if the selector name had been found in the map to begin with.

((C.3.4.2.1.2.3)) If the selector name is (now) found in the selector map, its associated Integer value is obtained, and that value, modulo the number of sequences in the choice, indicates the sequence that is chosen (0 modulo choice size means the first sequence is chosen, 1 modulo choice size means the second sequence is chosen, etc.)

((C.3.4.3)) Expanding the chosen sequence

((C.3.4.3.1)) Once the sequence has been chosen, it is expanded. Expanding a sequence involves expanding each of its sequence tokens, in turn from left to right. Due to (G.3.1.5), only three types of tokens can appear in a sequence: NAME, SLITERAL, or DLITERAL.

((C.3.4.3.1.1)) Expanding a NAME in a sequence is accomplished in two steps:

((C.3.4.3.1.1.1)) First, the value of NAME is looked up in the symbol table, where an associated symbol either will or will not be found.

((C.3.4.3.1.1.2)) If an associated symbol is _not_ found, the value of the NAME token itself is output, followed by a single '?'
char, and the process of expanding that NAME token is then complete.

((C.3.4.3.1.1.3)) If an associated symbol _is_ found, then that symbol is expanded recursively (C.3.4), and when that recursive process returns, the process of expanding that NAME token is then complete.

((C.3.4.3.1.2)) Expanding a SLITERAL adds zero or more chars to the output, and is accomplished by outputting the value of the SLITERAL exactly, except that the leading and trailing single quote chars ('\'') are removed.

((C.3.4.3.1.3)) Expanding a DLITERAL adds zero or more chars to the output, and is accomplished by outputting the value of the DLITERAL exactly, except that the leading and trailing double quote chars ('"') are removed.

-----------
((C.4)) SiteRiter web server standard hit processing

((C.4.1)) See spec-server.txt for the overall flow of web server operations, and particularly (S.1.4) for details of connection processing.

((C.4.2)) The SDLParser.makePage() operations occur during the 'respond(String)' call described in (S.1.4.2.4). In the example server, that leads into ExampleConnection, which contains a makePage call.

-----------
((C.5)) ROBOTS.TXT processing

((C.5.1)) Since a SiteRiter website is capable, with a suitable Rules File, of generating a nearly endless set of web pages, it is REQUIRED that SiteRiter web servers support the 'Robots Exclusion Protocol' (as described, e.g., at http://www.robotstxt.org/robotstxt.html).

((C.5.2)) By default, a SiteRiter website MUST EXCLUDE ALL ROBOTS from the entire site, as this is the only safe choice, both for the sake of the robots and for the performance of the SiteRiter site.

((C.5.2.1)) The example server, as written, already performs this exclusion adequately; nothing more beyond the example server behavior is required.
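For orientation, an exclude-all response under the Robots Exclusion Protocol consists of a wildcard User-agent line followed by a Disallow line covering the whole site. The Java sketch below shows that body as a constant. The class and field names are hypothetical, and whether the example server emits exactly these bytes is an assumption; the example server source remains the authority.

```java
/** Hypothetical sketch; the class and field names are illustrative assumptions. */
class RobotsSketch {

    /** A robots.txt body that asks every robot to stay out of the entire site. */
    static final String EXCLUDE_ALL =
            "User-agent: *\n"    // applies to all robots
          + "Disallow: /\n";     // nothing under the site root may be fetched

    public static void main(String[] args) {
        System.out.print(EXCLUDE_ALL);
    }
}
```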
((C.5.3)) However, as an OPTIONAL, EXTRA-CREDIT, STEP-UP FEATURE, a SiteRiter implementation MAY provide more advanced 'robots.txt' processing, which must work precisely as follows:

((C.5.3.1)) When (and only when) the 'request' in an AbstractConnection.respond() call is "/robots.txt", 'advanced robots.txt processing' will occur.

((C.5.3.2)) When 'advanced robots.txt processing' occurs, SiteRiter will check in the currently-loaded Rules File to see if there is a rule named "ROBOTS_TXT", precisely as spelled, case-sensitive.

((C.5.3.3)) If there is _no_ such rule, then 'advanced robots.txt processing' acts just like the unmodified example server, issuing a response that EXCLUDES ALL ROBOTS.

((C.5.3.4)) On the other hand, if there _is_ such a rule, then 'advanced robots.txt processing' produces a response by making a page using the currently-loaded Rules File _except_ that it acts as if the ROBOTS_TXT rule were the start symbol for the grammar.

((C.5.3.5)) After responding with whatever output is produced by (C.5.3.4), 'advanced robots.txt processing' is complete.

-----------
((C.6)) Reload processing

((C.6.1)) Normally, an SDL Rules File is loaded once during the startup processing of a SiteRiter server, and is then retained in memory in parsed form for as long as the server runs, ready to be used to generate web pages as needed.

((C.6.2)) However, particularly during the development or maintenance of an SDL Rules File, it can be a minor inconvenience to have to restart the SiteRiter server whenever the Rules File needs to be reloaded.

((C.6.3)) Therefore, as an OPTIONAL, EXTRA-CREDIT, STEP-UP FEATURE, a SiteRiter implementation MAY provide 'advanced reload processing' that must work precisely as follows:

((C.6.3.1)) When (and only when) the 'request' in an AbstractConnection.respond() call is "/reload", 'advanced reload processing' will occur.
((C.6.3.2)) When 'advanced reload processing' occurs, SiteRiter will perform the following steps in turn:

((C.6.3.2.1)) SiteRiter will temporarily block any new connection requests from generating pages. (The connections will not be dropped, broken, or rejected; they will simply be delayed.)

((C.6.3.2.2)) SiteRiter will allow any pages that are in the process of being generated to complete.

((C.6.3.2.3)) Once no connections are using the loaded grammar, SiteRiter will RELOAD the same file path that it loaded previously (at startup or during a prior 'advanced reload processing').

((C.6.3.2.3.1)) If a parse error occurs during this reload, SiteRiter will exit.

((C.6.3.2.4)) After the Rules File has been successfully reloaded, SiteRiter will resume generating pages in response to connections, including any connections that may have been temporarily blocked during 'advanced reload processing'.

((C.6.4)) Implementors should note that getting 'advanced reload processing' right is generally a non-trivial challenge, not to be attempted lightly.

-----------
((C.7)) Secondary start symbols

((C.7.1)) In a SiteRiter SDL grammar, the first rule of a Rules File determines the 'default start symbol' -- the symbol that will be used to generate pages for most URLs, namely those that do _not_ follow the special pattern described in (C.3.3.5.2).

((C.7.2)) However, sometimes the default start symbol mechanism is inadequate to express very complex generated web sites. For example, in our modern global era it might be desirable to have different sections of the web site in different languages, and to have a way to construct URLs that will be certain to generate a DID (C.3.2.1) in the appropriate language.

((C.7.3)) Secondary start symbols make that possible.
The mechanism for encoding a secondary start symbol into an URL (C.3.3.5.2) ensures that any URL of that form will be generated in the desired way; the requirement that secondary start symbols be explicitly flagged -- (C.3.3.5.2.4) and (G.3.1.3) -- in the rules file ensures that _only_ those rules that represent appropriate complete pages can be used as start symbols. ----------- ((C.8)) Examples ((C.8.1)) See (I.3) for some basic examples. Here, we have a few more simple examples illustrating various legal inputs. ((C.8.1.1)) Big and small names. Names can be as little as one char long, and have no fixed maximum length, other than Integer.MAX_VALUE, until available memory is exhausted. ----BEGIN-INPUT---- s = "hi " bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb; bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb = "there "; -----END-INPUT---- generates: ----BEGIN-OUTPUT---- hi there -----END-OUTPUT---- ((C.8.1.2)) Significant and insignificant whitespace. Whitespace -- including spaces, tabs, newlines, and anything that the Java character class treats as whitespace -- serves to delimit names, and is preserved inside LITERALs, but its presence or absence is otherwise irrelevant. Also note that no trailing 'newline' is required at the end of a Rules File. 
----BEGIN-INPUT----
s = "spaces preserved inside" literals ; literals="no space needed after a literal"'though'"; and sliterals and dliterals can be found 'cheek to jowl' with no problems ";-----END-INPUT----

----BEGIN-OUTPUT----
spaces preserved insideno space needed after a literalthough; and sliterals and dliterals can be found 'cheek to jowl' with no problems -----END-OUTPUT----

((C.8.1.3)) Undefined symbols. If a NAME appearing in a SEQUENCE being expanded does not have a corresponding symbol definition, it is handled as per (C.3.4.3.1.1.2). In this example names are case-sensitive, so the rule 'Question' does not define the NAME 'question':

----BEGIN-INPUT----
s = "To be, or not to be, that is the " question;
Question = "question. Whether 'tis noblah blah..";
-----END-INPUT----

----BEGIN-OUTPUT----
To be, or not to be, that is the question?-----END-OUTPUT----

((C.8.1.4)) Symbol redefinitions. Rules with the same name can appear more than once in a Rules File, in which case only the last-appearing rule definition can have any effect on page generation.

----BEGIN-INPUT----
s = ReadEverythingBeforeDoingAnything;
ReadEverythingBeforeDoingAnything = step1 step2 step3;
step1 = "Hop" | "Pop";
step2 = step1 step1 step3;
step3 = "on" | "off";
s = "Done";
-----END-INPUT----

----BEGIN-OUTPUT----
Done-----END-OUTPUT----

((C.8.2)) An HTML-generation example

((C.8.2.1)) The examples so far have used relatively little literal text, but real web sites will typically use literal text in several main ways: (1) To represent the key branding and marketing concepts that are to be impressed by the web site; (2) To represent context surrounding the key concepts -- the 'supporting actors' behind the 'stars', so to speak; and (3) To represent the 'boilerplate' needed for formatting and so on. This larger example illustrates all of those uses.

----BEGIN-INPUT----
page = " " pitch " ";
product = "Foo Magic" | "Blurred Bar" | "an Essay";
pitch = buy " " product "! 
" reason;
buy = "Get" | "Buy" | "You need" | "Your family needs";
reason = "Be like " star "!";
star = "Rick Thobin" | "Green Glenwald" | "Jake the Dog" ;
-----END-INPUT----

A couple of possible outputs are:

----BEGIN-OUTPUT1----
You need Foo Magic! Be like Jake the Dog!
-----END-OUTPUT1----

----BEGIN-OUTPUT2----
Buy Blurred Bar! Be like Green Glenwald!
-----END-OUTPUT2----

((C.8.3)) Selector examples

((C.8.3.1)) Although ironclad research (see the often-cited (C.3.2) for details) shows that maximizing DID is the most important factor for success in life, there can be limited and narrow occasions when a client may insist on _reducing_ the possible diversity of a web site, in hopes of having it make 'more sense'. Although experienced SiteRiters know that words like 'logic' and 'sense' and 'rational' are all danger signs, the SiteRiter SDL nonetheless provides the 'selector' mechanism to allow such 'increased sense' to be produced where it absolutely must.

((C.8.3.2)) A typical example could be for an (old-school) client that insists on gender agreement. For them, this grammar would be problematic:

----BEGIN-INPUT----
page = subject ' ' rxverb ' ' self '. ';
subject = "Bob" | "Mary" | "The robot";
rxverb = "loved" | "hated" | "disgraced";
self = "himself" | "herself" | "itself";
-----END-INPUT----

because although it produces output like

----BEGIN-OUTPUT1----
Bob hated himself.
-----END-OUTPUT1----

it might also produce output like:

----BEGIN-OUTPUT2----
Mary loved himself.
-----END-OUTPUT2----

to which some might object (until we all switch to Mandarin).

((C.8.3.2.1)) In that example, what is needed is a way of forcing agreement between the choice made in the 'subject' rule, and the choice made in the 'self' rule -- and a selector provides precisely that. Consider this grammar:

----BEGIN-INPUT----
page = subject ' ' rxverb ' ' self '. 
';
subject:g = "Bob" | "Mary" | "The robot";
rxverb = "loved" | "hated" | "disgraced";
self:g = "himself" | "herself" | "itself";
-----END-INPUT----

in which the 'subject' and 'self' rules have both been given the 'g' selector (while the rxverb rule has not). During the expansion process, when 'subject' is expanded, the rules detailed in (C.3.4.2.1.2) apply, and a random number in the range of 0..Integer.MAX_VALUE-1 is chosen and associated with the selector 'g'. That value is then used (modulo 3) to select the 'subject', and then again (also modulo 3) to select the 'self'.

((C.8.3.2.2)) As a result, this grammar can produce _only_ the following nine outputs:

  Bob loved himself.
  Mary loved herself.
  The robot loved itself.
  Bob hated himself.
  Mary hated herself.
  The robot hated itself.
  Bob disgraced himself.
  Mary disgraced herself.
  The robot disgraced itself.

clearly illustrating how this sometimes (client-is-always-right) necessary increase in 'sense' is accompanied by an inevitable failure to maximize DID.

((C.8.3.3)) Note that rules using the same selector do not have to have the same number of alternatives. On the contrary, it can be useful to have the number of choices in one rule be a multiple of the number of choices in another rule, when the two rules use a common selector.

((C.8.4)) Secondary start symbol example

((C.8.4.1)) The mechanism described in (C.3.3.5.2) provides a way to associate certain classes of URLs with certain rules in a Rules File, providing a way to generate completely different types of pages within a single generated web site. Here is an extended example illustrating the mechanism, generating a small web site capable of supporting two different languages, with 'header links' to choose between the languages, and internal links on web pages that maintain the selected language:

----BEGIN-INPUT----
page = en;
hostport = "localhost:8000";
:en = top enpage bot;
:pl = top plpage bot;
top = ' Languages: English Pig Latin
'; enpage = ensubj enobj '.'; plpage = plsubj plobj '!'; d = "0"|"1"|"2"|"3"|"4"|"5"|"6"|"7"|"8"|"9"; ds = d d d d d d d d; enslink = ''; enelink = elink; plslink = ''; plelink = elink; elink = ""; ensubj = "Foo" | "Bar" | "Baz" | "Mumble" ; plsubj = "Oofay" | "Arbay" | "Azbay" | "Umblemay"; enobj = " is " enslink enadj enelink; plobj = " isay " plslink pladj plelink; enadj = "gah"|"hop"|"zot"|"borked"; pladj = "ahgay"|"ophay"|"otzay"|"orkedbay"; bot = " "; -----END-INPUT---- One possible output generated by this rules file could be: ----BEGIN-OUTPUT---- Languages: English Pig Latin
Umblemay isay orkedbay! -----END-OUTPUT----

-----------
((C.9)) Error handling

((C.9.1)) All implementations of the SDLParser.load(Reader) method are required to follow the contract specified in the associated JavaDoc.

((C.9.2)) In particular, and without limitation, a com.putable.siteriter.SDLParseException must be thrown (preferably with an informative message) whenever an attempt is made to load a Rules File that is syntactically incorrect in any way.

((C.9.3)) Similarly, _no_ exception may be thrown on _any_ syntactically valid grammar (unless a JVM-internal memory capacity limit is encountered).
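One way to exercise the two halves of this contract in a test is a small helper that reports whether a loader accepted or rejected a given Rules File text. The Java sketch below is illustrative only: ParseFailure and Loader are hypothetical stand-ins for com.putable.siteriter.SDLParseException and for the load(Reader) part of SDLParser, whose real declarations are given by the project JavaDoc, and ParseFailure is assumed here to be a checked exception.

```java
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical test helper; all names here are illustrative stand-ins. */
class LoadContractSketch {

    /** Stand-in for SDLParseException (assumed checked for this sketch). */
    static class ParseFailure extends Exception {
        ParseFailure(String message) {
            super(message);
        }
    }

    /** Stand-in for the load(Reader) half of SDLParser. */
    interface Loader {
        void load(Reader in) throws ParseFailure;
    }

    /** Returns true if the loader accepted the text, false if it reported a parse error. */
    static boolean accepts(Loader loader, String rulesText) {
        try {
            loader.load(new StringReader(rulesText));
            return true;
        } catch (ParseFailure e) {
            return false;
        }
    }
}
```

A test suite would then assert that accepts(...) is false for every syntactically incorrect Rules File (C.9.2) and true for every valid one (C.9.3).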