A rule has three parts. The first is the name. The second is the pattern, which an evaluated string must match in order for the optional third part, the translation, to be applied to it. The pattern is composed of terminals and references to other rules, separated by a comma and a space. Rule names are placed within double quotes (i.e., "rule name"). A terminal is usually a fixed set of characters and is denoted (for the purposes of this engine) by being placed within forward slashes "/", like a regular expression in Perl. This is because regular expressions are used to match the characters in a string at the terminal level, which allows an expression such as "\d*" to serve as a terminal denoting a string of zero or more digits. Use such repetition with care, as it can be considered a sloppy and careless way to denote a terminal symbol. On the plus side, it can be used as a shortcut that requires fewer rules.
To give an example, the following set of rules covers strings with one "b" surrounded by as many "a"s on the left as there are on the right.
Example1
Name  Pattern          Translation
S1    /a/, "S2", /a/
S2    /b/
S2    "S1"
If there are multiple rules with the same name, the string will be evaluated using the rules in the order they were entered until a match is found or there are no more rules. For example, consider the string "aabaa" with the grammar above. Rule "S1" is used. The string has an "a" at the front and at the end, so it has the form "aS2a" provided the center part fits rule "S2". Does the center part of the string, "aba", fit rule "S2"? The first S2 rule is evaluated. "aba" does not fit the first definition of S2, which is /b/. What about the next S2 definition, which is S1? "aba" starts and ends with an "a", so far so good. What about the middle part of the string (which is "b")? "b" fits the first definition of S2, so the string "aabaa" fits rule S1. In other terms:
aabaa => aaS2aa
aaS2aa => aS1a
aS1a => aS2a
aS2a => S1
And in the other direction:
S1 => aS2a
aS2a => aS1a
aS1a => aaS2aa
aaS2aa => aabaa
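The ordered evaluation described above can be sketched in a few lines of Python. This is only an illustration, not the engine's implementation: the GRAMMAR dictionary, the ("t", ...) and ("r", ...) tuples, and the function names are all hypothetical, and the sketch tries every split point for a rule reference, which the real engine may or may not do.

```python
import re

# Hypothetical in-memory form of Example1. Each rule name maps to its
# definitions in the order they were entered. ("t", x) is a terminal
# regexp, ("r", x) is a reference to another rule.
GRAMMAR = {
    "S1": [[("t", "a"), ("r", "S2"), ("t", "a")]],
    "S2": [[("t", "b")], [("r", "S1")]],
}

def match_rule(name, s):
    """Return True if the whole string s fits a definition of rule `name`.

    Definitions are tried in entry order; any() stops at the first match.
    """
    return any(match_seq(pattern, s) for pattern in GRAMMAR[name])

def match_seq(pattern, s):
    """Try to consume all of s with the pattern elements, left to right."""
    if not pattern:
        return s == ""
    (kind, value), rest = pattern[0], pattern[1:]
    if kind == "t":
        m = re.match(value, s)
        return bool(m) and match_seq(rest, s[m.end():])
    # Rule reference: try every prefix of s as the part covered by the rule.
    return any(
        match_rule(value, s[:i]) and match_seq(rest, s[i:])
        for i in range(len(s) + 1)
    )

print(match_rule("S1", "aabaa"))  # True
print(match_rule("S1", "aab"))    # False
```

Trying every split point keeps the sketch short; it is only practical for small inputs like the ones in this document.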
To complicate things slightly, the pattern elements of a rule may be repeatable. This allows arrays of elements, such as the parameters to a find_first, to be translated more easily, and not just matched. It is possible to write a grammar without using repetition; however, some translations may be difficult, if not impossible, to achieve correctly without it (this will be explained further in the translation section). To specify that a rule definition may repeat, the final pattern element is written as a regexp in square brackets; as the examples show, this regexp matches the separator between repetitions. For example, the following grammar matches strings such as "abb", "abbb,abbbbbb,abb", "abb,abb,abb", etc.:
Example2
Name  Pattern              Translation
S1    /ab/, "S2", [/,/]
S2    /b/, "S2"
S2    /b/
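As a rough illustration of what the bracketed final element does in Example2, the Python sketch below treats [/,/] as a separator and requires every piece between separators to fit the rest of the S1 pattern. The function names are hypothetical, and splitting on the separator up front is a simplification that only works because "," cannot occur inside an element of this particular grammar.

```python
import re

def match_s2(s):
    """S2: /b/, "S2"  or  /b/   (one or more "b")."""
    if s.startswith("b") and match_s2(s[1:]):  # first definition: /b/, "S2"
        return True
    return s == "b"                            # second definition: /b/

def match_s1_once(s):
    """One repetition of S1 without the separator: /ab/, "S2"."""
    return s.startswith("ab") and match_s2(s[2:])

def match_s1(s, separator=r","):
    """S1 with the repeatable final element: every piece must fit."""
    return all(match_s1_once(part) for part in re.split(separator, s))

print(match_s1("abbb,abbbbbb,abb"))  # True
print(match_s1("abb,ab"))            # False
```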
The third part of the rule is the translation. If no translation is defined, no translation will occur. A translation will also not take place for a string that does not match the rules of the grammar. The translation may take the evaluations of the pattern elements and do one of three things with each: leave it in the position where it occurs, move it to another location, or remove it entirely. It is also possible to put arbitrary strings between elements. Each pattern element of the rule has an index number: 1 for the first element, 2 for the second, and so forth. Looking back at Example1, there are three indexes for the three rule elements. Consider Example3:
Example3
Name  Pattern          Translation
S1    /a/, "S2", /a/   1, 3, 2
S2    /b/
S2    "S1"
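To make the index-based translation concrete, here is a minimal Python sketch of Example3, hard-coded for this one grammar. The function names are hypothetical, and returning None for a non-matching string is a choice made for the sketch, not something the engine is known to do.

```python
def translate_s1(s):
    """Translate s under S1 (/a/, "S2", /a/ with translation 1, 3, 2).

    Returns None when s does not fit the rule.
    """
    if len(s) >= 3 and s[0] == "a" and s[-1] == "a":
        inner = translate_s2(s[1:-1])
        if inner is not None:
            # Evaluations of the three pattern elements, keyed by index.
            parts = {1: "a", 2: inner, 3: "a"}
            return "".join(parts[i] for i in (1, 3, 2))
    return None

def translate_s2(s):
    """S2 has no translation of its own: /b/ is kept, "S1" is delegated."""
    if s == "b":
        return "b"
    return translate_s1(s)

print(translate_s1("aabaa"))  # aaaab
```

The inner "aba" is translated first (to "aab"), and the outer rule then emits its two a's ahead of that result.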
This will match symmetrical strings of a's with one b in the middle, and return a translated string in which the b is at the rightmost position. I.e., "aabaa" would be translated to "aaaab". Translating repetitions is slightly more complicated. To denote a part of a translation that may repeat, enclose it within square brackets "[]". The repeating part must contain at least one index and may contain any number of arbitrary strings. It starts with the first repetition unless its first element is a colon followed by a lowercase letter: ":a" starts at the first repetition, ":b" at the second, ":c" at the third, ":d" at the fourth, etc. This allows the first repetition (or the initial repetitions) to be handled differently during translation. This is illustrated in the next example:
Example4
Name      Pattern                                  Translation
func arr  "param key", /\=>/, "param val", [/,/]   1, "= ?", [:b, " and ", 1, "= ?"], "'", [", ", 3]
The indexes that can be used in the repeating parts of the translation are the index numbers of the elements of the rule. In a repeating part of the translation, an index refers to the corresponding element of the base rule and is evaluated again for each iteration. Assuming rules for "param key" and "param val" are already defined (and the "param key" rule strips the initial colon), and given the string
":name => $name, :user_id => 2, :active => true"
the above rule would translate it to:
"name = ? and user_id = ? and active = ?', $name, 2, true"
More graphically the translation is seen as:
repetition:   1               2              3
rule index:   1     2  3    4 1        2  34 1       2  3
orig. string: :name => $name, :user_id => 2, :active => true
repetition:   1           2                      3                         1          2      3
tx. string:   1    |"= ?"|" and "|1       |"= ?"|" and "|1      |"= ?"|"'"|", "|3    |", "|3|", "|3
translated:   name |= ?  | and   |user_id |= ?  | and   |active |= ?  |'  |,   |$name|,   |2|,   |true
(NOTE: pipes, |, separate the elements; columns are padded with spaces so that each translation element lines up with its output)
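The walk-through above can be reproduced with a small Python sketch. It assumes the matching step has already produced one dictionary per repetition, mapping rule index to evaluated text (with "param key" yielding the key plus its trailing space, as in the diagram). The Repeat tuple and apply_translation are hypothetical names, and treating an index outside a repeating part as referring to the first repetition is an assumption read off the diagram.

```python
from collections import namedtuple

# A repeating part of a translation. start is the letter after the colon:
# "a" begins at the first repetition, "b" at the second, and so on.
Repeat = namedtuple("Repeat", ["start", "body"])

def apply_translation(template, reps):
    """Build the translated string from a translation template.

    template: ints are rule indexes, strs are arbitrary strings,
              Repeat items are repeating parts.
    reps:     one dict per repetition, rule index -> evaluated text.
    """
    out = []
    for item in template:
        if isinstance(item, Repeat):
            first = ord(item.start) - ord("a")
            for rep in reps[first:]:
                out.extend(rep[x] if isinstance(x, int) else x
                           for x in item.body)
        elif isinstance(item, int):
            out.append(reps[0][item])  # assumed: first repetition
        else:
            out.append(item)
    return "".join(out)

# Example4: 1, "= ?", [:b, " and ", 1, "= ?"], "'", [", ", 3]
template = [1, "= ?", Repeat("b", [" and ", 1, "= ?"]),
            "'", Repeat("a", [", ", 3])]
reps = [
    {1: "name ", 3: "$name"},
    {1: "user_id ", 3: "2"},
    {1: "active ", 3: "true"},
]
result = apply_translation(template, reps)
print(result)  # name = ? and user_id = ? and active = ?', $name, 2, true
```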