CHAPTER 25

GENERATING PHRASES IN ANOTHER LANGUAGE

Following are new notations for expressing phrase generation. They may be used within semantic specifications to generate phrases in another language.

These notations let you, the programmer, specify translations ~oblivious ~to ~the ~existence ~of ~any ~ambiguity ~at ~all. Ambiguity is supported implicitly, and delightfully, it's relatively easy to implement. We will see the implementation of each notation after we present it. (Section 25.5 shows how ambiguities between domains meet).

Our main example has the syntax grammar's semantics generate phrases in the datatype language. This is how we require that a specification make sense syntactically ~and within the domain of datatypes.

The next chapter shows the complete specification of syntax rules whose semantics generate phrases in the datatype language. There, we will see an implementation of a programming language that involves datatypes. That implementation will use the notations presented in this chapter.

25.1 Notation For Generating Unit-Length Phrases

We now introduce notations for the generation of phrases. We've been using one of the most important of these notations all along. Recall that a rule is usually specified via the notation:

    <POS(1):V(1)> ... <POS(k):V(k)>   ->   <GIVE_POS: f(V(1),...,V(k))>

In general, we allow the righthand side of the rule to be a general program, as in:

    <POS(1):V(1)> <POS(2):V(2)> ... <POS(k):V(k)>   ->   a program

(Parentheses in the two rules above denote subscripting.)

25.2 The Righthand Side Of A Rule Is Always A STATEMENT

In fact, ~the ~righthand ~side ~of ~a ~rule ~is ~always ~a ~STATEMENT. We've been living with this since Section 1.3. The notation:

    <POS: f(...) >

is actually a STATEMENT!

25.2.1 The Righthand-Side Of A Rule Can Naturally Involve IFs

Because the give-phrase notation "<ID:STATEMENT>" is itself a STATEMENT, we are able to express rules (from Chapter 3) like:

    <EXPR[i]:x> <BOP[j]:y> <EXPR[k]:z>   ->
        IF  i =< j  &  k < j  THEN  <EXPR[j]: f(x,y,z) >  FI

This rule uses the IF-THEN notation for STATEMENTs. This is perfectly valid because the "<EXPR:...>" in the THEN clause is a STATEMENT, and the righthand side of a rule is expected always to be a STATEMENT.

25.3 Phrase Generation Is A STATEMENT Anywhere, Even Beyond Rules' Righthand Sides

In general, phrase generation STATEMENTs are supported by the rules:

    < ID : STATEMENT >            ->   STATEMENT
    < ID [ EXPR ] : STATEMENT >   ->   STATEMENT

The ID names the part-of-speech. The given STATEMENT specifies the semantics. These translate ultimately into a call to GIVE (Section 25.3.3 and Chapter 12). Each generates a unit-length phrase whose part-of-speech is the ID and whose semantics is the STATEMENT. We've seen this on the righthand sides of rules.

The second rule here is for specifying array parts-of-speech, which require an index (the EXPR, of type INT) to denote which ~one of the array of parts-of-speech is intended for the new phrase.

For example, consider the STATEMENT forming the righthand side of the rule:

    <NUMBER:n>   ->   <EXPR: LOAD( 1, ADDRESS_OF(n) ); >

That STATEMENT:

    <EXPR: LOAD( 1, ADDRESS_OF(n) ); >

generates an <EXPR> whose semantics is the STATEMENT within the "<...>", the:

    LOAD( 1, ADDRESS_OF(n) );

25.3.1 Semantics Can Also Be An EXPR

The STATEMENT within the angle-brackets can also be an EXPR, as in:

    < ID : EXPR >            ->   STATEMENT
    < ID [ EXPR ] : EXPR >   ->   STATEMENT

In general, parts-of-speech declared with the dash, e.g.,

    POS  EXPR : - ;

must be specified using the STATEMENT form inside the "<...>". All other parts-of-speech, declared like:

    POS  ID : TEXT ;

where the semantics is a specified datatype, use the EXPR form inside the "<...>".

The ":STATEMENT" or ":EXPR" may be omitted, as in:

    < ID >

In the absence of semantic specification, the default semantics will do nothing when it might be invoked, and will return 0, FALSE, or NIL if it is used as a value.

BOX:  What notation have we been using all along that
BOX:  generates a phrase?
BOX:
BOX:  Can this notation be used anywhere that a STATEMENT
BOX:  is permissible?

25.3.2 The STATEMENT or EXPR Inside The "<...>" Is Rendered As A Process

Recall from Chapter 4 that we implicitly enclose any semantic specification within the "//...\\" to render it as a process. This was done to render all semantics as ~delayed semantics, so that ~no semantics would be executed during the parsing action.

We implement that delaying transformation here. When we specify (e.g., on the righthand side of a rule):

    < POS : f(x,y,z) >

we actually deliver:

    < POS : //[x;y;z;] f( <*x*>, <*y*>, <*z*> ) \\ >

The variables enclosed in the "//[...]", the context variables X, Y, and Z, come from the lefthand part of a rule, the want-phrase. The "//[...]" and "\\" are implicit. This delays the invocation of f.

Also, each appearance of those context variables in the body is enclosed by the process invocation notation "<*...*>" if needed. This ~undelays the delayed semantics associated with the variables X, Y, and Z.

25.3.3 The Translation Of The "<...>" Into A Call To GIVE

The STATEMENT:

    < ID : STATEMENT >

always translates to:

    GIVE( [ LEFT:LEFT  POS:ID  SEM: //[...] STATEMENT \\ ] );

This is a full call to GIVE, like we saw in Chapter 15. Beyond the given ID (part-of-speech) and STATEMENT (semantics), the global variable ~LEFT is read. LEFT is one of the variables read implicitly by the "<...>" notation.

Once GIVE is called, GIVE reads and writes the variable C as well, even though we don't pass it in here (see Section 12.3 or 15.2.1). LEFT and C are thus read implicitly by the STATEMENT:

    < ID : STATEMENT >

The following introduces ways to manage these two implicit variables.

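The delaying transformation of Section 25.3.2 can be pictured with closures. The following is only a sketch in Python, not the book's language: `delayed_number`, `make_phrase`, and `log` are hypothetical names introduced for illustration, a zero-argument function stands in for a process "//...\\", and calling it stands in for "<*...*>".

```python
from typing import Callable, Dict, List

log: List[str] = []  # records when semantics actually run


def delayed_number(n: int) -> Callable[[], int]:
    """Hypothetical delayed semantics: nothing runs until it is invoked."""
    def process() -> int:
        log.append(f"load {n}")
        return n
    return process


def make_phrase(pos: str, f, x, y, z) -> Dict[str, object]:
    # Rough analogue of delivering < POS : f(x,y,z) > with delayed semantics.
    def process():
        # Calling x(), y(), z() plays the role of <*x*>, <*y*>, <*z*>.
        return f(x(), y(), z())
    return {"pos": pos, "sem": process}


x, y, z = delayed_number(1), delayed_number(2), delayed_number(3)
phrase = make_phrase("EXPR", lambda a, b, c: a + b + c, x, y, z)
ran_while_parsing = list(log)   # snapshot taken before any invocation
value = phrase["sem"]()         # undelay: now f and its arguments run
```

Building the phrase runs none of the semantics; only the later invocation of `phrase["sem"]` does, which is the property the implicit "//[...]" and "\\" are meant to provide.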
BOX:  What does the phrase generation notation "<...>"
BOX:  translate into?
BOX:
BOX:  Could the righthand side of a rule consist of WHILE
BOX:  statements, among other things?
BOX:
BOX:  How has an IF statement on the righthand side been
BOX:  helpful?
BOX:
BOX:  How is Chapter 4's implicit delayed semantics
BOX:  implemented?

25.4 The Two Endpoints Of Generated Phrases: The Global Variables LEFT and C

Whenever a phrase is generated, whether it be on the righthand side of a rule, or as any action, two global variables define the left and right endpoints for the new phrase. LEFT, a CHOICE_OF_PHRASES, denotes the left neighbor for a generated phrase. The variable C, also a CHOICE_OF_PHRASES, denotes the righthand edge for the generated phrase. Figure 25.1 illustrates this.

The "<...>" notation, the GIVE, reads these two variables implicitly. Thus, the values in those two variables are important whenever we see a "<...>". How are these variables set?

25.4.1 A Rule's Want-Phrase Sets LEFT

LEFT is set upon matching a want-phrase. Recall that a rule:

    <POS(1)> <POS(2)> ... <POS(k)>   ->   some_action

is turned into a program via the following (from the end of Section 12.3.3). P is the PHRASE block passed to GIVE (and the grammar):

    P(k):= P ;
    IF  P(k).POS = POS(k)  THEN
        FOR P(k-1) $E P(k).LEFT;  WITH P(k-1).POS = POS(k-1) ;
        !!
        ...
        !!
        FOR P(1) $E P(2).LEFT;  WITH P(1).POS = POS(1);
        DO    ~LEFT:= ~P(1).LEFT;
              some_action
        END
    FI

The "some_action" is executed in a context where LEFT points to the matched phrase's lefthand neighbor.

(In the rule and program above, the parentheses around "1", "2", "k", and "k-1" denote subscripting.)

The action is usually:

    GIVE( [ LEFT: LEFT  POS: the_righthand_part_of_speech  SEM: //...\\ ] );

This is so if the righthand side of the rule is of the form:

    < ID : STATEMENT >

This action now executes in the context where LEFT is the lefthand neighbor of the matched want-phrase.

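The nested FOR quantifiers above can be sketched as a walk leftward from the given PHRASE block. This is an illustrative Python model, not the book's implementation: the `Phrase` class and `match_want_phrase` are hypothetical, an empty list stands for NIL, and each returned value is what LEFT would be set to for one match.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Phrase:
    pos: str
    left: List["Phrase"] = field(default_factory=list)  # [] stands for NIL


def match_want_phrase(p: Phrase, want: List[str]) -> List[List[Phrase]]:
    """Return one LEFT value per way `want` matches a phrase ending at p."""
    lefts: List[List[Phrase]] = []

    def walk(block: Phrase, k: int) -> None:
        if block.pos != want[k]:          # WITH P(k).POS = POS(k)
            return
        if k == 0:
            lefts.append(block.left)      # LEFT:= P(1).LEFT
            return
        for prev in block.left:           # FOR P(k-1) $E P(k).LEFT
            walk(prev, k - 1)

    walk(p, len(want) - 1)
    return lefts


# A phrase  X EXPR BOP EXPR,  built right-to-left through the left links.
before = Phrase("X")
a = Phrase("EXPR", [before])
op = Phrase("BOP", [a])
b = Phrase("EXPR", [op])
lefts = match_want_phrase(b, ["EXPR", "BOP", "EXPR"])
```

Here the single match yields the CHOICE_OF_PHRASES holding the "X" block, the left neighbor in whose context "some_action" would then run.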
As always, C holds the rightmost PHRASE block of the matched phrase (Section 12.3.1). GIVE will put the newly generated phrase's rightmost block also onto C. Thus, the newly generated unit-length phrase ~shares ~the ~same ~span (LEFT and C) with the matched want-phrase.

BOX:  What do the variables LEFT and C signify?
BOX:
BOX:  Why does the generated give-phrase (in a context-free
BOX:  rule) share the same span with the matched occurrence
BOX:  of the rule's want-phrase?

25.4.2 Phrases Of Length Greater Than One

The treatment of the global variables LEFT and C that we saw with general rewrite rules (Section 12.5) and with the taking of user input (Section 12.4) is now rendered implicit with a new notation.

25.4.2.1 Brief Notation For Setting LEFT and C

STATEMENTs may be combined in a way other than sequential execution. The "-" may be used to combine STATEMENTs, as supported by the rule:

    STATEMENT - STATEMENT - ... - STATEMENT   ->   STATEMENT

The dashes deal with the variables LEFT and C. LEFT and C are set up especially for each of the individual STATEMENTs. This dash notation is meant to be used for generating phrases of length greater than one. For example:

    <POS(1)> - <POS(2)>

combines with a dash the two STATEMENTs:

    <POS(1)>     and     <POS(2)>

This forms a phrase of length two.

(Parentheses in the previous paragraph denote subscripting.)

A general rewrite rule is actually written as:

    <POS(1)> ... <POS(k)>   ->   <GIVE_POS(1)> - ... - <GIVE_POS(n)>

where dashes separate the phrase elements on the righthand side of the rule. This is our official notation for general rewrite rules (rather than the dashless notation used in Section 1.3).

(In the rule above, the parentheses around "1", "k", and "n" denote subscripting.)

How is the rule:

    STATEMENT - STATEMENT - ... - STATEMENT   ->   STATEMENT

implemented? It translates (see Section 12.5) to the following.
(Back there, each STATEMENT was a call to GIVE):

    HOLDING  LEFT;    "(Preserve LEFT so that it looks like we never modified it)"
    DO    RIGHTHAND:= C;    "Remember present C, for our final STATEMENT"
          C:= NIL;
          ~STATEMENT(1)
          LEFT:= C;  C:= NIL;
          ~STATEMENT(2)
          LEFT:= C;  C:= NIL;
          ~STATEMENT(3)
          ...
          LEFT:= C;  C:= NIL;
          ~STATEMENT(n-1)
          LEFT:= C;  C:= RIGHTHAND;
          ~STATEMENT(n)
    ENDHOLD

Assuming that each of the STATEMENTs is of the form:

    < ID : STATEMENT >

then the overall effect is to generate the entire phrase (connected by dashes) with LEFT as its left neighbor, where the rightmost phrase block is appended onto C.

(In the program above, the parentheses around "1", "2", "3", "n-1", and "n" denote subscripting.)

That is, where we wrote:

    < ID : STATEMENT >

to generate a phrase of length one, spanning from LEFT to C, we can now write:

    <ID : STATEMENT> - <ID : STATEMENT> - ... - <ID : STATEMENT>

to generate a phrase of length greater than one, also spanning from LEFT to C.

NOTE: STATEMENTs connected with dashes bind together ~before STATEMENTs separated by nothing (our usual way for putting STATEMENTs together). Thus,

    <A> <B> <C> - <D> <E>

groups as:

    <A> <B> (<C> - <D>) <E>

Also:

    <A> - <B> <C> - <D> - <E> <F> - <G>

groups as:

    (<A> - <B>) (<C> - <D> - <E>) (<F> - <G>)

BOX:  How do you combine STATEMENTs so as to generate a
BOX:  phrase of length greater than one?
BOX:
BOX:  Is the implementation identical to how we generate
BOX:  the give-phrase of a general rewrite rule?

25.4.3 Ambiguous Phrase Generation: Phrase Generation Without The Dashes
Figure 25.2(a) shows what is built when you execute the statement:

    <X> - <Y> - <Z>

Figure 25.2(b) shows what:

    <X> <Y> <Z>

generates. With the dashes, we get a left-to-right phrase. Without the dashes, the three short phrases become one ambiguous phrase. They all become members of the same CHOICE_OF_PHRASES. This is natural because the dashless STATEMENT calls GIVE once per short phrase, without touching either C or LEFT. GIVE puts all its given phrases onto the same CHOICE_OF_PHRASES, C.
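The difference between the dashed and dashless forms can be sketched by modelling LEFT and C directly. What follows is a simplified Python illustration under stated assumptions: `Phrase` and `Generator` are hypothetical names, an empty list stands for NIL, and `give` merely appends to C without applying any grammar.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(eq=False)
class Phrase:
    pos: str
    left: List["Phrase"] = field(default_factory=list)  # [] stands for NIL


class Generator:
    """Holds the two implicit variables; no grammar is applied here."""

    def __init__(self) -> None:
        self.LEFT: List[Phrase] = []
        self.C: List[Phrase] = []

    def give(self, pos: str) -> None:
        # <pos> : a unit-length phrase whose left neighbor is LEFT, put onto C.
        self.C.append(Phrase(pos, self.LEFT))

    def dashed(self, *statements: Callable[[], None]) -> None:
        saved_left, righthand = self.LEFT, self.C    # HOLDING LEFT; RIGHTHAND:= C
        try:
            for i, statement in enumerate(statements):
                if i > 0:
                    self.LEFT = self.C               # LEFT:= C
                last = i == len(statements) - 1
                self.C = righthand if last else []   # C:= RIGHTHAND, else C:= NIL
                statement()
        finally:
            self.LEFT = saved_left                   # ENDHOLD


# <X> - <Y> - <Z> : one phrase of length three
g = Generator()
g.dashed(lambda: g.give("X"), lambda: g.give("Y"), lambda: g.give("Z"))

# <X> <Y> <Z> : three unit-length phrases sharing one span
h = Generator()
h.give("X")
h.give("Y")
h.give("Z")
```

After the dashed form, `g.C` holds only the "Z" block, reachable leftward through "Y" and then "X"; after the dashless form, `h.C` holds all three blocks side by side as one CHOICE_OF_PHRASES.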
For another example, figure 25.3 shows what is built upon the execution of:

    <A> <B> <C> - <D> <E>
Figure 25.4 shows:

    <A> - <B> <C> - <D> - <E> <F> - <G>

Now suppose the procedures X and Y are defined by:

    DEFINE  X:    <A>          ENDDEFN
    DEFINE  Y:    <B> - <C>    ENDDEFN

The STATEMENT:

    X;  Y;

generates an ambiguous phrase consisting of the phrase "<A>" and the phrase "<B><C>". In contrast:

    X;  -  Y;

generates one phrase, "<A><B><C>". (The semicolons are part of the atomic STATEMENTs, as:

    X;

is the notation for calling procedure X).

In general, the normal ~sequential execution of phrase generations "<...>" offers up the phrases as ambiguous, multiple interpretations sharing the same span (LEFT and C). We will see examples of phrase generation from within the semantics of syntax rules in the next chapter.

BOX:  What kind of phrase is generated by a sequence of
BOX:  STATEMENTs without the dashes?
BOX:
BOX:  Can phrase generation be packaged in procedures?

25.5 Two Kinds Of Ambiguity At Once - The Passage Of Ambiguity From One Domain To The Next

Ambiguities can arise from the syntax grammar and also from the datatype grammar. Both kinds of ambiguity manifest themselves in the same way. They result in ambiguous datatype phrases.

25.5.1 Datatype Ambiguities

25.5.1.1 Ambiguities From Coercion

Let's first examine ambiguities that arise from the datatype grammar alone. For example, the phrase:

    1

is turned into the datatype phrase:

    ~INT

The coercion from INT to REAL:

    ~INT   ->   ~REAL
applied to the datatype phrase gives rise to the leftmost two blocks in figure 25.5. Here we have an ambiguity between INT and REAL: The "1" is either an INT or REAL.

25.5.1.2 Ambiguities From Polymorphism

Ambiguities may arise from polymorphism as well as coercion. We just saw one introduced by a coercion. Here is an example that involves "+"'s polymorphism. Assume we have only the following rules:

    ~INT   ~INT   +     ->   ~INT
    ~REAL  ~REAL  +     ->   ~REAL
    ~INT                ->   ~REAL

The phrase:

    1 + 2

which gives rise to the datatype phrase:

    ~INT  ~INT  +

can be interpreted as a REAL via ~two means.
Consider first figure 25.6(a). Each of the two INTs is independently turned into a REAL, and then the rule:

    ~REAL  ~REAL  +     ->   ~REAL

applies to give an overall REAL interpretation. This is called the ~two-coercion interpretation, because a coercion is applied to each INT.

Another scenario also occurs. "+" is polymorphic because it will combine INTs or it will combine REALs. Our second scenario, shown in figure 25.6(b), has "+" combining the INTs to produce an INT. That ~one combined INT is then turned into a REAL. This is the ~one-coercion interpretation.
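The two interpretations can be reproduced with a small exhaustive rewriter. This Python sketch is an assumption-laden stand-in for the datatype grammar, not the book's GIVE mechanism: it indexes phrases by token position, records each derivation as a string, and assumes the rules contain no cycles (otherwise the loop would not terminate). The names `matches`, `parse`, and the derivation format are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Rule = Tuple[Tuple[str, ...], str]            # (want-phrase, give part-of-speech)
Chart = Dict[Tuple[int, int, str], Set[str]]  # (start, end, pos) -> derivations


def matches(chart: Chart, want: Tuple[str, ...], start: int) -> List[Tuple[int, List[str]]]:
    """Every (end, derivations) way to read `want` left to right from `start`."""
    if not want:
        return [(start, [])]
    found = []
    for (s, e, pos), derivations in chart.items():
        if s == start and pos == want[0]:
            for rest_end, rest in matches(chart, want[1:], e):
                for d in derivations:
                    found.append((rest_end, [d] + rest))
    return found


def parse(tokens: List[str], rules: List[Rule]) -> Chart:
    chart: Chart = {}
    for i, token in enumerate(tokens):
        chart.setdefault((i, i + 1, token), set()).add(token)
    changed = True
    while changed:                      # apply rules until nothing new appears
        changed = False
        for want, give in rules:
            for start in range(len(tokens)):
                for end, derivations in matches(chart, want, start):
                    new = f"{give}<{' '.join(derivations)}>"
                    slot = chart.setdefault((start, end, give), set())
                    if new not in slot:
                        slot.add(new)
                        changed = True
    return chart


rules: List[Rule] = [
    (("INT", "INT", "+"), "INT"),
    (("REAL", "REAL", "+"), "REAL"),
    (("INT",), "REAL"),
]
chart = parse(["INT", "INT", "+"], rules)
real_meanings = chart[(0, 3, "REAL")]   # derivations of the full-spanning REAL
```

In this model the full-spanning REAL carries exactly two derivations: `REAL<REAL<INT> REAL<INT> +>` (two coercions) and `REAL<INT<INT INT +>>` (one coercion).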
Both these scenarios happen at once, giving rise to figure 25.7. The full-spanning REAL has two meanings as just discussed. (Figure 28.2 may illustrate these two meanings more clearly).

Later on, when we invoke the semantics of the full-spanning REAL, both meanings will be considered. A process described in Chapter 28 will ultimately choose the one-coercion solution over the two-coercion solution.

This concludes our examples of ambiguities within the datatype language alone.

25.5.2 What Happens To Syntactic Ambiguities?

Recall from Section 15.2 that syntactic ambiguities manifest themselves as ambiguous semantics. For example, the syntactically ambiguous phrase:

    A + B # C
delivers ambiguous semantics, as shown in figure 25.8(a).

Consider the leftmost "+" block in the figure. It is the "+" in "A+B". As figure 25.8(b) shows, that semantic block generates the reverse polish phrase:

    A  B  +

where A and B (shown surrounded by circles) denote the ~types of the variables A and B.

Figure 25.8(b) shows with each semantic block the reverse polish phrase it generates. For example, the "(A+B)#C" block generates:

    A  B  +  C  #

by invoking its sub-block (A+B)'s semantics, which generate the "A B +". It then invokes its second sub-block (C), which generates the second part of the phrase, the C. Finally, it appends the "#" to complete the reverse polish phrase.

Similarly, the other choice, "A+(B#C)", generates the phrase:

    A  B  C  #  +

~Our ~syntactic ~ambiguity ~gives ~rise ~to ~the ~ambiguous ~datatype ~phrase:

    A  B  +  C  #
        ~or
    A  B  C  #  +

Let's go on to see how this ambiguous phrase parses in the datatype grammar.

25.5.3 Parsing By The Datatype Grammar
Figure 25.9(a) shows the ambiguous phrase if A, B, and C are all of type REAL. Part (b) shows what is generated as a by-product of the grammar. The datatype rules relevant to this example are:

    1)    ~REAL   ~REAL   +     ->   ~REAL
    2)    ~POINT  ~POINT  +     ->   ~POINT
    3)    ~REAL   ~REAL   #     ->   ~POINT

("+" works on POINTs as well as REALs, and "#" combines two REALs to form a POINT).

The two phrases making up the ambiguous phrase may or may not individually parse successfully. For example, suppose the types of A, B, and C are all REALs. The two phrases appear as:

    ~REAL  ~REAL  +  ~REAL  #
        and
    ~REAL  ~REAL  ~REAL  #  +

The first phrase reduces to the following, by rule #1:

    ~REAL  ~REAL  #

Rule #3 reduces this successfully to a full-spanning:

    ~POINT

In contrast, our second phrase can be partially parsed, via rule #3, to:

    ~REAL  ~POINT  +

This final form can be reduced no further. No full-spanning type can be acquired from this phrase with our rules (barring other coercions).

We've just seen how a syntax ambiguity generates multiple datatype phrases, some of which might die, being unable to be rendered as a full-spanning type. Neither, either, or both phrases may succeed. We've seen one phrase succeed and the other fail.

BOX:  If A were of type POINT, what phrase(s) would
BOX:  survive?

If the types of A, B, and C were as follows:

    A is POINT
    B and C are REALs

then the ambiguous phrase would be:

    ~POINT  ~REAL  +  ~REAL  #
        ~or
    ~POINT  ~REAL  ~REAL  #  +

The latter phrase would be the only survivor (as the "REAL REAL #" goes to POINT).

If A, B, and C were all POINTs, neither phrase would survive. If A were itself ambiguous, either a REAL or a POINT, while B and C were still REALs, ~both phrases would survive.

25.5.4 Under Syntax, What Does The Semantic ~OR Block Do?

In Section 15.2 we characterized the action performed by the ~OR block as being simply unspecified, a function by the name of F. We are now in a position to choose a definition for F.
We want the ~OR block to generate an ambiguous phrase, the ambiguous combination of the phrases generated by each of OR's two components.

We've seen already how to generate ambiguous phrases: Just generate one phrase and then generate the other phrase. This is two STATEMENTs executed sequentially, with no dash connecting them. Thus, we define F via:

    DEFINE  F( A,B: BASIC_PROCESS ):
            <*A*>;
            <*B*>;
    ENDDEFN

F invokes each of its given components, A, and B, so that each contributes a phrase, where both phrases share the same span (LEFT and C).

In figure 25.8(b), the OR-block invokes each of its constituents, and thus generates the ambiguous phrase:

    A  B  +  C  #
        ~or
    A  B  C  #  +

Figure 25.9 shows the generated phrase, if A, B, and C are REALs.

Thus, both syntactic and datatype ambiguities come together in one formalism. Syntactic ambiguities generate ambiguous datatype phrases, as also does the datatype grammar on its own. All ambiguities ultimately reside in the datatype grammar.

BOX:  How can ambiguities occur by the datatype grammar
BOX:  alone?
BOX:
BOX:  How does a syntactic ambiguity manifest itself in the
BOX:  datatype language?
BOX:
BOX:  What does the OR-block do during the execution of
BOX:  syntax's semantics?

25.6 Picking Up The Generated Phrases As A CHOICE_OF_PHRASES

We generate phrases via STATEMENTs, as just shown. Once phrases are generated, C holds all the phrases as a single CHOICE_OF_PHRASES.

We introduce two notations that turn ~any STATEMENT into a CHOICE_OF_PHRASES. One of them is:

    [-> STATEMENT <-]     ->     EXPR    (a CHOICE_OF_PHRASES)

This basically executes the STATEMENT and then yields the value (the generated phrases in C). This notation translates simply to:

    HOLDING  C:= NIL;
    GIVE     DO    ~the_STATEMENT
             GIVE  C
    ENDHOLD

The STATEMENT is executed in a context where C is set to NIL initially. All the phrases generated, which wind up on C, are the value delivered by this EXPR. (If the STATEMENT performs no phrase generation, then this EXPR yields NIL).

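The HOLDING translation of "[-> STATEMENT <-]" amounts to saving C, running the STATEMENT against a fresh C, and restoring the old C afterwards. A minimal Python sketch follows; `Phrase`, `Generator`, and `collect` are hypothetical names, an empty list stands for NIL, and no grammar is applied by `give`.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(eq=False)
class Phrase:
    pos: str
    left: List["Phrase"] = field(default_factory=list)  # [] stands for NIL


class Generator:
    def __init__(self) -> None:
        self.LEFT: List[Phrase] = []
        self.C: List[Phrase] = []

    def give(self, pos: str) -> None:
        self.C.append(Phrase(pos, self.LEFT))

    def collect(self, statement: Callable[[], None]) -> List[Phrase]:
        # Rough analogue of  [-> STATEMENT <-]
        saved_c = self.C          # HOLDING C
        self.C = []               # C:= NIL
        try:
            statement()           # the_STATEMENT
            return self.C         # GIVE C
        finally:
            self.C = saved_c      # ENDHOLD


g = Generator()
g.give("OUTER")                   # something already sitting on C


def two_phrases() -> None:
    g.give("A")
    g.give("B")


collected = g.collect(two_phrases)
nothing = g.collect(lambda: None)
```

The collected value contains only the phrases generated inside the brackets, and the surrounding C is left as it was, which is the sense in which the notation makes no mention of C.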
This new EXPR depends on the variable LEFT. Whatever happens to be in LEFT upon commencement of this EXPR's execution will be the left endpoint for all full-spanning phrases.

Our second notation will be the one we use here almost exclusively:

    [> STATEMENT <]     ->     EXPR

This executes the STATEMENT, and yields ~almost all of C as its result. It yields only the ~full-spanning phrases in C that are of ~length ~one. Its translation follows:

    HOLDING  C:= NIL;  LEFT:= NIL;
    GIVE     DO    ~the_STATEMENT
             GIVE  C pruned so as to hold full-spanning
                   unit-length phrases only
    ENDHOLD

Example: The STATEMENT:

    <INT>

generates a CHOICE_OF_PHRASES (on C) that includes at least an <INT>. If the coercion from INT to REAL is a rule in our datatype grammar, then C will also contain a <REAL>, sharing the same span as the <INT>. Let's enclose this STATEMENT within the "[>...<]":

    [> <INT> <]

This EXPR is of type CHOICE_OF_PHRASES, and the value of this EXPR is the CHOICE_OF_PHRASES that arises on C upon generating the phrase <INT>. That is, this EXPR is a CHOICE_OF_PHRASES containing both <INT> and <REAL>.

Example: While the STATEMENT:

    <A> <B> <C> - <D> <E>

~generates the ambiguous phrase in figure 25.3, the EXPR:

    [-> <A> <B> <C> - <D> <E> <-]

delivers that generated phrase as a CHOICE_OF_PHRASES, with no mention made of our variable C. Our more popular operation with "[>" instead of "[->":

    [> <A> <B> <C> - <D> <E> <]

delivers only the full-spanning phrases A, B, and E. (Conceivably, the "<C>-<D>" phrase, with a grammar active, could produce other full-spanning phrases, which would then be included).

Example: Given the rules:

    ~INT   ~INT   plus     ->   ~INT
    ~REAL  ~REAL  plus     ->   ~REAL
    ~INT                   ->   ~REAL

the CHOICE_OF_PHRASES delivered by:

    [-> <INT> - <INT> - <plus> <-]

is shown in figure 25.7. The one delivered by:

    [> <INT> - <INT> - <plus> <]

consists of only two phrases, the full-spanning INT and REAL.

BOX:  What notations do we have that collect up the phrases
BOX:  generated by the execution of a STATEMENT?
BOX:
BOX:  The expression:
BOX:
BOX:      [> <INT> <]
BOX:
BOX:  represents what CHOICE_OF_PHRASES, assuming we have
BOX:  the INT-to-REAL coercion?
BOX:
BOX:  Because:
BOX:
BOX:      <INT>
BOX:
BOX:  is a STATEMENT, what effect does it have?

25.7 The WANT Quantifier

We now address the last notation required in order to present a language involving two domains, in our case, syntax and datatypes.

Suppose we have a CHOICE_OF_PHRASES and we want to find, say, a full-spanning BOOLean. This need may arise in the implementation of the syntax rule:

    while  EXPR  ;     ->     QUANTIFIER

For this quantifier to be meaningful, the EXPR must be of type BOOL.

We can invoke that EXPR's semantics so as to generate all possible types (phrases) that the EXPR could be interpreted as. Then, using the notation "[>...<]", we obtain a CHOICE_OF_PHRASES that has all the full-spanning types of that EXPR. We now are interested in that full-spanning BOOL within the CHOICE_OF_PHRASES.

Suppose X is the CHOICE_OF_PHRASES. The ~new quantifier:

    WANT  <BOOL:B>  FROM  X ;

causes an iteration for each full-spanning BOOL in X. On each of those iterations, it sets B to the semantics of the matched BOOL.

(According to Section 14.3, there can be at most one full-spanning BOOL, as duplicate phrases are suppressed. This instance of the WANT quantifier thus always causes zero or one iterations. It causes zero iterations if there is no full-spanning BOOL, and one iteration otherwise).

This WANT quantifier translates into our more familiar FOR-quantifier:

    FOR P $E X;  WITH  P.POS = BOOL  &  -DEFINED( P.LEFT ) ;
    EACH_DO  b:= P.SEM ;  ;

It looks at all phrase blocks (P) in the CHOICE_OF_PHRASES X, and causes an iteration for each phrase block whose part-of-speech is BOOL, and which is full-spanning (the "-DEFINED(P.LEFT)"). It also sets the matched phrase block's semantics into the variable B (the EACH_DO). (The "P.SEM" here may be transformed to "<*P.SEM*>" if context demands that).

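Read as a filter, the single-token WANT quantifier is a loop over the CHOICE_OF_PHRASES that keeps only full-spanning blocks of the wanted part-of-speech. The Python sketch below illustrates that reading; the `Phrase` class, the `want` generator, and the sample semantics strings are hypothetical, and an empty `left` list stands for an undefined P.LEFT.

```python
from dataclasses import dataclass, field
from typing import Any, Iterator, List


@dataclass(eq=False)
class Phrase:
    pos: str
    sem: Any = None
    left: List["Phrase"] = field(default_factory=list)  # [] stands for NIL


def want(pos: str, choice: List[Phrase]) -> Iterator[Any]:
    """Rough analogue of  WANT <pos:B> FROM choice ;  yielding B per iteration."""
    for p in choice:                        # FOR P $E X;
        if p.pos == pos and not p.left:     # WITH P.POS = pos & -DEFINED(P.LEFT)
            yield p.sem                     # EACH_DO B:= P.SEM


prefix = Phrase("INT", "prefix")
x = [
    Phrase("INT", "as-int"),
    Phrase("BOOL", "as-bool"),
    Phrase("BOOL", "partial", left=[prefix]),   # not full-spanning
]
bools = list(want("BOOL", x))
reals = list(want("REAL", x))
```

Only the full-spanning BOOL is delivered; the BOOL that has a left neighbor is skipped, and a part-of-speech absent from X causes zero iterations.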
The WANT quantifier is supported by the following syntax rule:

    want  TOKEN  from  EXPR  ;     ->     QUANTIFIER

The TOKEN is just like that admitted on the lefthand side of a rule. The EXPR must be of type CHOICE_OF_PHRASES.

25.7.1 WANT Quantifier Can Involve Array Parts-Of-Speech

Recall that a TOKEN can be of the form:

    < ID [ ID ] : ID >

as in:

    <EXPR[I]:X>     or     <TYPES[T1]:X>

That is, for matching array parts-of-speech, the index variable (the ID enclosed in "[...]") is also set upon each match.

For example, let's declare an array part-of-speech we will be using for datatype processing:

    POS  TYPES[10000] : - ;

We will agree that any datatype is an instance of this TYPES array of parts-of-speech. Therefore, the quantifier:

    WANT  <TYPES[T]:S>  FROM  X ;

may cause more than one iteration. Any part-of-speech that is a member of the TYPES array of parts-of-speech will be matched, and T will be set to its index, a different index for each match.

We will see examples of this use of the WANT quantifier in the next chapter.

25.7.2 WANT Can Match A Phrase

In general, the WANT quantifier can specify not only one TOKEN to match, but a left-to-right sequence of TOKENs to be matched:

    want  TOKEN TOKEN ... TOKEN  from  EXPR  ;     ->     QUANTIFIER

For example, consider:

    WANT  <REAL:a> <INT:b> <plus>  FROM  X ;

This will cause an iteration for each full-spanning occurrence of the phrase:

    <REAL> <INT> <plus>

There will be one match if X is figure 25.7.

25.7.3 WANT Can Recognize Non-Full-Spanning Phrases

Finally, the WANT quantifier can be used even to find non-full-spanning phrases. The left side of the phrase to match may be augmented via the syntax:

    ID --

as in:

    WANT  L-- <REAL:a> <INT:b> <plus>  FROM  X ;

Now all occurrences of the phrase:

    <REAL> <INT> <plus>

emanating from X will be matched. Upon each match, the new variable L is set to the left neighbor of the matched phrase.

Now that we can capture the left neighbor of a matched phrase, we can note an equivalence.
For example, the quantifier:

    WANT  <REAL:a> <INT:b> <plus>  FROM  X ;

can be written equivalently as:

    WANT  L1-- <plus>  FROM  X ;
    !!
    WANT  L2-- <INT:b>  FROM  L1 ;
    !!
    WANT  <REAL:a>  FROM  L2 ;

Each of these three quantifiers matches one token of the original match phrase, working from right to left. The first two tokens matched are not required to be full-spanning, as the notation:

    ID --

occurs in the first two quantifiers. Only the last quantifier, which matches the leftmost token, demands a full-spanning match. Notice how each quantifier reads the left neighbor set by the previous quantifier (e.g., L1 and L2).

BOX:  Of what use is the WANT quantifier?
BOX:
BOX:  How do you specify a WANT quantifier that matches
BOX:  phrases of lengths greater than one?
BOX:
BOX:  How do you get WANT to match non-full-spanning phrases?
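The equivalence between a multi-token WANT and a chain of single-token WANTs can be checked on a small model. The Python below is a sketch under assumptions: `Phrase`, `want_one`, and `want_phrase` are hypothetical names, an empty `left` list stands for NIL, and the `full_spanning` flag plays the role of omitting the "ID --" prefix.

```python
from dataclasses import dataclass, field
from typing import Any, Iterator, List, Tuple


@dataclass(eq=False)
class Phrase:
    pos: str
    sem: Any = None
    left: List["Phrase"] = field(default_factory=list)  # [] stands for NIL


def want_one(pos: str, choice: List[Phrase],
             full_spanning: bool = True) -> Iterator[Tuple[Any, List[Phrase]]]:
    """One token; yields (semantics, left neighbor) per match."""
    for p in choice:
        if p.pos == pos and (not full_spanning or not p.left):
            yield p.sem, p.left


def want_phrase(poses: List[str], choice: List[Phrase],
                full_spanning: bool = True) -> Iterator[Tuple[List[Any], List[Phrase]]]:
    """A left-to-right token sequence; yields (semantics list, left neighbor)."""
    *front, last = poses
    # Only the leftmost token may be required to be full-spanning.
    for sem, left in want_one(last, choice, full_spanning and not front):
        if not front:
            yield [sem], left
        else:
            for sems, far_left in want_phrase(front, left, full_spanning):
                yield sems + [sem], far_left


# The phrase <REAL> - <INT> - <plus>, with semantics "a" and "b" on the operands.
real = Phrase("REAL", "a")
integer = Phrase("INT", "b", [real])
plus = Phrase("plus", None, [integer])
x = [plus]

direct = [sems for sems, _ in want_phrase(["REAL", "INT", "plus"], x)]
chained = [
    [a, b, p]
    for p, l1 in want_one("plus", x, full_spanning=False)    # WANT L1-- <plus> FROM X
    for b, l2 in want_one("INT", l1, full_spanning=False)    # WANT L2-- <INT:b> FROM L1
    for a, _ in want_one("REAL", l2, full_spanning=True)     # WANT <REAL:a> FROM L2
]
```

Both forms find the same single match, with the chained form threading each captured left neighbor into the next quantifier.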