CHAPTER 29


				   SEMANTICS EXPRESSED DIRECTLY

					  IN THE NEW LANGUAGE



	  Often it is easiest to augment a language by introducing new rules
	  whose semantics are specified in the new language itself.	 For example,
	  suppose we wanted to introduce a new rule:

				EXPR	squared		->	  EXPR

	  (ICL doesn't already support this rule.  In ICL, you'd have to
	  introduce a "\" in front of "squared").

	  This new syntax rule is easiest to specify via something like:

		    <EXPR:E>  SQUARED	    ->	EXPR
								 by
								E*E

	  Here, the semantics, "E*E", is specified in the new language.
	  Presumably, multiplication is already part of the new language.

	  We call this the "by" notation for rules.  (The use of this method
	  requires that a portion of the language already be implemented, by
	  means shown earlier).

	  Contrast this to the following specification, which is of the sort
	  we've been using:

		<EXPR:E>  SQUARED	->

		   <EXPR:
			WANT  <INT:I>  FROM  [>  <*E*>;  <] ;
			DO
				<INT:
				     RESIST =>  I
				     USUAL  =>  "Get E into register 1..."
						<*I*> ;
						MUL(1,1);  "Leave the square in
							    register 1"
				>
			END
		   >

	This complex specification first insists that the EXPR E can be seen
	as an INT (the WANT quantifier).  It then produces an INT whose
	resistance is that of E (I), and which generates machine language code
	that leaves the answer in register 1.

	The two specifications for this rule produce slightly different
	behaviors.  The latter, long specification defines SQUARED only for the
	type INT.  It is not defined for REALs, for example.  In contrast, the
	former brief specification works for REALs as well.  The EXPR before
	the SQUARED can be any type for which multiplication is defined.
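
	As a rough analogy only (this is Python, not ICL, and the names
	squared_by and squared_int_only are hypothetical), the difference in
	generality between the two specifications might be sketched as:

```python
def squared_by(e):
    # Analogue of the brief "by" rule: the meaning is just E*E, so it is
    # defined for any type for which "*" is defined.
    return e * e


def squared_int_only(e):
    # Analogue of the long specification: it first insists that E can be
    # seen as an INT (the role played by the WANT quantifier).
    if not isinstance(e, int):
        raise TypeError("SQUARED is defined only for INT in this version")
    return e * e
```

	Here squared_by accepts a REAL-like value such as 1.5, while
	squared_int_only rejects it.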


29.1
Another Example

	Another, more complex example of specifying semantics in the new
	language is the following rule that defines the WANT quantifier
	(Section 25.7):

		WANT  '<'  <ID:pos>  ':'  <ID:sem>  '>'  FROM  <EXPR:from>  ';'

		->  QUANTIFIER
			by

			FOR X $E ~from;  WITH  X.POS = ~pos ;
					 EACH_DO  ~sem := X.SEM ; ;

	We have rendered in lower case the variables ~from, ~pos, and ~sem,
	the semantic variables specified on the first line.  They correspond
	to the variable E in our previous example.

		BOX:	What are advantages of specifying semantics in the new
		BOX:	language itself?


29.2
Implementation Of Rules With Semantics Specified In The New Language

	Upon introducing this new rule, take the semantic specification
	(following the word ~by) and parse it with the syntax grammar.  Choose
	the full-spanning interpretation whose part-of-speech matches that
	given to the right of the "->".

	In our first example, we look for an EXPR interpretation over the
	semantic specification "E*E".  In our latter example, we look for a
	QUANTIFIER interpretation for the semantic specification following
	the "by".

	For this parsing through the syntax grammar, we treat occurrences of the
	semantic variables specially.  The "E" in "E*E" is ~not treated as an
	IDentifier, which it normally would be.  This "E" denotes an <EXPR>
	directly because "E" appears on the first line as a semantic variable
	of an <EXPR>.

	That is, each appearance of a semantic variable in the semantic
	specification is replaced by the part-of-speech with which that
	variable is associated in the first line.

	Thus, the semantic specification:

		E*E

	generates for parsing the string:

		<EXPR> * <EXPR>

	Similarly, in our QUANTIFIER example, the string parsed is:

		FOR X $E <EXPR> ;  WITH  X.POS = <ID> ;
				   EACH_DO  <ID> := X.SEM ; ;

	Each semantic variable has been substituted by the part-of-speech it
	represents.
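
	The substitution step can be sketched in Python.  This is a minimal
	illustration, not ICL's implementation: the helper names
	parts_of_speech and substitute are hypothetical, and we assume that
	semantic variables appear on the first line as <POS:var> and that a
	leading "~" in the specification is only typographic emphasis.

```python
import re


def parts_of_speech(lhs):
    """Map each semantic variable to its part-of-speech, read from the
    <POS:var> items on the rule's first line."""
    return {var: pos for pos, var in re.findall(r"<(\w+):(\w+)>", lhs)}


def substitute(spec, variables):
    """Replace each semantic variable in the specification by <POS>.
    Words that are not semantic variables are left untouched."""
    def swap(match):
        word = match.group(0).lstrip("~")
        if word in variables:
            return "<" + variables[word] + ">"
        return match.group(0)
    return re.sub(r"~?\w+", swap, spec)


squared_vars = parts_of_speech("<EXPR:E>  SQUARED")
want_vars = parts_of_speech(
    "WANT '<' <ID:pos> ':' <ID:sem> '>' FROM <EXPR:from> ';'")

squared_string = substitute("E*E", squared_vars)
want_string = substitute(
    "FOR X $E ~from ; WITH X.POS = ~pos ; EACH_DO ~sem := X.SEM ; ;",
    want_vars)
```

	The resulting strings are what would then be handed to the syntax
	grammar.  Note that the "E" in "$E" is left alone in the QUANTIFIER
	example, because "E" is not a semantic variable of that rule.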


29.2.1
Parameterized Semantic Structures

	

	Once the parsing of the semantic specification is completed, take hold
	of the resulting semantic structure.  Figure 29.1(a) shows the semantic
	structure for "E*E".

	In general, the semantic structure can be arbitrarily large.
	It is distinguished by the fact that ~some of its leaves are ~semantic
	~variables (shown as "!"s in figure 29.1(c)).

	We have a ~parameterized semantic structure.  The parameters
	are the semantic variables.

	We call such a parameterized semantic structure the ~associated
	~semantic ~structure of this rule.


29.2.2
Where Do The Parameters Come From?

	The actual parameters for the semantic structure come from the new
	rule's lefthand side.

	  Consider our first example rule:

		    <EXPR:E>  SQUARED	    ->	EXPR
								 by
								 E*E

	  The E on the lefthand side is the one parameter.  This rule has an
	  associated semantic structure based on E, "E*E".

	  Similarly, the lefthand side of our second rule:

		    WANT  '<'  <ID:pos>	 ':'	<ID:sem>  '>'  FROM  <EXPR:from>  ';'

		    ->	QUANTIFIER
				   by
				   FOR X $E ~from ;  WITH  X.POS = ~pos ;
							   EACH_DO	~sem := X.SEM ; ;

	  supplies the parameters ~pos, ~sem, and ~from.  The associated
	  semantic structure depends on those variables.


29.2.3
How The New Rule Applies

	  The new rule applies as usual by first having the lefthand phrase
	  match some visible phrase.	That match puts semantic values
	  into those semantic variables.

	  The semantics of the righthand side part-of-speech are computed by
	  ~copying the associated semantic structure, replacing each occurrence
	  of a semantic variable with the ~value in that semantic variable.

	

	  Figure 29.2(a) shows the associated semantic structure
	  with our SQUARED rule.  Part (b) shows a copy of it where "E" is
	  replaced by the semantics of "5", which would occur upon seeing:

		    5	 SQUARED

	  Part (c) shows the copy formed upon seeing:

		    X*Y  SQUARED

	  Part (d) shows the copies with the parameters plugged in for:

		    5	 SQUARED  +	 ( X*Y  SQUARED )

	  Part (e) shows:

		    ( 5  SQUARED	+  X*Y )  SQUARED
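
	  The copy-with-parameters operation can be sketched in Python.  This
	  is an illustration under assumptions, not the book's data layout:
	  the classes Param and Node and the function instantiate are
	  hypothetical names, and a semantic structure is modeled as a simple
	  tree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Param:
    name: str            # a semantic variable, such as "E"


@dataclass(frozen=True)
class Node:
    op: str
    args: tuple


def instantiate(structure, values):
    """Copy the structure, replacing each Param leaf by the value held in
    that semantic variable.  The plugged-in values are shared, not copied."""
    if isinstance(structure, Param):
        return values[structure.name]
    if isinstance(structure, Node):
        return Node(structure.op,
                    tuple(instantiate(a, values) for a in structure.args))
    return structure     # constants and identifiers are copied as-is


# The associated semantic structure of the SQUARED rule: E*E.
SQUARED = Node("*", (Param("E"), Param("E")))


def squared(e):
    return instantiate(SQUARED, {"E": e})


x_times_y = Node("*", ("X", "Y"))
five_squared = squared(5)                               # 5 SQUARED
xy_squared = squared(x_times_y)                         # X*Y SQUARED
nested = squared(Node("+", (five_squared, x_times_y)))  # (5 SQUARED + X*Y) SQUARED
```

	  Each application of the rule yields a fresh copy, and the associated
	  semantic structure itself is never modified.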


		    BOX:	What is the ~associated ~semantic ~structure of a rule
		    BOX:	specified with the "by" notation?
		    BOX:
		    BOX:	What are the parameters of the associated semantic
		    BOX:	structure?
		    BOX:
		    BOX:	When are those parameters acquired?
		    BOX:
		    BOX:	When parsing the semantic specification (following the
		    BOX:	word "by"), each semantic variable denotes what part-
		    BOX:	of-speech?


29.2.4
Delaying The Copy Operation Until After Parsing

	  As things stand now, the associated semantic structure is copied
	  when the rule applies.  That copy expense can become a significant
	  waste of time if ~most applications of the rule fail to participate
	  ultimately in an understanding (derivation) of the string being parsed.
	  We discovered in Section 2.3.1 that delaying semantic evaluation until
	  after parsing is complete can dramatically speed up the parsing job.

	  Rather than copying the associated semantic structure when the rule
	  applies, we can copy it after the parsing is done.	Consider the
	  following rendition:

		    <EXPR:E>  SQUARED	    ->

			  <EXPR:
				    @(SELF_) := a copy of the associated semantic
						    structure, plugging in E ;

				    <* SELF_ *> ;
			  >

	

	  When this rule applies, it builds the semantic structure shown in
	  figure 29.3.  This semantic structure simply references the associated
	  semantic structure along with its parameters (e.g., E).  It takes
	  very little time to do this, as we need construct only the upper-
	  leftmost block.

	  After syntax parsing is done, the ultimate invocation of this semantic
	  structure performs our new program, shown within the <EXPR:...>.  That
	  program forms a new copy of the associated semantics with the
	  parameter(s) plugged in.  It then modifies objectively ("@") the
	  original semantic structure (SELF_) by writing over it with the newly
	  formed copy (figure 29.3(c)).  The result is our new copy now residing
	  where the original semantic structure lived.

	  Having finally modified our semantic structure so as to be the new
	  copy with parameters plugged in, we invoke it, so as to actually
	  perform our semantics.  Subsequent invocations of the semantic
	  structure will find our newly formed copy, and so will then immediately
	  invoke the copy, entirely unaware that a copy was ever made.
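
	  The delayed copy can be sketched in Python as a block that overwrites
	  itself on first invocation.  This is only an analogy: the class Sem,
	  the function apply_squared, and the counter stats are hypothetical,
	  and assigning to node.program stands in for the objective
	  modification "@(SELF_) := ...".

```python
class Sem:
    """A semantic structure: a block holding a program to run on invocation."""

    def __init__(self, program):
        self.program = program

    def invoke(self):
        return self.program()


stats = {"copies": 0}    # counts how many copies are actually made


def apply_squared(e):
    """Applying the rule builds only one cheap block.  The copy of the
    associated semantics is made later, on the first invocation."""
    node = Sem(None)

    def first_invocation():
        stats["copies"] += 1
        # Stand-in for "a copy of the associated semantic structure,
        # plugging in E": a block whose program computes E*E.
        copy = Sem(lambda: e.invoke() * e.invoke())
        node.program = copy.program      # write the copy over SELF_
        return node.invoke()             # then invoke it

    node.program = first_invocation
    return node


five = Sem(lambda: 5)
unused = apply_squared(five)   # an application that never joins a derivation
used = apply_squared(five)     # an application that is ultimately invoked
first_result = used.invoke()
second_result = used.invoke()
```

	  In this sketch the unused application never pays for a copy, and the
	  used application pays for it exactly once.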


29.3
Syntactic Definitions Are Different From General Rewrites

	  What is the difference between the two rules:

		    <EXPR:E>  SQUARED	    ->	EXPR
								 by
								 E*E
	  and
		    <EXPR:E>  SQUARED	    ->	<EXPR:E> - '*' - <EXPR:E>

	  The first rule is a context-free rule that rewrites to <EXPR> directly.
	  The second rule is a general rewrite rule that rewrites to
	  "<EXPR> * <EXPR>", which is then rewritten during parsing to a single
	  <EXPR>.

	  The former rule is more efficient because the parsing of "E*E" has
	  already occurred, prior to commencing syntax parsing.  (That parsing
	  occurred only once, when we introduced this rule).  The latter rule,
	  each time it applies, produces the "<EXPR> * <EXPR>", which must then
	  be parsed into an <EXPR> each time.

	  Besides efficiency, the two rules can produce different results.
	  For example, consider the expression:

		    5	 SQUARED  ^	 13

	  (The "^" is exponentiation, which binds before multiplication ("*")).

	  With the first rule, this expression is interpreted as:

		    (5*5) ^ 13

	  But the second rule comes up with the interpretation:

		    5 * (5 ^ 13)


	  That latter, general rewrite rule offers up the phrase:

		    EXPR * EXPR

	  which together with the "^13" looks like:

		    EXPR * EXPR ^ EXPR

	  The two EXPRs surrounding the "*" can react with other phrases, like
	  the "^EXPR".  Because the "^" binds before "*", we get the overall
	  interpretation:

		    EXPR * (EXPR ^ EXPR)


	  In contrast, the former rule never offers up the phrase:

		    EXPR * EXPR

	  It offers only:

		    EXPR

	  which ~inside has the "*".	The multiplication is thus not available
	  for consideration within the syntax parsing.	The "^" cannot separate
	  the two EXPRs combined by the "*".

	  The former rule, the context-free rule, probably implements SQUARED
	  the expected way.  The meaning of SQUARED keeps its integrity and
	  can't be broken apart with the context-free rule.  However, in some
	applications, we might really want the behavior caused by the general
	rewrite rule.  There is a choice.
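
	  The two behaviors can be reproduced with a small precedence parser in
	  Python.  This is a sketch under assumptions: the parser, the tuple
	  representation of trees, and the names squared_by and squared_rewrite
	  are hypothetical, and "^" is assumed to be right-associative.  A
	  token that is not an operator is treated as a finished EXPR whose
	  inside the parser cannot see.

```python
PRECEDENCE = {"+": 1, "*": 2, "^": 3}    # "^" binds before "*"


def parse(tokens):
    """Parse alternating operands and operators into nested tuples."""
    pos = 0

    def expr(min_prec):
        nonlocal pos
        left = tokens[pos]
        pos += 1
        while pos < len(tokens) and PRECEDENCE[tokens[pos]] >= min_prec:
            op = tokens[pos]
            pos += 1
            # Assumption: "^" is right-associative, the rest left-associative.
            next_prec = PRECEDENCE[op] if op == "^" else PRECEDENCE[op] + 1
            right = expr(next_prec)
            left = (op, left, right)
        return left

    return expr(1)


def squared_by(e):
    # The "by" rule offers a single finished EXPR with the "*" inside it.
    return [("*", e, e)]


def squared_rewrite(e):
    # The general rewrite offers the phrase EXPR * EXPR to the parser.
    return [e, "*", e]


by_tree = parse(squared_by(5) + ["^", 13])            # 5 SQUARED ^ 13
rewrite_tree = parse(squared_rewrite(5) + ["^", 13])  # 5 SQUARED ^ 13
```

	  With squared_by the "^" applies to the whole product, giving
	  (5*5) ^ 13.  With squared_rewrite the "^" captures the second 5
	  first, giving 5 * (5 ^ 13).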

		BOX:	How do general rewrite rules behave differently from
		BOX:	rules specified via the "by" notation?