CHAPTER 13


				THE PARSER ALWAYS WORKS



	  This entire chapter may be skipped.  It resorts to mathematical logic
	  to prove that this parser works.	It may lend clarity to the parser's
	properties and operation.

	The following arguments are meant to apply to general rewrite grammars
	as well as context-free grammars.  You might choose, however to think
	of a give-phrase as having length one.

	Let's prove that if the parser terminates, then it works.  It works
	  if the following is true:

		    If the input string is a member of the grammar's language,
		then the grammar's root part-of-speech will exist, spanning
		    the whole input string.

		    Also the opposite:	    If the input string is ~not a member
						    of the language, then there will be no
						    full-spanning occurence of the root
						    part-of-speech.

	  The appearence of the root part-of-speech can directly commence
	  semantic processing, as discussed in the Section 12.2.1.


13.1
What It Means For A CHOICE_OF_PHRASES To Be ~Fully ~Ambiguated

	  We introduce now a formalism from which we can prove that the parser
	  works.

	  A CHOICE_OF_PHRASES is ~fully ~ambiguated when all possible rules have
	  been applied to all visible phrases throughout the CHOICE_OF_PHRASES.
	  That is, all rules have applied everywhere possible in the
	  CHOICE_OF_PHRASES.

	  Let's restate this definition more specifically, in two parts:

		A CHOICE_OF_PHRASES, C, is ~fully ~ambiguated when both of
		the following are true:

		   a)	for every visible phrase emenating from C:

			    If that phrase matches some rule's want-phrase,
				    then also in C is the rule's give-phrase.

			    The want- and the give-phrases have identical
			    LEFTs (spans).

	  pic

			Figure 13.1 shows a want-phrase together with a give-
			phrase that have identical spans.

		and
		   b)	The LEFT fields in all PHRASE blocks in C are
			themselves ~fully ~ambiguated.

	For example, the righthand end of figure 12.6 shows a fully ambiguated
	CHOICE_OF_PHRASES, C.  The visible phrase

		<FORM> + <TERM>

	emenates from C.  The variables P3,P2,P1 point out this visible phrase.
	This visible phrase is an occurence of the want-phrase of the rule:

		<FORM> + <TERM>	     ->	     <FORM>

	C is fully ambiguated because this rule's give-phrase, <FORM>, also
	  emenates from C.  Both of the phrases have identical LEFTs.  Both
	  point to the left end of the figure.

	  Consider the following scenario.	Let C be a CHOICE_OF_PHRASES which is
	  fully ambiguated:

	     1)   A friend picks ~any visible phrase anywhere in C.

				The visible phrase does ~not have to emenate from C, it
				could be any visible phrase emenating from any
				CHOICE_OF_PHRASE ~accessible from C's point-of-view.

	  pic

			Figure 13.2 shows a visible phrase somewhere throughout
			C.  Your friend's phrase is labelled in the figure as
				the "want-phrase".

	     2)   You pick the CHOICE_OF_PHRASES from which that
		    visible phrase emenates.	Call it RIGHTHAND.

	     3)   RIGHTHAND is fully ambiguated, because:

				Since RIGHTHAND is accessible from C, we can get from
				C to RIGHTHAND by taking LEFT pointers, from one
				CHOICE_OF_PHRASES to the left neighboring
				CHOICE_OF_PHRASES, repeatedly.  We start at C.

				Since C is fully ambiguated, part B of the definition
				assures that each of its PHRASE block's LEFT fields
			are themselves also fully ambiguated.

			Each time we take a step leftwards, we are leaving one
			fully ambiguated CHOICE_OF_PHRASES, and entering
			another fully ambiguated CHOICE_OF_PHRASES.  Thus,
			when we reach RIGHTHAND, we have deduced that
			RIGHTHAND, like all CHOICE_OF_PHRASES encountered, is
			fully ambiguated.

	   4)	If your friend's chosen visible phrase happens to match some
		    rule's want-phrase, then the same rule's ~give-phrase ~also
		    ~appears ~in ~RIGHTHAND, by part A of the definition, applied
		    to RIGHTHAND.	 The give-phrase and the want-phrase share the
		    same span.

	  All this means is that if you see a rule's want-phrase ~anywhere
	throughout a fully ambiguated CHOICE_OF_PHRASES, that rule's give-
	  phrase also appears, sharing the same span.

	  ~Sharing ~the ~same ~span means:

		    a)	Both phrases emenate from the same CHOICE_OF_PHRASES,
	  and
		    b)	The leftmost PHRASE block in each phrase have identical
				values in their LEFT fields.

	  Figure 13.1(a) shows a want- and a give-phrase having the same span in
	  a visible phrase.  Figure 13.1(b) shows the same for a general
	  rewrite rule.  The give-phrase has length greater than one.


13.2
Full-Spanning Visible Phrases

	  Imagine the input string.  Imagine now a sequence of full-spanning
	  strings, the first of which is the input string.  Each string in the
	  sequence is identical to the previous string, except that one rule has
	  been applied.  Some occurence of this rule's want-phrase in the
	previous string has been replaced now by an occurence of the rule's
	  give-phrase.

	  Suppose a string in that sequence is a full-spanning visible phrase
	  emenating from C.  If C is fully ambiguated, then the next string in
	  the sequence also appears as a full-spanning visible phrase, emenating
	  from C.  Why?

		    The first string, a visible phrase, is a leftward going
		    sequence of PHRASE blocks.  Somewhere in that sequence is a
		    sub-sequence that represents the occurence of the want-phrase,
		    shown in figure 13.2.

		    Take the whole sequence of PHRASE blocks and form a new
		    sequence by abandoning the sub-sequence that represents the
		    occurence of the rule's want-phrase.  Take instead the
		guaranteed occurence of the give-phrase, whose span matches
		that of the want-phrase (figure 13.1(a)).

		Because the spans are identical, we know that the rightmost
		PHRASE block from each of the want-phrase and give-phrase
		reside in the same CHOICE_OF_PHRASES.  This is assured by
		part A of the definition of "fully ambiguated".  Our new
		sequence thus chooses the give-phrases's rightmost PHRASE block
		    instead of the want-phrase's rightmost PHRASE block.

		We substitute the want-phrase sub-sequence by the give-phrase
		sequence.  Since both the want- and give-phrases share the same
		LEFT, both phrases meet towards the left.  No matter which
		of the phrases you travel left from, the leftward sequence is
		identical.  Thus, this new sequence of PHRASE blocks, which
		contains the give-phrase, is itself a visible phrase.  The new
		sequence (like the original) obeys the property that each
		PHRASE block is taken from the LEFT of a previous (right
		neighboring) PHRASE block.

		The new string, a sequence of PHRASE blocks, represents the
		next string in the envisioned sequence of rewritten, full-
		spanning strings.

	Thus, if C is fully ambiguated, and if the input string is a visible
	phrase in C, then all strings in the envisioned sequence of strings are
	visible phrases in C.
  ------	The previous three lines are to be italics	------

	If the input string is a member of the grammar's
	  language, then you can envision a sequence of rewritten strings which
	  ends with the grammar's root part-of-speech, sharing the same span with
	the input string.  That sequence too is represented in C, as is any
	imagined sequence of rewrites, starting from the input string.

	We will show that GIVE preserves the property of being fully
	ambiguated.  First, we assume this behavior of GIVE and show
	completeness.


13.3
Completeness

	Our method for taking user input has the following properties:

	   1)	The input string is a full-spanning visible phrase in the
		rightmost CHOICE_OF_PHRASES, the final value in C.

			(See Section 12.4).

	   2)	Because we set C to NIL before calling GIVE, C is always
		fully ambiguated.

		   a)	C upon entry to GIVE is fully ambiguated.

				(A NIL CHOICE_OF_PHRASES is fully ambiguated
				because it has no phrases.  No rule's want-
					  phrase appears, and so it is vacuously true
					  that every want-phrase has a give-phrase).

			 b)	C after GIVE is fully ambiguated, because GIVE
				preserves the property of being fully ambiguated
				(Section 13.3.1).

	     3)   In the end, C is fully ambiguated.

	     4)   If the given input string is a member of the language defined
		    by the grammar, then in the end, C has at least two full-
		    spanning visible phrases:	 the input string and a full-spanning
		    occurence of the grammar's root part-of-speech.

			(This was just proven in the previous section).

	This shows that the parser works at least when given a legal string,
	a string that is a member of the language.  Both the input string
	and a full-spanning occurence of the root part-of-speech exist as
	visible phrases.

	We have relied on the following.


13.3.1
GIVE Preserves The Fully Ambiguated Property

	Suppose C is presently fully ambiguated.  A call to GIVE will leave C
	still fully ambiguated.

	Let C0 denote C prior to the call to GIVE.  Let C1 be C after the call.
	Let's prove part A of the definition of "fully ambiguated":

	     1)   Pick any visible phrase emenating from C1.	Suppose that it
		    matches a rule's want-phrase.

	   2a)		If the chosen phrase is also in C0, then because C0 is
			fully ambiguated (by assumption), the rule's give-
				phrase already exists in C0.	It shares the same span
				with the want-phrase.  Because GIVE never removes
				elements from C, that give-phrase is also in C1.

	     2b)  If the chosen phrase is not in C0, then we know that its
		    rightmost PHRASE block was introduced into C by a call to GIVE.
		    (GIVE is the only place that C is affected).

	     3)   GIVE called the grammar at that time.	 At that time, the chosen
		    phrase was a visible phrase, starting from that rightmost
		    PHRASE block.

				We know that ~now the chosen want-phrase is a visible
				phrase starting from that rightmost PHRASE block.
				Our consistent use of subjective
				modification means that we never affect existing data.
				Thus, the PHRASE block's LEFT has remained unchanged
			since that PHRASE block was created.  Thus, what we
			see now starting from that PHRASE block was also
			visible when the PHRASE block was created.

	   4)	Since the chosen want-phrase was visible when the grammar was
		called, the rule's (outermost) IF-statement matched the new
		    PHRASE block.	 The complex quantifier inside the THEN-clause
		    matched the rest of the chosen visible phrase (among others).

				One of the iterations of that quantifier set P(1),
				P(2), ... P(n) to the chosen occurence of the want-
				phrase.
	---------------- Parentheses in previous paragraph mean subscripting! ---

				(We deduce that this iteration exists because
				the chosen want-phrase,

					  PHRASE(k+l) ... PHRASE(k+1) PHRASE(k)

				obeys the property that

					  PHRASE(i)	 is a member of PHRASE(i-1).LEFT

				since it is a visible phrase.	 Each dimension
				of the quantifier looks at ~all PHRASE(i)s
				emenating from PHRASE(i-1).LEFT, and so it will
				see ours).
	---------------- Parentheses in previous paragraph mean subscripting! ---

	     5)   Upon matching the chosen want-phrase, the rule then applied.
		    It called GIVE with the give-phrase.	Both the want-phrase and
		    give-phrase share the same span, because:

			 a)	Both phrases emenate from C:

					  When we call GIVE, it puts the new give-phrase
					  into C.  C thus contains (the rightmost block
					  of) the give-phrase.

					  Now we show that at that time, C contained
					  also the rightmost block of the want-phrase.
					  Because of pure subjective modification,
					  what appeared then still appears now.

					  Note that we, the grammar, got called
					  from GIVE.  GIVE had just been given a PHRASE
					  block.  ~That ~PHRASE ~is ~the ~rightmost
					  ~element ~in ~our ~want-phrase.

						    GIVE sees only visible phrases that
						    have as their rightmost PHRASE block,
						    the block passed into GIVE.  Since our
						    rule is applying upon that want-phrase,
						    GIVE was passed our want-phrase's
					rightmost PHRASE block.

				Prior to calling us (the grammar),  GIVE had
				just put that PHRASE block onto C.  Now both
				the want- and give-phrases emenate from C.
				(Their rightmost PHRASE blocks are
				members of C).

		   b)	The LEFT of both phrases are identical, because the
			PHRASE passed into GIVE has as its LEFT the LEFT of the
			matched phrase, P(1).LEFT, the LEFT field of the
			want-phrase's leftmost PHRASE block.
	---------------- Parentheses in previous paragraph mean subscripting! ---

	     6)   Since the chosen want-phrase and the new give-phrase share the
		    same span, part A of the definition is proven.


13.3.2
Part B Of The Definition

	  Let's prove part B of the definition of "fully ambiguated".  We will
	proceed by assuming that part B of the definition is false.  We will
	then encounter a contradiction.

	Consider the LEFT field of each new PHRASE block passed to GIVE.
	If all the PHRASE blocks passed to GIVE have fully ambiguated LEFTs,
	then all elements on C (put there by GIVE) all have fully ambiguated
	LEFTs.  This means part B of the definition is true for C.

	Playing the devil's advocate, consider now the first call to GIVE
	  whose LEFT field is not fully ambiguated.  By assumption, all previous
	  calls to GIVE got LEFTs which were fully ambiguated.  Since C contains
	  only PHRASE blocks passed to GIVE, all LEFTs in C are fully ambiguated,
	  up to now.  Thus, the value in C after each of those earlier calls was
	  fully ambiguated as per both parts of the definition.  (We proved part
	  A earlier, and we can assume part B for all previous calls to GIVE).

	  All previous calls to GIVE left C fully ambiguated.	 Our first call
	  to GIVE with a non-fully ambiguated LEFT can occur in any one of three
	  places:

	     1)   When taking user input:

				LEFT is set to NIL initially.	 Thus, the first call to
				GIVE receives a fully ambiguated LEFT.

				Each subsequent call to GIVE delivers a LEFT field
				which is taken from C, i.e.,

					  LEFT:= C ;

					  C:= NIL ;
					  GIVE(  [LEFT:LEFT  POS:<pos>]  );

				That value in C was the result of the previous (left
				neighboring) call to GIVE.

				By assumption, every LEFT passed to GIVE earlier, and
				hence all LEFT fields in C, are fully ambiguated.
				Part B of the definition thus holds for C.  Combined
				with part A, we know that C is fully ambiguated.

				Thus the LEFT just passed to GIVE, which came from C,
				must be fully ambiguated.

				This contradicts our choice that this GIVE got a non-
				fully ambiguated LEFT.

	     2)   For context-free rules:

				A context-free rule calls GIVE with a PHRASE block
				whose LEFT is taken from the want-phrase.	 The want-
				phrase was produced by earlier GIVEs.  By assumption,
				those earlier GIVEs all received PHRASE blocks with
				fully ambiguated LEFTs.

				Our new PHRASE block, the one now being passed to
				GIVE, has as its LEFT one of those fully ambiguated
				LEFTs.

				This contradicts our choice that this GIVE got a non-
				fully ambiguated LEFT.

	     3)   For general rewrite rules:

				Notice that the LEFT field for each GIVE is taken
				from either:

				   a)	  An existing LEFT  (from the leftmost PHRASE
								   block in the want-phrase).
				or
				   b)	  A value in C following a call to GIVE.

				The argument for context-free rules shows that (a), an
				existing LEFT, is indeed fully ambiguated.  Thus (a)
				is not the source for a non-fully ambiguated LEFT.

				The argument for taking user input shows that the
				the value in LEFT passed to GIVE, upon executing:

					  LEFT:= C;

					  C:= NIL;
					  GIVE(  [LEFT:LEFT  POS:<pos>]  );

				is fully ambiguated.

				This contradicts our choice that this GIVE got a non-
				fully ambiguated LEFT.

	  We conclude that there is no first call to GIVE passing in a LEFT that
	  is not fully ambiguated.  This implies that all GIVEs receive fully
	  ambiguated LEFTs, and hence Part B of the definition is proven.

		    BOX:	What does it mean for a CHOICE_OF_PHRASES to be fully
		    BOX:	ambiguated?
		    BOX:
		    BOX:	Why does GIVE ~preserve the fully ambiguated property
		    BOX:	on the variable C?


13.4
Only Legal Strings Are Accepted

	  We claim that ~all full-spanning visible phrases are in fact derivable
	  by performing rewrites upon the input string.	 We prove that here.
	  For shorthand, we will refer to full-spanning visible phrases as
	  "strings".


13.4.1
When One String Is ~Older Than Another

	  Given two strings, they may be identical starting at the left,
	  containing the same PHRASE blocks.  Let's think of each string as a
	~rightward going sequence of PHRASE blocks.  (This is opposite the
	direction we've used so far).

	  Suppose that only the first K-1 blocks in each string are identical.
	  This means that:

		    PHRASE(k)  in the first string

				is distinct from

		    PHRASE(k)  in the second string.

	  Let's name these two PHRASE blocks:

		PHRASE1(k)  is the PHRASE(k) in the first string, and
		PHRASE2(k)  is the PHRASE(k) in the second string.

	Each PHRASE(k) has a time when it was passed into GIVE.  Let's denote
	  this time via the notation:

		    Give_Time( PHRASE(k) )

	  Since no two PHRASE blocks are GIVen at the same moment, one came
	  before the other, i.e.,

		    Give_Time( PHRASE1(k) )	<   Give_Time( PHRASE2(k) )
	  or
		    Give_Time( PHRASE1(k) )	>   Give_Time( PHRASE2(k) )

	  We say that the first string is "older than" the second string if

		    Give_Time( PHRASE1(k) )	<   Give_Time( PHRASE2(k) )
	---------------- Parentheses in previous paragraph around "k"
	----------------	mean subscripting! ------------------------------------




13.4.2
Transitive Closure Of ~Older ~Than

	  Now suppose we have three strings, such that

		    string(1)  is older than	string(2)
	  and
		    string(2)  is older than	string(3)

	  We now show that the transitive property holds, i.e., that

		    string(1)  is older than	string(3)

	  Let PHRASE1(k) be the first (leftmost) PHRASE block in string(1) that
	  is not also in string(2).  By "is older than", we have:

		    Give_Time( PHRASE1(k) )	<   Give_Time( PHRASE2(k) )

	  Let's do the same for the second pair of strings.  Let PHRASE(j) be
	the first (leftmost) PHRASE block in string(2) that differs from
	PHRASE(j) in the third string.  By "is older than", we have

		Give_Time( PHRASE2(j) )   <   Give_Time( PHRASE3(j) )

	We now show that string(1) is "older than" string(3):

	   1)	If j=k, then we have:

			Give_Time( PHRASE1(k) )   <   Give_Time( PHRASE3(k) )

		We also know that the (K-1)'th PHRASE block, and each one
		    further to the left, are identical between string(1) and
		    string(3).

	     2)   If j < k,  then string(1) and string(3) differ at PHRASE(j).:

			 a)	We know that:

				    PHRASE1(j) = PHRASE2(j)

				since j < k, and only at K (further to the right) can
				string(1) and string(2) differ.

				This equality makes the following true:

				    Give_Time( PHRASE1(j) )  =  Give_Time( PHRASE2(j) )

			 b)	We have also that:

				    Give_Time( PHRASE2(j) )  <  Give_Time( PHRASE3(j) )

				since string(2) is older than string(3), and j is the
				(first) disagreement between string(2) and string(3).

				Put together, we have:

				    Give_Time( PHRASE1(j) )  <  Give_Time( PHRASE3(j) )

				We also know that all three strings are identical up to
				but not including PHRASE(j).

				Thus, string(1) is "older than" string(3).

	     3)   If k < j, then perform step (2), exchanging string(1) and
		    string(2).  (This makes j < k ).

	  We now know that "is older than" is transitive.  It is also anti-
	  reflexive because:

		    string(1)  is older than	string(2)
	  and
		    string(2)  is older than	string(1)

	  cannot both be true.	(Compare the Give_Times).
	---------------- Parentheses in previous paragraph around "1", "2",
	----------------	"3", "k", "j" mean subscripting!  ---------------------


13.4.3
When Does The First Illegal String Show Up?

	  Let's return to C, the rightmost CHOICE_OF_PHRASES.  We say that a
	string is "legal" if and only if it can be derived from the input
	string.  We show that all strings are legal:

	    Given an illegal string, who made it?

	    We proceed by forming a contradiction.

	    Pick an "oldest" illegal string, S.

		1)   The grammar didn't make it, because:

				Consider the latest rule application that formed S.
				It occured by noticing an instance of the want-phrase
				somewhere, and then completing S by introducing the
				give-phrase.

				Since the want-phrase and the give-phrase share the
				same span, we can take our illegal string S and replace
				the occurence of the give-phrase with an occurence of
				the want-phrase.

				This new string, say S1, which has the want-phrase must
				itself be illegal:

					  (If S1 were legal, then the one rewrite of
					  replacing the want-phrase with the give-phrase,
					  (producing S), would map S1 (legal) to another
					  legal string (S).  This contradicts our choice
					  of S to be illegal).

				S1 is "older" than S:

					  (S1 and S are identical to the left of the
					  want-phrase (also the give-phrase).  The want-
					  phrase had to exist prior to this rule's
				application, lest it never apply.  Thus, all
				the PHRASE blocks in the want-phrase are older
				than all the PHRASE block(s) in the give-
				phrase.  In particular, the leftmost PHRASE
				block in the want-phrase is older than the
				leftmost PHRASE block in the give-phrase.

			Given that S1 is illegal and that it is older than S,
			we have an even older illegal string, contradicting
			our choice of S as being the oldest illegal string.

		2)   The taking of user input didn't make S:

				We've just shown that no PHRASE block in S was
			generated by the grammar.  The only other PHRASE blocks
			are generated while taking user input.
			We could just as well turn off the grammar, supress-
			ing all calls to the grammar.  S would still appear.

			However, the only string
			introduced while taking user input is the input
			string.  This implies that S is the input string, and
			hence is legal.  This contradicts our choice that S be
			illegal.

	Therefore, all strings in C are legal strings.


13.4.4
A Detail - The Entire Give-Phrase Came From One Rule Application

	Our argument thus far assumed that the illegal string S contained
	some rule's give-phrase.  We assumed actually that S contained
	  ~all the PHRASE blocks generated by one application of one rule.
	  That was how we could focus in on an actual application, and prove
	  that an older occurence of the want-phrase had to exist.

	  Consider the latest rule application that formed S, the one that
	  ~triggered most recently.  (A rule ~triggers precisely when its want-
	  phrase is recognized).

	  We claim that ~all the PHRASE blocks in the generated give-phrase are
	  also contained in S:

		    No subsequent rule application can affect S, by assumption.

		    Thus, we could just as well have turned off the grammar while
		    generating our give-phrase.  S would still come to exist.

	  pic

		    With the grammar turned off, figure 13.3 shows our give-phrase.
		    Each CHOICE_OF_PHRASES contains only a single PHRASE block.
		    (Multiple PHRASEs in a CHOICE_OF_PHRASES can be generated only
		    upon (further) rule applications).

		    Thus, the only way to arrive at any block in the give-phrase
		    is to pass thru the rightmost PHRASE block.	 (We will prove in
		    the next section that a PHRASE block can be referenced from 
		    only one CHOICE_OF_PHRASES, the ones you see in the figure).

		    Thus, if any of the give-phrase's PHRASE blocks is in S,
		then the rightmost PHRASE block is also in S.

		Upon arriving at any PHRASE block in the give-phrase, there is
		no way out except to travel leftward along all PHRASE blocks
		in the give-phrase.  Since S is full spanning, S contains at
		least the leftmost block in the give-phrase.  Thus, all blocks
		of the give-phrase are in S.

	This says in conclusion that the blocks generated by some rule's
	  application are all contained in S.  Now that we have the rule
	  application, we can proceed as before.	We deduce the existence of the
	  rule's want-phrase, where all blocks in the want-phrase are older than
	all the blocks in the give-phrase.  As before, we exchange the want-
	phrase with the give-phrase, and arrive at the contradiction, by
	producing an illegal string even older than S.


13.4.5
More Properties Of GIVE

	Notice the following about GIVE and the grammar:

	   *	GIVE receives always a ~new PHRASE block.

			(We use always the notation:

				GIVE(  [LEFT: p.LEFT  POS: part-of-speech]  );

			 when calling GIVE, from the grammar or when taking
			 user input.  The [...] always creates a new block).

	   *	Only GIVE contributes elements to C.

	   *	GIVE only augments C, never removing an element from C.

	The following useful properties are then derived:

	   *	A PHRASE block is a member of only one CHOICE_OF_PHRASES

			(All new PHRASE blocks come into GIVE.  GIVE puts
			 each onto C once.  GIVE can't put it onto C ever
				 again because GIVE is called only with ~new PHRASE
				 blocks.  Since nowhere else are elements added to C,
				 a given PHRASE block is referenced from at most one
				 CHOICE_OF_PHRASES).

	     *    Given a visible phrase, it has a unique rightmost
		    CHOICE_OF_PHRASES.	It holds the visible phrase's rightmost
		(first) PHRASE block.


	BOX:	BOX:	When is one string older than another?
		BOX:
		BOX:	Given an oldest illegal string, how did we find an
		BOX:	even older illegal string?
		BOX:
		BOX:	Can two CHOICE_OF_PHRASES reference the same PHRASE
		BOX:	block?