CHAPTER 19


		    UPDATING OUR COMPILER TO INCLUDE FUNCTIONS



	  Let's update our compiler from chapter 8.  We want to introduce two
	new notations:

	    1)	Function calling:

			<ID> ( <EXPR> , <EXPR> , ... , <EXPR> )

	    2)	Function definition:

			DEFINE  <ID> ( <ID> , <ID> , ... , <ID> ):
			    <EXPR>
		        END_FUNCTION

		or with locals:

			DEFINE  <ID> ( <ID> , <ID> , ... , <ID> ):
			   LOCALS <ID>,<ID>,...,<ID>
			     <EXPR>
			END_FUNCTION


19.1
The Parts-Of-Speech <IDLIST> And <EXPR_LIST>

	Let's introduce the part-of-speech <IDLIST> to denote lists of <ID>s:

		    <ID> , <ID> , ... , <ID>

	  Let's also introduce the part-of-speech <EXPR_LIST> to denote lists of
	<EXPR>s, as we find in a function call:

		<EXPR> , <EXPR> , ... , <EXPR>

	Recall that we have the declarations for the parts-of-speech <ID> and
	<EXPR>:

		POS  ID : TEXT ;

		POS  EXPR[8] : - ;	"Invocation of the semantics
					 generates a machine language program
					 which leaves the answer in register 1"

	Let's declare the plurals of the types TEXT and BASIC_PROCESS:

		    TYPE	TEXTS =  { TEXT } ;

		    TYPE	BASIC_PROCESSES =	 { BASIC_PROCESS } ;

	  We now declare our new parts-of-speech:

		    POS  IDLIST : TEXTS ;

		    POS  EXPR_LIST : BASIC_PROCESSES ;

	  An <IDLIST>'s semantics is a list of identifiers (texts).  An EXPR_LIST
	is a list of BASIC_PROCESSes, a list of <EXPR>-semantics.

	We introduce the <IDLIST> and <EXPR_LIST> rules:

		<ID:x>				->	<IDLIST: {x} >
		<IDLIST:x>  ,  <ID:y>		->	<IDLIST: x $> y >

		<EXPR:x>			->	<EXPR_LIST: {x} >
		<EXPR_LIST:x> , <EXPR:y>	->	<EXPR_LIST: x $> y >

	(The expression "X $> Y" denotes the list X with one more element, Y,
	appended to the end.  The "{X}" denotes the list containing only the
	one element X.  See Sections 22.1.3 and 23.3).


19.2
The Function Calling Rule

	We introduce the function calling rule:

		<ID:name>  (  <EXPR_LIST: params>  )	   ->

					<EXPR: CALL_FUNCTION(name,params); >

	An <ID> followed by a list of parameters, <EXPR>s, enclosed in
	parentheses, denotes a call.  This supports at least the notations:

		POWER_OF_2( 5 )
	and
		POWER_OF_2( X )
	and
		POWER_OF_2( X+5 )

	Each of these denotes a call to the function whose name is the <ID>,
	POWER_OF_2 in these examples.  We will come back to define the
	semantic action CALL_FUNCTION.


19.3
The Function Definition Rule

	We introduce the function declaration rule:

		DEFINE  <ID:name> ( <IDLIST:params> ):
		   LOCALS <IDLIST:locals>
			<EXPR:body>
		END_FUNCTION				->

				" An immediate action... "

				DEFINE_FUNCTION( name, params, locals, body );

	This function definition rule rewrites to nothing.  It merely calls
	DEFINE_FUNCTION immediately upon recognition.


19.3.1
Keeping Track Of Functions

	We will need to keep track of ~all defined functions, so that we know
	which one to call upon seeing a function call.  We introduce the
	following type:

		TYPE	DEFINED_FUNCTION =  [NAME: TEXT  START_ADDRESS: LABEL];

	A DEFINED_FUNCTION associates a function NAME with its START_ADDRESS.
	The START_ADDRESS tells us where to CALL if we want to call the
	function of a given name (NAME).  (Recall from Chapters 6 and 7 that
	any address can be represented by a LABEL).

	The list of all functions is declared by:

		TYPE	FUNCTIONS =  { DEFINED_FUNCTION } ;

	Let's declare the variable that will hold all defined functions:

		    VAR	ALL_FUNCTIONS =  FUNCTIONS ;

	  The declaration of a function will add an entry onto ALL_FUNCTIONS.
	  Function calling will read ALL_FUNCTIONS.

		    BOX:	What new rules of grammar are introduced to facilitate
		    BOX:	function definition and calling?
		    BOX:
		    BOX:	What are <IDLIST> and <EXPR_LIST>?
		    BOX:
		    BOX:	How do we keep track of functions?


19.4
Function Calling Implementation

	  We implement the semantics for the function calling rule, the
	  procedure CALL_FUNCTION.  Recall that CALL_FUNCTION was referenced
	  within an <EXPR>.  To satisfy the conventions of <EXPR>s, CALL_FUNCTION
	  must generate code that leaves the result of the function in register
	  1.

	  We assume that our other function DEFINE_FUNCTION (coming up soon),
	  compiles each function so as to leave the function's value in register
	1.  Thus, when we call a function, the result will be in register 1
	by the time we reach just beyond the CALL instruction.

	Here is CALL_FUNCTION:

	   DEFINE  CALL_FUNCTION( NAME: TEXT  PARAMS: BASIC_PROCESSES ):

	      BEGIN  VAR DF=DEFINED_FUNCTION; E=BASIC_PROCESS;

		IF  "there is a defined function of our given NAME... "

			THERE_IS  DF.NAME = NAME  FOR DF $E ALL_FUNCTIONS;

		THEN	"DF holds the function we want to call."
			"Call it by first pushing each parameter onto the
			 stack..."

			FOR E $E PARAMS;  "For each argument E..."
			  DO
				<*E*>;	"Generate machine language which
					 evaluates the parameter and puts its
					 value into register 1."

				"Push register 1 onto the stack..."
				ADD( SP, ADDRESS_OF(1) );
				STORE( 1, 0 \OFF_OF SP );
			  END

			"All parameters have been pushed.  Now issue the call
			 to the function's start address..."

				CALL( RETURN, DF.START_ADDRESS );

				"The answer is now in register 1, by convention"

		    ELSE	HELP;	  "No function of the given NAME can be found"
		    FI
		  END
	     ENDDEFN

	  CALL_FUNCTION pushes the parameters and CALLs the named function, if
	  that function exists.	 If none is found, the ELSE-clause executes,
	  calling HELP.  Eventually, that HELP should be replaced by an
	  informative message to the user, indicating that he or she has called
	  a nonexistent function.

	  (The "THERE_IS" notation is covered in Section 22.1.8.  The
	  "FOR E $E PARAMS;" appears in Section 22.3.1).


19.5
Function Definition Implementation

	  We now introduce the function-definition semantics, the function
	  DEFINE_FUNCTION:

	     DEFINE	 DEFINE_FUNCTION( NAME: TEXT	PARAMS,LOCALS: TEXTS
									   BODY: BASIC_PROCESS ):

		    1)  Create a label and append onto ALL_FUNCTIONS that label
			  with this function NAME.

		    2)  Generate machine language for this function:

				a)  Generate the initial program fragment

				b)  Modify the symbol table so that each parameter and
				    each local is associated with an address expression
				    like "-3 \OFF_OF FP"

				c)  Generate the BODY

				d)  Generate the final program fragment.
	  ENDDEFN

	  We define this formally as:

	     DEFINE	 DEFINE_FUNCTION( NAME: TEXT	PARAMS,LOCALS: TEXTS
									   BODY: BASIC_PROCESS ):
		  BEGIN  VAR  T = TEXT;	 I, N_PARAMS, N_LOCALS = INT;	 START=LABEL;

		    "1)  Create the start address label, and augment ALL_FUNCTIONS"

		    START:= NEW;

		    ALL_FUNCTIONS:=  ALL_FUNCTIONS	$>
							   [NAME:NAME  START_ADDRESS:START] ;

		    "Count up the number of parameters and the number of locals..."

		    N_PARAMS:= LENGTH( PARAMS );
		    N_LOCALS:= LENGTH( LOCALS );

		    "2) Generate machine language for the function..."
		    "	  a) The initial program fragment... "

		    EQU(START);	  "Define the start address to be here"

				STORE( RETURN, 1 \OFF_OF SP );    "(as shown earlier)"
				STORE( FP, 2 \OFF_OF SP );
				LOAD( FP, SP );
				ADD( SP, ADDRESS_OF( 2 + N_LOCALS ) );

		    "	  b) Modify the symbol table so that each parameter and local
			     is associated with an address expression... "

		    HOLDING SYMBOL_TABLE;   "Our effect on SYMBOL_TABLE is meant to
						     be temporary"
		    DO
				( FOR T $E PARAMS;  &&	FOR I FROM -N_PARAMS+1 BY 1; )
				THEN
				( FOR T $E LOCALS;  &&	FOR I FROM 3 BY 1; )
				DO
					  SYMBOL_TABLE:= [NAME:T  ADDRESS: I \OFF_OF FP ]
							     <$  SYMBOL_TABLE;
				END

				" c)	Generate the body... "

				<*BODY*>;

				"The answer is in register 1, because BODY is the
				 semantics of an EXPR."

		    ENDHOLD		  "(pull back the symbol table to where it was)"

		    "	  d) The final program fragment... "

		    LOAD( RETURN, 1 \OFF_OF FP );
		    LOAD( FP , 2 \OFF_OF FP );
		    SUB( SP, ADDRESS_OF( N_PARAMS + N_LOCALS + 2 ) );
		    JUMP( 0 \OFF_OF RETURN );
		  END
	     ENDDEFN

	  We have now implemented function definition and function calling.

	  (Step 2b, where we augment the symbol table, we perform:

		    ( FOR T $E PARAMS;	&&  FOR I FROM -N_PARAMS+1 BY 1; )
		    THEN
		    ( FOR T $E LOCALS;	&&  FOR I FROM 3 BY 1; )
		    DO
				...
		    END

	  The first three lines form a quantifier, built out of four "FOR"
	  quantifiers.  The "&&" and "THEN" operators that combine the four
	  quantifiers are covered in Section 22.3.2.

	  The first line sets T to each parameter name, and simultaneously sets
	  I to be the correct displacement off of FP.  Then, ~after all those
	  iterations are complete, the third line sets T to each local variable
	  name, and simultaneously sets I to be the correct displacement for that
	  local.  The "..." augments the symbol table on each iteration).


		    BOX:	How many elements are pushed onto the stack by the code
		    BOX:	generated by CALL_FUNCTION?
		    BOX:
		    BOX:	How many instructions does DEFINE_FUNCTION generate
		    BOX:	upon entry to the function, and then upon exit?
		    BOX:
		    BOX:	How long do we keep the local and parameter values
		    BOX:	in SYMBOL_TABLE?
		    BOX:
		    BOX:	What earlier program of ours uses SYMBOL_TABLE
		    BOX:	(in Chapter 8)?
		    BOX:
		    BOX:	Why will the locals and parameters appear to be
		    BOX:	available?


19.5.1
A Slight Modification To Chapter 8's SYMBOL_TABLE

	  A detail must be noted relative to chapter 8's compiler.	The
	  type of the variable SYMBOL_TABLE was declared by:

		    TYPE	NAME_AND_ADDRESS =  [NAME:TEXT  ADDRESS:LABEL] ;

				SYMBOL_TABLE =  { NAME_AND_ADDRESS } ;

	  We must change the LABEL to now be a MACHINE_ADDRESS_EXPRESSION,
	  which is capable of representing address expressions like 
	  "I \OFF_OF FP".	 MACHINE_ADDRESS_EXPRESSION was introduced in Section
	  9.8.1.

	  Also, the function:

		    ADDRESS_OF_VARIABLE( TEXT )	->	  LABEL

	  must now be changed so as to deliver a MACHINE_ADDRESS_EXPRESSION
	  instead of a LABEL:

		    ADDRESS_OF_VARIABLE( TEXT )	->   MACHINE_ADDRESS_EXPRESSION

	  All the other parts in Chapter 8 need no further change.

	  (The LABELs coming out of ADDRESS_OF_VARIABLE were used ~only as
	  operands in instructions.  Instructions always take in
	  MACHINE_ADDRESS_EXPRESSIONs, as of Section 9.8.1.  Thus, all
	  uses of ADDRESS_OF_VARIABLE will be automatically upward compatible
	  with this change).

	  (We had gotten by supplying merely LABELs because of the two coercions
	  that let a LABEL be seen also as a MACHINE_ADDRESS_EXPRESSION:

		    LABEL	->   ADDRESS_EXPRESSION	  "from the definition of
								   the type ADDRESS_EXPRESSION"
	  and
		    ADDRESS_EXPRESSION	 ->	MACHINE_ADDRESS_EXPRESSION
								  "from Section 9.8"


19.6
Mutual Recursion

	  It is possible that a function F calls a function G, and that G also
	  calls F.	This relationship is known as ~mutual ~recursion.

	  Suppose F's definition appears before G's definition.  While F's
	  definition is being processed, G hasn't been seen yet.  Therefore,
	  when F calls G, there will be an error because G is unknown at this
	  time.

	  Switching the order of F's and G's definitions won't solve the problem,
	  as during G's processing, F won't be known yet.

	  We would therefore like to delay the processing of any function
	  definition until ~after all function definitions have been seen.

	  Let's change DEFINE_FUNCTION so that it augments ALL_FUNCTIONS
	  immediately as it does now, but let's delay the generation of the
	  function's machine language program.  Where DEFINE_FUNCTION now appears
	  as:

	     DEFINE	 DEFINE_FUNCTION( NAME: TEXT	PARAMS,LOCALS: {TEXT}
									   BODY: BASIC_PROCESS	     ):
		  BEGIN  VAR  T = TEXT;	 I, N_PARAMS, N_LOCALS = INT;	 START=LABEL;

		    START:= NEW;

		    ALL_FUNCTIONS:= ALL_FUNCTIONS $>
							   [NAME:NAME  START_ADDRESS:START] ;

		    ***

		  END
	     ENDDEFN

	  let's change it to be:

	     DEFINE	 DEFINE_FUNCTION( NAME: TEXT	PARAMS,LOCALS: TEXTS
									   BODY: BASIC_PROCESS ):
		  BEGIN  VAR  START=LABEL;

		    START:= NEW;

		    ALL_FUNCTIONS:= ALL_FUNCTIONS $>
							   [NAME:NAME  START_ADDRESS:START] ;

		    POST_PROCESSING :=	POST_PROCESSING  $>

						    //[PARAMS;LOCALS;BODY;START;]

								BEGIN	 VAR T=TEXT; N_PARAMS,
									     N_LOCALS=INT;
								  ***
								END
						    \\ ;
		  END
	     ENDDEFN

	  We've taken everything beyond our augmentation of ALL_FUNCTIONS, and
	  enclose it within //...\\.	This delays its execution.  When it will
	  be invoked, it will generate the machine language program for this
	  function.

	  That is, instead of generating machine language now, we enclose that
	  generation within //...\\, forming an BASIC_PROCESS, and append it to a new list
	  POST_PROCESSING.  This doesn't generate machine language now, and so no
	  function ~calls will be considered yet.

	  (We declare POST_PROCESSING to hold such BASIC_PROCESSes by:

		    VAR	POST_PROCESSING =	 BASIC_PROCESSES	;


	  Let's require the programmer to write END_OF_PROGRAM so that we know
	  when the end is reached.  We introduce this new rule of grammar, and
	  have it immediately invoke all the BASIC_PROCESSs on POST_PROCESSING,
	  via:

		    END_OF_PROGRAM    ->    BEGIN  VAR  S = BASIC_PROCESS ;
							 FOR S $E POST_PROCESSING;
							 DO  <*S*>;	  END
						    END

	  Thus when END_OF_PROGRAM is seen, we will proceed to generate machine
	  language for each function.	 (Each BASIC_PROCESS in POST_PROCESSING generates
	  the machine language for one function).

	  Consider our example where F calls G and G calls F.	 F and G will be
	  entered onto ALL_FUNCTIONS and POST_PROCESSING by the time
	  END_OF_PROGRAM is seen.  No matter whether F or G appears first on
	  POST_PROCESSING, both will have their machine language generated in the
	  context where both F and G are known, on ALL_FUNCTIONS.

	  This simple change solves the mutual recursion problem.  Unfortunately,
	  PASCAL doesn't solve it; instead it forces the user to double-declare
	  functions, using FORWARD statements.  It is the type BASIC_PROCESS in
	  ICL makes our easy solution possible.


		    BOX:	What is ~mutual ~recursion?
		    BOX:
		    BOX:	How do we solve the mutual recursion problem?
		    BOX:
		    BOX:	What is END_OF_PROGRAM for?


19.7
Using Local Variables For Intermediate Results

	  To make the whole system including chapter 8 work, we need to
	  change the <EXPR>-<BOP>-<EXPR> rule and the <EXPR>-<COMPARATOR>-<EXPR>
	  rule.  Let's concentrate on the <EXPR>-<BOP>-<EXPR> rule.	 Its
	  original text appears now:

	     <EXPR[i]: left >  <BOP[j]: op >  <EXPR[k]: right >	  ->

		 IF  i =< j	 &  k < j  THEN

		    <EXPR[j]:

			  HOLDING  SPARE_MEMORY:= SPARE_MEMORY + 1;   "Allocate spare
										     bin in memory"
			  DO
				<*RIGHT*>;		"Get righthand side into register 1"

				STORE(1,SPARE_MEMORY);	"Store it in memory"

				<*LEFT*>;		"Get lefthand side into register 1"

				HOLDING  RIGHT_OPERAND:= SPARE_MEMORY;

				   DO	   <*OP*>;	 "Operate on register 1 and
							  SPARE_MEMORY. Answer is in
							  register 1"
				ENDHOLD

			  ENDHOLD		    >
		 FI

	  This rule uses SPARE_MEMORY to allocate memory for intermediate
	  results.	Each time the expression is executed, it uses that same
	  memory.

	  The subtle but devastating problem we wish to avoid is the
	  reuse of that same memory upon recursive function calls.	For example,
	  the following function F calls itself recursively, from within the
	  middle of an expression:

		    DEFINE	F( X: REAL ) = REAL :

			  IF	X = 0	 THEN	 0

			  ELSE   2*X + F(X-1)	 FI

		    ENDDEFN

	  Concentrate on the ELSE-clause.  The expression

		    2*X + F(X-1)

	  first computes 2*X and stores this intermediate result into a fixed
	  memory location, one that was allocated by SPARE_MEMORY while
	  translating this expression (<EXPR>) into machine language.

	  Naturally, we expect this intermediate value to be preserved safely
	  while executing the righthand part of the expression, F(X-1).
	  Unfortunately, F(X-1) calls this same function, which in its ELSE-
	  clause may need to evaluate the expression again.  Upon this second
	  incarnation of F, where X is one smaller than in the first
	  incarnation, 2*X will be computed and stored into the same fixed bin
	  in memory.  This overwrites the data put there by the first incarnation
	  of F!

	  Thus, when the second incarnation returns and the first incarnation
	  resumes execution, it will pick up not the value 2*X that it had
	  computed, but the 2*(X-1) stored there by the second incaranation of F,
	  the F(X-1).

	  For recursive functions like F, it is not acceptable to store
	  intermediate results into fixed bins in memory.  Instead, let's store
	  all intermediate results on the stack, like we do with local variables.

	  Let's use SPARE_MEMORY, not as a fixed bin in memory, but as an
	  offset from the frame pointer, i.e., as in:

		    SPARE_MEMORY	\OFF_OF  FP

	  This address expression using SPARE_MEMORY denotes a local
	  variable.	 (Recall that all local variables are represented by address
	  expressions of the form

		    I	 \OFF_OF  FP	    ).


	

	  The new spare memory is allocated as a new local variable on the stack,
	  always relative to FP, as in figure 19.1.  This is a safe use of
	  memory because different incarnations of the same function always have
	  different values in FP, thus keeping their locals entirely separate.

	  Here's the updated rule:

		<EXPR[i]:left>  <BOP[j]:op>  <EXPR[j]:right>	    ->

		  IF	I =< J  &  K < J	THEN

		    <EXPR[j]:
				HOLDING  SPARE_MEMORY:= SPARE_MEMORY + 1 ;
				DO
					  <*RIGHT*>;
					  STORE( 1, SPARE_MEMORY \OFF_OF FP );

					  <*LEFT*>;

					  HOLDING  RIGHT_OPERAND:=
								    SPARE_MEMORY \OFF_OF FP ;
					  DO
						    <*OP*>;
					  ENDHOLD

					  MAXIMUM_LOCAL:= MAXIMUM_LOCAL \MAX
										    SPARE_MEMORY ;
				ENDHOLD
		    >
		  FI

	  This looks very much like the original in chapter 8.  Now, we always
	  use SPARE_MEMORY in the address expression

		    SPARE_MEMORY	\OFF_OF  FP

	  so that the bin of intermediate storage is a local variable.  Our
	  incrementing of SPARE_MEMORY, like before, allocates a new variable,
	  a local variable, to hold the intermediate value, RIGHT's value.


19.7.1
Keeping Track Of All Locals

	  We've introduced the following statement at the end:

		    MAXIMUM_LOCAL:= MAXIMUM_LOCAL  \MAX  SPARE_MEMORY ;

	  MAXIMUM_LOCAL records the ~highest value in SPARE_MEMORY ever used,
	  and thus participates in determining how big the locals section should
	  be.	 The value in MAXIMUM_LOCAL will be read ultimately within
	  DEFINE_FUNCTION, as it increments SP to allocate space for locals.


		    BOX:	Why must we store intermediate results on the stack
		    BOX:	instead of in fixed bins in memory?
		    BOX:
		    BOX:	What change did we make to get intermediate results
		    BOX:	to reside on the stack?
		    BOX:
		    BOX:	Why do we use FP instead of SP for local variables'
		    BOX:	address expressions?


19.7.2
The Late Arrival Of MAXIMUM_LOCAL

	  Unfortunately, MAXIMUM_LOCAL does not have the correct value until
	  ~after the function body is translated into machine language.  Only
	  during the invocation of the body, "<*BODY*>;", will this
	  <EXPR>-<BOP>-<EXPR> rule's semantics be invoked, and so only then will
	  MAXIMUM_LOCAL be made up to date.

	  However, we need MAXIMUM_LOCAL ~prior to the function body.  We need
	  it for generating the initial program fragment, as we increment SP
	  to allcoate space for ~all locals.

	  To accomodate the late arrival of MAXIMUM_LOCAL, we can proceed in
	  either of two ways.  First, we could generate the initial program
	  fragment ~last, as in:

	     label:		the function body

				the final program fragment

	     START:		the initial program fragment
				JUMP( label );

	  This late initial fragment is late enough to have had the function
	  body already generated and hence to have had MAXIMUM_LOCAL updated
	  completely.  That late initial segment ultimately JUMPs backwards to
	  the new label, where the function body starts.

	  We must first initialize SPARE_MEMORY and MAXIMUM_LOCAL before we
	  invoke the BODY:

		    SPARE_MEMORY:= N_LOCALS + 2 ;

		    MAXIMUM_LOCAL:= SPARE_MEMORY ;

	  After the function body is generated (the <*BODY*>;"), MAXIMUM_LOCAL
	  will tell exactly how much to increment SP so as to contain all locals,
	  both declared locals (from the LOCALS clause) and the locals allocated
	  via SPARE_MEMORY.

	  Finally, our old DEFINE_FUNCTION is rendered up to date by replacing
	  appearences of N_LOCALS in the ADD and SUB instructions.	There,
	  N_LOCALS always appears in an expression:

		    N_LOCALS + 2

	  Replace that expression now by MAXIMUM_LOCAL.


19.7.3
Another Approach

	  Another option is to leave the initial program fragment up front,
	  and to create a label that represents the number of locals, e.g.:

		    "the initial program fragment"
		    EQU(START);

				FINAL_NUMBER_OF_LOCALS:= NEW;	  "the new label"

				STORE( RETURN, 1 \OFF_OF SP );  "unchanged"
				STORE( FP, 2 \OFF_OF SP );	  "unchanged"
				LOAD( FP, SP );			  "uncahnged"

				ADD( SP , ADDRESS_OF( FINAL_NUMBER_OF_LOCALS ) );

	  The value of the label FINAL_NUMBER_OF_LOCALS is unknown at this time.

	  After the invocation of BODY, we perform:

		    EQU( FINAL_NUMBER_OF_LOCALS, MAXIMUM_LOCAL ) ;

	  to assign the right value to the label.


		    BOX:	Why does MAXIMUM_LOCAL become up to date late?
		    BOX:
		    BOX:	Why is SPARE_MEMORY initialized to N_LOCALS+2?
		    BOX:
		    BOX:	How do we know the MAXIMUM_LOCAL will be big enough to
		    BOX:	account for declared locals, as well as locals
		    BOX:	allocated by the <EXPR>-<BOP>-<EXPR> rule?


19.8
For Debugging Aid

	

	  It may be that during program execution you will need to discover the
	  name of the function you are in.	This is facilitated by pushing one
	  more item onto the stack upon function entry.	 Figure 19.2 shows an
	  extra bin pushed, the ~name of the function.

	  Now, if you are executing inside a function and you want to know the
	  function's name, the address expression "3 \OFF_OF FP" hold the
	pointer to this function's name.

	  Often, you want to find out not only what function you're in, but also
	which function called this function, and even what function called that
	one.  This is all meant to answer the question "How did I get here?"

	

	Figure 19.3(a) shows the stack after F1 calls F2.  There are two
	frames here.  One for F1, the outer function, and one for F2, the inner
	function.  The bin containing old FP appears here as a pointer back to
	the previous frame.

	Figure 19.3(b) shows three function calls: F1 calls F2 which calls F3.
	Starting at FP, you find the name of the most recent function, F3,
	via:

		LOAD( 1, 3 \OFF_OF FP );	"Register 1 references the
						 function name, 'F3'"

	Similarly, you find the next outer function by first acquiring the old
	FP, via:

		LOAD( 1, 2 \OFF_OF FP );	"old FP"

	and then acquiring that function's name, F2, via:

		    LOAD( 1, 3 \OFF_OF 1 );		"Register 1 references the name
								 F2"

	  Finally, you acquire the name of the third outer function via:

		    LOAD( 1, 2 \OFF_OF FP );		"old FP"
		    LOAD( 1, 2 \OFF_OF 1  );		"older FP"
		    LOAD( 1, 3 \OFF_OF 1  );		"the name 'F3'"

	  You can now answer the question "How did I get here?" by providing a
	  ~backtrace.  A backtrace is a list of the function names, starting with
	  the most recently called function, followed by the next outer function
	  that called this one, etc.

	  In reverse order, the backtrace tells you what function called what
	  function, etc., all the way until it ends with the function in which
	  you are executing presently.  Backtraces are essential during
	  debugging, because "How did I get here?" is the most popular question!


		    BOX:	What is a ~backtrace?
		    BOX:
		    BOX:	How does one stack frame point to the previous one?
		    BOX:
		    BOX:	How many instructions would it take to get the name
		    BOX:	of the N'th function in the backtrace?
		BOX:
		BOX:	Can you imagine a way by which the function name could
		BOX:	be acquired ~without any extra instructions, i.e.,
		BOX:	without pushing that extra pointer onto the stack?


19.9
Another Implementation Of Stack Frames

	Often, greater efficiency in function calling can be gained by another
	method of managing stack frames.  This method is used in the MIPS
	compilers [].  We omit the FP altogether, and relieve the original
	need for it.  We needed FP only because SP could change, due to pushes
	onto the stack.  Let's now insist that SP never changes, that is, we
	  will forbid the direct use of pushes and pops.

	

	  Figure 19.4 shows our stack frame, now without FP and without any
	  storage for old FP.  We increment SP so as to allocate not only locals,
	  but some additional space as well, where we will put the parameters for
	  functions we call.

	  Instead of pushing parameters onto the stack, we will store them
	  directly into the locations:

		    0 \OFF_OF SP		    for the last parameter
		    -1 \OFF_OF SP		    for the second to last parameter
		    ...
		    -K+1 \OFF_OF SP	    for the first parameter in K parameters

	  Thus, to the function we call, it still looks as though its parameters
	  were pushed onto the stack.


19.9.1
Watch Out For Nested Function Calls

	

	  However, we must be careful how we "push" parameters onto the stack.
	  Figure 19.5(a) shows what we expect the stack to look like prior to
	  issuing the CALL instruction, if we're passing in two parameters
	param1 and param2.

	Imagine that we store param1 there first.  The subsequent computation
	of param2 may itself need to call a function, e.g., as in:

		F( X , G(Y,Z) )

	(The function G must be called to deliver the second parameter for F).
	Unfortunately, that computation (G) will use the ~same parameter
	space.  Figure 19.5(b) shows that in calling G, those two final bins
	in the parameter space will be overwritten, ~killing ~our ~original
	~param1!

	It is no longer acceptable to store parameters directly into parameter
	space as each is computed.  All parameters, except the last, must be
	stored elsewhere, somewhere in the locals space.

	With the original method, the stack top (SP) would grow dynamically,
	which assured that one could put parameters right where one wanted
	them, on the top of the stack.  The evaluation of other parameters
	would increment SP, and hence use a ~higher part of the stack, leaving
	our original parameters unscathed in their final positions on the
	stack.  The value of SP might be different at each call.

	With the new method, time is saved by no longer having to save and
	restore FP, although this savings may be consumed in the extra
	movement of parameters.  For functions having two or less parameters,
	only the first parameter need be moved, at an expense of one LOAD and
	one STORE.  In contrast, omitting FP saves us three instructions (the
	instructions involving FP).

	Thus if a majority of your functions have two or fewer parameters,
	this latest method for treating stack frames is the most efficient.
	Generally, the majority of function calls do involve two or less
	parameters.


		BOX:	In using the new conventions, what instructions can
		BOX:	be omitted from the initial and final program
		BOX:	fragments?
		BOX:
		BOX:	Why can we omit FP altogether?
		BOX:
		BOX:	What extra complication arises with this method?
		BOX:
		BOX:	Which set of conventions is most efficient for
		BOX:	functions having many parameters?


19.10
Passing Parameters By ~Value, By ~Name, And By ~Reference

	The passing of parameters into functions can be subject to different
	interpretations.  For example, suppose we define F as follows:

		DEFINE	F( X: REAL ) :
		   BEGIN  VAR  Y,Z=REAL;

			Z:= 0;
			FOR Y FROM 1 TO 5;  DO   Z:= Z+X ;   END
			WRITE(Z);

		   END
		ENDDEFN

	F finally writes the value in Z.  Suppose we call F via:

		F( Y+1 )


19.10.1
Passing By Value

	The most intuitive interpretation is called passing by ~value.
	We've adhered throughout this text to this form of parameter passing.

	  If Y holds presently a 1, then our function call:

		    F(Y+1)

	  will call F with the value 2.  When F writes Z, we will see a 10,
	  because Z is the sum of X made 5 times, and X containes the ~value 2.
	  F's parameter X is substituted with its value, a 2, as in:

		FOR Y FROM 1 TO 5;  DO   Z:= Z+2 ;   END


19.10.2
Passing By Name

	A very different interpretation, called passing by ~name, would have F
	write the number 20.  Passing by name means that we pass in not the
	value of Y+1, a 2, but instead, the expression "Y+1" itself.  Inside of
	F, the parameter X is substituted not by the value 2, but by the
	expression "Y+1".

	The FOR-loop, with this substitution, reads as:

		FOR Y FROM 1 TO 5;  DO   Z:= Z + Y+1 ;   END

	This adds together the numbers Y+1 where Y takes on the values 1 thru
	5:

		1+1 + 2+1 + 3+1 + 4+1 + 5+1

	Passing by name thus passes programs, e.g., "Y+1".

	One disasterous consequence of passing by name is the dependence of
	parameters, e.g., "Y+1", on local variables within the function, e.g.,
	F's local variable Y.  The disaster occurs when the author of F changes
	  F, perhaps deleting or renaming the local variable Y.

	  A safe way to achieve the effects of passing by name is offered in ICL.
	  ICL passes parameters only by value, but that value can itself be a
	  program.

	  Programs passed around as data are called processes in ICL, and are
	  covered in Section 23.6.  They provide an airtight way to pass
	  programs.	 The process passed in ("Y+1") has no access to the local
	  variable Y.  Instead, the function F ~explicitly passes data into the
	  given process.


19.10.3
Symmetry Between Parameter Passage And Assignment Statements

	  One might ask "If I can pass a parameter by name, why can't I ~assign
	  by name into a variable?"  That is, if we can pass Y+1 by name, why
	  can't we perform:

		V:= Y+1 ;

	and have V represent the program "Y+1" from here after?

	This symmetry between parameter passing and assignment statements is
	preserved in ICL, as processes can be assigned into variables as well
	as being passed in as parameters.


19.10.4
Pass By Reference

	Another interpretation for what passing parameters means is exposed in
	this example:

		DEFINE	F( X: REAL ):

		    X:= 5;

		END

	We are assigning a value into our parameter.

	Let's call F, as in:

		    F(Z) ;

	  Does this call have the side effect of setting Z to 5?  If we're
	passing by value, the answer is no.  If we're passing by reference, the
	  answer is yes.

	  ICL supresses this kind of backwards transmission of information made
	  possible by pass-by-reference, to minimize side-effects.	However,
	  ICL's objective modification (the "@" in Chapter 10, or Section 22.1.11)
	can be used to achieve the same effect.  You pass in a block of memory
	and have the function write directly into it, via objective
	modification.

	This concept of objective modification exists independently of
	parameter passing.  Thus, the symmetry between parameter passing and
	assignment statements continues to be maintained in ICL.


		BOX:	What is meant by ~pass-by-value, ~pass-by-reference,
		BOX:	and ~pass-by-name?
		BOX:
		BOX:	How can ~pass-by-name destroy an otherwise airtight
		BOX:	separations between the insides and outsides of a
		BOX:	function?