CHAPTER 19
UPDATING OUR COMPILER TO INCLUDE FUNCTIONS
Let's update our compiler from chapter 8. We want to introduce two
new notations:
1) Function calling:
<ID> ( <EXPR> , <EXPR> , ... , <EXPR> )
2) Function definition:
DEFINE <ID> ( <ID> , <ID> , ... , <ID> ):
<EXPR>
END_FUNCTION
or with locals:
DEFINE <ID> ( <ID> , <ID> , ... , <ID> ):
LOCALS <ID>,<ID>,...,<ID>
<EXPR>
END_FUNCTION
19.1
The Parts-Of-Speech <IDLIST> And <EXPR_LIST>
Let's introduce the part-of-speech <IDLIST> to denote lists of <ID>s:
<ID> , <ID> , ... , <ID>
Let's also introduce the part-of-speech <EXPR_LIST> to denote lists of
<EXPR>s, as we find in a function call:
<EXPR> , <EXPR> , ... , <EXPR>
Recall that we have the declarations for the parts-of-speech <ID> and
<EXPR>:
POS ID : TEXT ;
POS EXPR[8] : - ; "Invocation of the semantics
generates a machine language program
which leaves the answer in register 1"
Let's declare the plurals of the types TEXT and BASIC_PROCESS:
TYPE TEXTS = { TEXT } ;
TYPE BASIC_PROCESSES = { BASIC_PROCESS } ;
We now declare our new parts-of-speech:
POS IDLIST : TEXTS ;
POS EXPR_LIST : BASIC_PROCESSES ;
An <IDLIST>'s semantics is a list of identifiers (texts). An EXPR_LIST
is a list of BASIC_PROCESSes, a list of <EXPR>-semantics.
We introduce the <IDLIST> and <EXPR_LIST> rules:
<ID:x> -> <IDLIST: {x} >
<IDLIST:x> , <ID:y> -> <IDLIST: x $> y >
<EXPR:x> -> <EXPR_LIST: {x} >
<EXPR_LIST:x> , <EXPR:y> -> <EXPR_LIST: x $> y >
(The expression "X $> Y" denotes the list X with one more element, Y,
appended to the end. The "{X}" denotes the list containing only the
one element X. See Sections 22.1.3 and 23.3).
19.2
The Function Calling Rule
We introduce the function calling rule:
<ID:name> ( <EXPR_LIST: params> ) ->
<EXPR: CALL_FUNCTION(name,params); >
An <ID> followed by a list of parameters, <EXPR>s, enclosed in
parentheses, denotes a call. This supports at least the notations:
POWER_OF_2( 5 )
and
POWER_OF_2( X )
and
POWER_OF_2( X+5 )
Each of these denotes a call to the function whose name is the <ID>,
POWER_OF_2 in these examples. We will come back to define the
semantic action CALL_FUNCTION.
19.3
The Function Definition Rule
We introduce the function declaration rule:
DEFINE <ID:name> ( <IDLIST:params> ):
LOCALS <IDLIST:locals>
<EXPR:body>
END_FUNCTION ->
" An immediate action... "
DEFINE_FUNCTION( name, params, locals, body );
This function definition rule rewrites to nothing. It merely calls
DEFINE_FUNCTION immediately upon recognition.
19.3.1
Keeping Track Of Functions
We will need to keep track of ~all defined functions, so that we know
which one to call upon seeing a function call. We introduce the
following type:
TYPE DEFINED_FUNCTION = [NAME: TEXT START_ADDRESS: LABEL];
A DEFINED_FUNCTION associates a function NAME with its START_ADDRESS.
The START_ADDRESS tells us where to CALL if we want to call the
function of a given name (NAME). (Recall from Chapters 6 and 7 that
any address can be represented by a LABEL).
The list of all functions is declared by:
TYPE FUNCTIONS = { DEFINED_FUNCTION } ;
Let's declare the variable that will hold all defined functions:
VAR ALL_FUNCTIONS = FUNCTIONS ;
The declaration of a function will add an entry onto ALL_FUNCTIONS.
Function calling will read ALL_FUNCTIONS.
BOX: What new rules of grammar are introduced to facilitate
BOX: function definition and calling?
BOX:
BOX: What are <IDLIST> and <EXPR_LIST>?
BOX:
BOX: How do we keep track of functions?
19.4
Function Calling Implementation
We implement the semantics for the function calling rule, the
procedure CALL_FUNCTION. Recall that CALL_FUNCTION was referenced
within an <EXPR>. To satisfy the conventions of <EXPR>s, CALL_FUNCTION
must generate code that leaves the result of the function in register
1.
We assume that our other function DEFINE_FUNCTION (coming up soon),
compiles each function so as to leave the function's value in register
1. Thus, when we call a function, the result will be in register 1
by the time we reach just beyond the CALL instruction.
Here is CALL_FUNCTION:
DEFINE CALL_FUNCTION( NAME: TEXT PARAMS: BASIC_PROCESSES ):
BEGIN VAR DF=DEFINED_FUNCTION; E=BASIC_PROCESS;
IF "there is a defined function of our given NAME... "
THERE_IS DF.NAME = NAME FOR DF $E ALL_FUNCTIONS;
THEN "DF holds the function we want to call."
"Call it by first pushing each parameter onto the
stack..."
FOR E $E PARAMS; "For each argument E..."
DO
<*E*>; "Generate machine language which
evaluates the parameter and puts its
value into register 1."
"Push register 1 onto the stack..."
ADD( SP, ADDRESS_OF(1) );
STORE( 1, 0 \OFF_OF SP );
END
"All parameters have been pushed. Now issue the call
to the function's start address..."
CALL( RETURN, DF.START_ADDRESS );
"The answer is now in register 1, by convention"
ELSE HELP; "No function of the given NAME can be found"
FI
END
ENDDEFN
CALL_FUNCTION pushes the parameters and CALLs the named function, if
that function exists. If none is found, the ELSE-clause executes,
calling HELP. Eventually, that HELP should be replaced by an
informative message to the user, indicating that he or she has called
a nonexistent function.
(The "THERE_IS" notation is covered in Section 22.1.8. The
"FOR E $E PARAMS;" appears in Section 22.3.1).
19.5
Function Definition Implementation
We now introduce the function-definition semantics, the function
DEFINE_FUNCTION:
DEFINE DEFINE_FUNCTION( NAME: TEXT PARAMS,LOCALS: TEXTS
BODY: BASIC_PROCESS ):
1) Create a label and append onto ALL_FUNCTIONS that label
with this function NAME.
2) Generate machine language for this function:
a) Generate the initial program fragment
b) Modify the symbol table so that each parameter and
each local is associated with an address expression
like "-3 \OFF_OF FP"
c) Generate the BODY
d) Generate the final program fragment.
ENDDEFN
We define this formally as:
DEFINE DEFINE_FUNCTION( NAME: TEXT PARAMS,LOCALS: TEXTS
BODY: BASIC_PROCESS ):
BEGIN VAR T = TEXT; I, N_PARAMS, N_LOCALS = INT; START=LABEL;
"1) Create the start address label, and augment ALL_FUNCTIONS"
START:= NEW;
ALL_FUNCTIONS:= ALL_FUNCTIONS $>
[NAME:NAME START_ADDRESS:START] ;
"Count up the number of parameters and the number of locals..."
N_PARAMS:= LENGTH( PARAMS );
N_LOCALS:= LENGTH( LOCALS );
"2) Generate machine language for the function..."
" a) The initial program fragment... "
EQU(START); "Define the start address to be here"
STORE( RETURN, 1 \OFF_OF SP ); "(as shown earlier)"
STORE( FP, 2 \OFF_OF SP );
LOAD( FP, SP );
ADD( SP, ADDRESS_OF( 2 + N_LOCALS ) );
" b) Modify the symbol table so that each parameter and local
is associated with an address expression... "
HOLDING SYMBOL_TABLE; "Our effect on SYMBOL_TABLE is meant to
be temporary"
DO
( FOR T $E PARAMS; && FOR I FROM -N_PARAMS+1 BY 1; )
THEN
( FOR T $E LOCALS; && FOR I FROM 3 BY 1; )
DO
SYMBOL_TABLE:= [NAME:T ADDRESS: I \OFF_OF FP ]
<$ SYMBOL_TABLE;
END
" c) Generate the body... "
<*BODY*>;
"The answer is in register 1, because BODY is the
semantics of an EXPR."
ENDHOLD "(pull back the symbol table to where it was)"
" d) The final program fragment... "
LOAD( RETURN, 1 \OFF_OF FP );
LOAD( FP , 2 \OFF_OF FP );
SUB( SP, ADDRESS_OF( N_PARAMS + N_LOCALS + 2 ) );
JUMP( 0 \OFF_OF RETURN );
END
ENDDEFN
We have now implemented function definition and function calling.
(Step 2b, where we augment the symbol table, we perform:
( FOR T $E PARAMS; && FOR I FROM -N_PARAMS+1 BY 1; )
THEN
( FOR T $E LOCALS; && FOR I FROM 3 BY 1; )
DO
...
END
The first three lines form a quantifier, built out of four "FOR"
quantifiers. The "&&" and "THEN" operators that combine the four
quantifiers are covered in Section 22.3.2.
The first line sets T to each parameter name, and simultaneously sets
I to be the correct displacement off of FP. Then, ~after all those
iterations are complete, the third line sets T to each local variable
name, and simultaneously sets I to be the correct displacement for that
local. The "..." augments the symbol table on each iteration).
BOX: How many elements are pushed onto the stack by the code
BOX: generated by CALL_FUNCTION?
BOX:
BOX: How many instructions does DEFINE_FUNCTION generate
BOX: upon entry to the function, and then upon exit?
BOX:
BOX: How long do we keep the local and parameter values
BOX: in SYMBOL_TABLE?
BOX:
BOX: What earlier program of ours uses SYMBOL_TABLE
BOX: (in Chapter 8)?
BOX:
BOX: Why will the locals and parameters appear to be
BOX: available?
19.5.1
A Slight Modification To Chapter 8's SYMBOL_TABLE
A detail must be noted relative to chapter 8's compiler. The
type of the variable SYMBOL_TABLE was declared by:
TYPE NAME_AND_ADDRESS = [NAME:TEXT ADDRESS:LABEL] ;
SYMBOL_TABLE = { NAME_AND_ADDRESS } ;
We must change the LABEL to now be a MACHINE_ADDRESS_EXPRESSION,
which is capable of representing address expressions like
"I \OFF_OF FP". MACHINE_ADDRESS_EXPRESSION was introduced in Section
9.8.1.
Also, the function:
ADDRESS_OF_VARIABLE( TEXT ) -> LABEL
must now be changed so as to deliver a MACHINE_ADDRESS_EXPRESSION
instead of a LABEL:
ADDRESS_OF_VARIABLE( TEXT ) -> MACHINE_ADDRESS_EXPRESSION
All the other parts in Chapter 8 need no further change.
(The LABELs coming out of ADDRESS_OF_VARIABLE were used ~only as
operands in instructions. Instructions always take in
MACHINE_ADDRESS_EXPRESSIONs, as of Section 9.8.1. Thus, all
uses of ADDRESS_OF_VARIABLE will be automatically upward compatible
with this change).
(We had gotten by supplying merely LABELs because of the two coercions
that let a LABEL be seen also as a MACHINE_ADDRESS_EXPRESSION:
LABEL -> ADDRESS_EXPRESSION "from the definition of
the type ADDRESS_EXPRESSION"
and
ADDRESS_EXPRESSION -> MACHINE_ADDRESS_EXPRESSION
"from Section 9.8"
19.6
Mutual Recursion
It is possible that a function F calls a function G, and that G also
calls F. This relationship is known as ~mutual ~recursion.
Suppose F's definition appears before G's definition. While F's
definition is being processed, G hasn't been seen yet. Therefore,
when F calls G, there will be an error because G is unknown at this
time.
Switching the order of F's and G's definitions won't solve the problem,
as during G's processing, F won't be known yet.
We would therefore like to delay the processing of any function
definition until ~after all function definitions have been seen.
Let's change DEFINE_FUNCTION so that it augments ALL_FUNCTIONS
immediately as it does now, but let's delay the generation of the
function's machine language program. Where DEFINE_FUNCTION now appears
as:
DEFINE DEFINE_FUNCTION( NAME: TEXT PARAMS,LOCALS: {TEXT}
BODY: BASIC_PROCESS ):
BEGIN VAR T = TEXT; I, N_PARAMS, N_LOCALS = INT; START=LABEL;
START:= NEW;
ALL_FUNCTIONS:= ALL_FUNCTIONS $>
[NAME:NAME START_ADDRESS:START] ;
***
END
ENDDEFN
let's change it to be:
DEFINE DEFINE_FUNCTION( NAME: TEXT PARAMS,LOCALS: TEXTS
BODY: BASIC_PROCESS ):
BEGIN VAR START=LABEL;
START:= NEW;
ALL_FUNCTIONS:= ALL_FUNCTIONS $>
[NAME:NAME START_ADDRESS:START] ;
POST_PROCESSING := POST_PROCESSING $>
//[PARAMS;LOCALS;BODY;START;]
BEGIN VAR T=TEXT; N_PARAMS,
N_LOCALS=INT;
***
END
\\ ;
END
ENDDEFN
We've taken everything beyond our augmentation of ALL_FUNCTIONS, and
enclose it within //...\\. This delays its execution. When it will
be invoked, it will generate the machine language program for this
function.
That is, instead of generating machine language now, we enclose that
generation within //...\\, forming an BASIC_PROCESS, and append it to a new list
POST_PROCESSING. This doesn't generate machine language now, and so no
function ~calls will be considered yet.
(We declare POST_PROCESSING to hold such BASIC_PROCESSes by:
VAR POST_PROCESSING = BASIC_PROCESSES ;
Let's require the programmer to write END_OF_PROGRAM so that we know
when the end is reached. We introduce this new rule of grammar, and
have it immediately invoke all the BASIC_PROCESSs on POST_PROCESSING,
via:
END_OF_PROGRAM -> BEGIN VAR S = BASIC_PROCESS ;
FOR S $E POST_PROCESSING;
DO <*S*>; END
END
Thus when END_OF_PROGRAM is seen, we will proceed to generate machine
language for each function. (Each BASIC_PROCESS in POST_PROCESSING generates
the machine language for one function).
Consider our example where F calls G and G calls F. F and G will be
entered onto ALL_FUNCTIONS and POST_PROCESSING by the time
END_OF_PROGRAM is seen. No matter whether F or G appears first on
POST_PROCESSING, both will have their machine language generated in the
context where both F and G are known, on ALL_FUNCTIONS.
This simple change solves the mutual recursion problem. Unfortunately,
PASCAL doesn't solve it; instead it forces the user to double-declare
functions, using FORWARD statements. It is the type BASIC_PROCESS in
ICL makes our easy solution possible.
BOX: What is ~mutual ~recursion?
BOX:
BOX: How do we solve the mutual recursion problem?
BOX:
BOX: What is END_OF_PROGRAM for?
19.7
Using Local Variables For Intermediate Results
To make the whole system including chapter 8 work, we need to
change the <EXPR>-<BOP>-<EXPR> rule and the <EXPR>-<COMPARATOR>-<EXPR>
rule. Let's concentrate on the <EXPR>-<BOP>-<EXPR> rule. Its
original text appears now:
<EXPR[i]: left > <BOP[j]: op > <EXPR[k]: right > ->
IF i =< j & k < j THEN
<EXPR[j]:
HOLDING SPARE_MEMORY:= SPARE_MEMORY + 1; "Allocate spare
bin in memory"
DO
<*RIGHT*>; "Get righthand side into register 1"
STORE(1,SPARE_MEMORY); "Store it in memory"
<*LEFT*>; "Get lefthand side into register 1"
HOLDING RIGHT_OPERAND:= SPARE_MEMORY;
DO <*OP*>; "Operate on register 1 and
SPARE_MEMORY. Answer is in
register 1"
ENDHOLD
ENDHOLD >
FI
This rule uses SPARE_MEMORY to allocate memory for intermediate
results. Each time the expression is executed, it uses that same
memory.
The subtle but devastating problem we wish to avoid is the
reuse of that same memory upon recursive function calls. For example,
the following function F calls itself recursively, from within the
middle of an expression:
DEFINE F( X: REAL ) = REAL :
IF X = 0 THEN 0
ELSE 2*X + F(X-1) FI
ENDDEFN
Concentrate on the ELSE-clause. The expression
2*X + F(X-1)
first computes 2*X and stores this intermediate result into a fixed
memory location, one that was allocated by SPARE_MEMORY while
translating this expression (<EXPR>) into machine language.
Naturally, we expect this intermediate value to be preserved safely
while executing the righthand part of the expression, F(X-1).
Unfortunately, F(X-1) calls this same function, which in its ELSE-
clause may need to evaluate the expression again. Upon this second
incarnation of F, where X is one smaller than in the first
incarnation, 2*X will be computed and stored into the same fixed bin
in memory. This overwrites the data put there by the first incarnation
of F!
Thus, when the second incarnation returns and the first incarnation
resumes execution, it will pick up not the value 2*X that it had
computed, but the 2*(X-1) stored there by the second incaranation of F,
the F(X-1).
For recursive functions like F, it is not acceptable to store
intermediate results into fixed bins in memory. Instead, let's store
all intermediate results on the stack, like we do with local variables.
Let's use SPARE_MEMORY, not as a fixed bin in memory, but as an
offset from the frame pointer, i.e., as in:
SPARE_MEMORY \OFF_OF FP
This address expression using SPARE_MEMORY denotes a local
variable. (Recall that all local variables are represented by address
expressions of the form
I \OFF_OF FP ).
The new spare memory is allocated as a new local variable on the stack,
always relative to FP, as in figure 19.1. This is a safe use of
memory because different incarnations of the same function always have
different values in FP, thus keeping their locals entirely separate.
Here's the updated rule:
<EXPR[i]:left> <BOP[j]:op> <EXPR[j]:right> ->
IF I =< J & K < J THEN
<EXPR[j]:
HOLDING SPARE_MEMORY:= SPARE_MEMORY + 1 ;
DO
<*RIGHT*>;
STORE( 1, SPARE_MEMORY \OFF_OF FP );
<*LEFT*>;
HOLDING RIGHT_OPERAND:=
SPARE_MEMORY \OFF_OF FP ;
DO
<*OP*>;
ENDHOLD
MAXIMUM_LOCAL:= MAXIMUM_LOCAL \MAX
SPARE_MEMORY ;
ENDHOLD
>
FI
This looks very much like the original in chapter 8. Now, we always
use SPARE_MEMORY in the address expression
SPARE_MEMORY \OFF_OF FP
so that the bin of intermediate storage is a local variable. Our
incrementing of SPARE_MEMORY, like before, allocates a new variable,
a local variable, to hold the intermediate value, RIGHT's value.
19.7.1
Keeping Track Of All Locals
We've introduced the following statement at the end:
MAXIMUM_LOCAL:= MAXIMUM_LOCAL \MAX SPARE_MEMORY ;
MAXIMUM_LOCAL records the ~highest value in SPARE_MEMORY ever used,
and thus participates in determining how big the locals section should
be. The value in MAXIMUM_LOCAL will be read ultimately within
DEFINE_FUNCTION, as it increments SP to allocate space for locals.
BOX: Why must we store intermediate results on the stack
BOX: instead of in fixed bins in memory?
BOX:
BOX: What change did we make to get intermediate results
BOX: to reside on the stack?
BOX:
BOX: Why do we use FP instead of SP for local variables'
BOX: address expressions?
19.7.2
The Late Arrival Of MAXIMUM_LOCAL
Unfortunately, MAXIMUM_LOCAL does not have the correct value until
~after the function body is translated into machine language. Only
during the invocation of the body, "<*BODY*>;", will this
<EXPR>-<BOP>-<EXPR> rule's semantics be invoked, and so only then will
MAXIMUM_LOCAL be made up to date.
However, we need MAXIMUM_LOCAL ~prior to the function body. We need
it for generating the initial program fragment, as we increment SP
to allcoate space for ~all locals.
To accomodate the late arrival of MAXIMUM_LOCAL, we can proceed in
either of two ways. First, we could generate the initial program
fragment ~last, as in:
label: the function body
the final program fragment
START: the initial program fragment
JUMP( label );
This late initial fragment is late enough to have had the function
body already generated and hence to have had MAXIMUM_LOCAL updated
completely. That late initial segment ultimately JUMPs backwards to
the new label, where the function body starts.
We must first initialize SPARE_MEMORY and MAXIMUM_LOCAL before we
invoke the BODY:
SPARE_MEMORY:= N_LOCALS + 2 ;
MAXIMUM_LOCAL:= SPARE_MEMORY ;
After the function body is generated (the <*BODY*>;"), MAXIMUM_LOCAL
will tell exactly how much to increment SP so as to contain all locals,
both declared locals (from the LOCALS clause) and the locals allocated
via SPARE_MEMORY.
Finally, our old DEFINE_FUNCTION is rendered up to date by replacing
appearences of N_LOCALS in the ADD and SUB instructions. There,
N_LOCALS always appears in an expression:
N_LOCALS + 2
Replace that expression now by MAXIMUM_LOCAL.
19.7.3
Another Approach
Another option is to leave the initial program fragment up front,
and to create a label that represents the number of locals, e.g.:
"the initial program fragment"
EQU(START);
FINAL_NUMBER_OF_LOCALS:= NEW; "the new label"
STORE( RETURN, 1 \OFF_OF SP ); "unchanged"
STORE( FP, 2 \OFF_OF SP ); "unchanged"
LOAD( FP, SP ); "uncahnged"
ADD( SP , ADDRESS_OF( FINAL_NUMBER_OF_LOCALS ) );
The value of the label FINAL_NUMBER_OF_LOCALS is unknown at this time.
After the invocation of BODY, we perform:
EQU( FINAL_NUMBER_OF_LOCALS, MAXIMUM_LOCAL ) ;
to assign the right value to the label.
BOX: Why does MAXIMUM_LOCAL become up to date late?
BOX:
BOX: Why is SPARE_MEMORY initialized to N_LOCALS+2?
BOX:
BOX: How do we know the MAXIMUM_LOCAL will be big enough to
BOX: account for declared locals, as well as locals
BOX: allocated by the <EXPR>-<BOP>-<EXPR> rule?
19.8
For Debugging Aid
It may be that during program execution you will need to discover the
name of the function you are in. This is facilitated by pushing one
more item onto the stack upon function entry. Figure 19.2 shows an
extra bin pushed, the ~name of the function.
Now, if you are executing inside a function and you want to know the
function's name, the address expression "3 \OFF_OF FP" hold the
pointer to this function's name.
Often, you want to find out not only what function you're in, but also
which function called this function, and even what function called that
one. This is all meant to answer the question "How did I get here?"
Figure 19.3(a) shows the stack after F1 calls F2. There are two
frames here. One for F1, the outer function, and one for F2, the inner
function. The bin containing old FP appears here as a pointer back to
the previous frame.
Figure 19.3(b) shows three function calls: F1 calls F2 which calls F3.
Starting at FP, you find the name of the most recent function, F3,
via:
LOAD( 1, 3 \OFF_OF FP ); "Register 1 references the
function name, 'F3'"
Similarly, you find the next outer function by first acquiring the old
FP, via:
LOAD( 1, 2 \OFF_OF FP ); "old FP"
and then acquiring that function's name, F2, via:
LOAD( 1, 3 \OFF_OF 1 ); "Register 1 references the name
F2"
Finally, you acquire the name of the third outer function via:
LOAD( 1, 2 \OFF_OF FP ); "old FP"
LOAD( 1, 2 \OFF_OF 1 ); "older FP"
LOAD( 1, 3 \OFF_OF 1 ); "the name 'F3'"
You can now answer the question "How did I get here?" by providing a
~backtrace. A backtrace is a list of the function names, starting with
the most recently called function, followed by the next outer function
that called this one, etc.
In reverse order, the backtrace tells you what function called what
function, etc., all the way until it ends with the function in which
you are executing presently. Backtraces are essential during
debugging, because "How did I get here?" is the most popular question!
BOX: What is a ~backtrace?
BOX:
BOX: How does one stack frame point to the previous one?
BOX:
BOX: How many instructions would it take to get the name
BOX: of the N'th function in the backtrace?
BOX:
BOX: Can you imagine a way by which the function name could
BOX: be acquired ~without any extra instructions, i.e.,
BOX: without pushing that extra pointer onto the stack?
19.9
Another Implementation Of Stack Frames
Often, greater efficiency in function calling can be gained by another
method of managing stack frames. This method is used in the MIPS
compilers []. We omit the FP altogether, and relieve the original
need for it. We needed FP only because SP could change, due to pushes
onto the stack. Let's now insist that SP never changes, that is, we
will forbid the direct use of pushes and pops.
Figure 19.4 shows our stack frame, now without FP and without any
storage for old FP. We increment SP so as to allocate not only locals,
but some additional space as well, where we will put the parameters for
functions we call.
Instead of pushing parameters onto the stack, we will store them
directly into the locations:
0 \OFF_OF SP for the last parameter
-1 \OFF_OF SP for the second to last parameter
...
-K+1 \OFF_OF SP for the first parameter in K parameters
Thus, to the function we call, it still looks as though its parameters
were pushed onto the stack.
19.9.1
Watch Out For Nested Function Calls
However, we must be careful how we "push" parameters onto the stack.
Figure 19.5(a) shows what we expect the stack to look like prior to
issuing the CALL instruction, if we're passing in two parameters
param1 and param2.
Imagine that we store param1 there first. The subsequent computation
of param2 may itself need to call a function, e.g., as in:
F( X , G(Y,Z) )
(The function G must be called to deliver the second parameter for F).
Unfortunately, that computation (G) will use the ~same parameter
space. Figure 19.5(b) shows that in calling G, those two final bins
in the parameter space will be overwritten, ~killing ~our ~original
~param1!
It is no longer acceptable to store parameters directly into parameter
space as each is computed. All parameters, except the last, must be
stored elsewhere, somewhere in the locals space.
With the original method, the stack top (SP) would grow dynamically,
which assured that one could put parameters right where one wanted
them, on the top of the stack. The evaluation of other parameters
would increment SP, and hence use a ~higher part of the stack, leaving
our original parameters unscathed in their final positions on the
stack. The value of SP might be different at each call.
With the new method, time is saved by no longer having to save and
restore FP, although this savings may be consumed in the extra
movement of parameters. For functions having two or less parameters,
only the first parameter need be moved, at an expense of one LOAD and
one STORE. In contrast, omitting FP saves us three instructions (the
instructions involving FP).
Thus if a majority of your functions have two or fewer parameters,
this latest method for treating stack frames is the most efficient.
Generally, the majority of function calls do involve two or less
parameters.
BOX: In using the new conventions, what instructions can
BOX: be omitted from the initial and final program
BOX: fragments?
BOX:
BOX: Why can we omit FP altogether?
BOX:
BOX: What extra complication arises with this method?
BOX:
BOX: Which set of conventions is most efficient for
BOX: functions having many parameters?
19.10
Passing Parameters By ~Value, By ~Name, And By ~Reference
The passing of parameters into functions can be subject to different
interpretations. For example, suppose we define F as follows:
DEFINE F( X: REAL ) :
BEGIN VAR Y,Z=REAL;
Z:= 0;
FOR Y FROM 1 TO 5; DO Z:= Z+X ; END
WRITE(Z);
END
ENDDEFN
F finally writes the value in Z. Suppose we call F via:
F( Y+1 )
19.10.1
Passing By Value
The most intuitive interpretation is called passing by ~value.
We've adhered throughout this text to this form of parameter passing.
If Y holds presently a 1, then our function call:
F(Y+1)
will call F with the value 2. When F writes Z, we will see a 10,
because Z is the sum of X made 5 times, and X containes the ~value 2.
F's parameter X is substituted with its value, a 2, as in:
FOR Y FROM 1 TO 5; DO Z:= Z+2 ; END
19.10.2
Passing By Name
A very different interpretation, called passing by ~name, would have F
write the number 20. Passing by name means that we pass in not the
value of Y+1, a 2, but instead, the expression "Y+1" itself. Inside of
F, the parameter X is substituted not by the value 2, but by the
expression "Y+1".
The FOR-loop, with this substitution, reads as:
FOR Y FROM 1 TO 5; DO Z:= Z + Y+1 ; END
This adds together the numbers Y+1 where Y takes on the values 1 thru
5:
1+1 + 2+1 + 3+1 + 4+1 + 5+1
Passing by name thus passes programs, e.g., "Y+1".
One disasterous consequence of passing by name is the dependence of
parameters, e.g., "Y+1", on local variables within the function, e.g.,
F's local variable Y. The disaster occurs when the author of F changes
F, perhaps deleting or renaming the local variable Y.
A safe way to achieve the effects of passing by name is offered in ICL.
ICL passes parameters only by value, but that value can itself be a
program.
Programs passed around as data are called processes in ICL, and are
covered in Section 23.6. They provide an airtight way to pass
programs. The process passed in ("Y+1") has no access to the local
variable Y. Instead, the function F ~explicitly passes data into the
given process.
19.10.3
Symmetry Between Parameter Passage And Assignment Statements
One might ask "If I can pass a parameter by name, why can't I ~assign
by name into a variable?" That is, if we can pass Y+1 by name, why
can't we perform:
V:= Y+1 ;
and have V represent the program "Y+1" from here after?
This symmetry between parameter passing and assignment statements is
preserved in ICL, as processes can be assigned into variables as well
as being passed in as parameters.
19.10.4
Pass By Reference
Another interpretation for what passing parameters means is exposed in
this example:
DEFINE F( X: REAL ):
X:= 5;
END
We are assigning a value into our parameter.
Let's call F, as in:
F(Z) ;
Does this call have the side effect of setting Z to 5? If we're
passing by value, the answer is no. If we're passing by reference, the
answer is yes.
ICL supresses this kind of backwards transmission of information made
possible by pass-by-reference, to minimize side-effects. However,
ICL's objective modification (the "@" in Chapter 10, or Section 22.1.11)
can be used to achieve the same effect. You pass in a block of memory
and have the function write directly into it, via objective
modification.
This concept of objective modification exists independently of
parameter passing. Thus, the symmetry between parameter passing and
assignment statements continues to be maintained in ICL.
BOX: What is meant by ~pass-by-value, ~pass-by-reference,
BOX: and ~pass-by-name?
BOX:
BOX: How can ~pass-by-name destroy an otherwise airtight
BOX: separations between the insides and outsides of a
BOX: function?