CHAPTER 34


				VARIABLE-SIZED MEMORY MANAGEMENT



	  We consider automatic memory management for blocks of different sizes.
	  This form of memory management is more expensive than that for fixed-
	  sized blocks.  Ideally, both forms of memory management would be
	  present.	Very small blocks would come from fixed-sized block space.
	  ~Lists are ideally implemented by our 8-byte fixed-sized blocks.
	  Records and arrays on the other hand, with their random accessiblity,
	  would be implemented by larger blocks, from the variable-sized block
	  space


34.1
Non-Copying Memory Management

	  One could extend our fixed-sized memory management to apply to
	  variable-sized blocks.  Imagine a free list consisting of blocks of
	  different sizes.  The added complexity here occurs when one wants a
	  new block of some given size.  The free list must now be ~searched
	  in order to find a block of the desired size.	 This can be expensive
	  when compared to our old way of taking always the first block off the
	  free list.

	

	  In searching the free list, one could procede in two ways.  You can
	  search either for a block whose size matches exactly the desired size,
	  or stop as soon as any block large enough is found.	 If an exact size
	  match can't be found, then we are stuck using a bigger block (figure
	34.1(a)).

	The remainder of the bigger block, after claiming the portion of
	desired size, becomes a smaller block that goes back onto the free
	list.  Upon repeated requests for blocks, the free list may wind up
	containing many small blocks, most of which might be too small to
	satisfy subsequent requests for blocks.  The proliferation of small
	blocks leaves memory ~fragmented (figuve 34.1(b)).

	The fragmentation of memory can render the following situation.  A
	block allocation can fail by not finding any block big enough.  Yet,
	not all of memory is in use.  The fragments are unusable.

	We won't pursue this scheme.


34.2
Copying Memory Management

	  To rid ourselves of fragmentation, we can envision ~moving blocks
	  closer together, so as to eliminate all fragments.	This moving is
	  refered to as ~copying.

	  As in our simplest memory management scheme, we always allocate a new
	  block off the top of memory.  This way, we get a block exactly of the
	  desired size.  When the end of memory is reached, we perform garbage
	  collection.  As before, we ~mark all blocks in use by following all
	  pointers.	 Once the used blocks are detected, we move all blocks
	  downward in memory, so that all blocks in use are compressed into a
	  continguous chunk at the bottom of memory.

	  This leaves one contiguous chunk of free space at the top of memory.
	  We then allocate blocks from there, as before.  No free
	  list is required, just a pointer to the first bin of the free space.

	  The two new expenses in this new scheme are the ~copying, and the
	  ~translation of addresses.	The copying is present of course to avoid
	  fragmentation.	Once blocks have been moved, existing pointers to the
	  blocks cease to be correct.	 A pointer pointing to a given block must
	  be changed so as to continue to point to the block at its new
	  residence.

		    BOX:	What's the difference between copying and non-copying
		BOX:	garbage collection?
		BOX:
		BOX:	What advantages does copying offer?
		BOX:
		BOX:	What expenses are new, above and beyond our original
		BOX:	fixed-sized block garbage collection cost?


34.2.1
A Format For Variable Sized Blocks

	

	We introduce two bins of overhead into each block (figure 34.2(a)).
	The first bin will be used to assist in the translation of pointers,
	which is required of course after moving the blocks.  This bin is
	called the ~forwarding ~pointer.  We will return to this bin later.

	The second bin holds basically the ~size of the block.  The size is
	used when we ultimately move the block.  It also indicates where in
	memory the next block, beyond ours, resides.

	There are three different representations for the second bin
	(figure 34.2(b)).  We have to know which bins in the block are pointers
	and which are not.  We need this when MARK follows all ~pointers to
	discover all blocks in use.

	The first format, the "00" case, indicates that there are ~no pointers
	in the block.  The second format, the "01" case, indicates that ~all
	bins in the block are pointers.  These two forms implement ~arrays,
	where all elements (bins) in the array are of the same type.  The
	~element ~type of the array dictates which of these formats are used.
	The distinction here is similar to that made between our P-block and
	D-block.

	The third format is for blocks that contain both pointers and non-
	pointers.  First of all, we insist that all pointers appear ~first,
	and all non-pointers appear second.  We call these two sections the
	~pointer ~section and the ~non-pointer ~section.  The format of the
	second bin, the size bin, is the third shown.  It contains the size
	of the pointer section and the total size of the block.

	Blocks that include both pointers and non-pointers are useful for
	~records, where some components of the record are of types represented
	by pointers and others are of non-pointer types.  The compiler, behind
	the programmer's back, can vary the order of the components so that
	  all pointer components appear first.

	  The main use of this mixed-mode block is records.  The total number of
	  bins in the block is small.	 It is merely the number of components in
	  the record.  This relatively small size allows us to represent the size
	  in only half a bin (say 15 bits).	 Records having more than 32000
	  components are unheard of.

	  Keep in mind that all pointers to blocks point to the ~first ~bin in
	  the block.


34.3
The Four Phases Of Variable-Sized Block Memory Management

	  Although disk-residency can easily be made to work with variable-sized
	  blocks, we ignore the disk for now.  The following thus replaces the
	  ~mark and the ~sweep phases.  It may be easiest to think of this
	  endeavor as extending our simple garbage collection method, which had
	  only the mark and sweep phases.

	  The four phases required for variable-sized memory management are:

	     1)   ~Mark all reachable blocks.
	     2)   ~Plan the whereabouts of blocks' new residences
	   3)	~Translate all pointers
	   4)	~Move blocks to their new residences.

	The first phase is familiar.


34.3.1
Phase 1 - Marking

	We mark all reachable blocks like we did before.  Now, however, some
	pointers can point to variable-sized blocks.  We mark a variable-
	sized block exactly as we mark any other block.  (The mark bit resides
	in one of the two overhead bins).  Upon marking a variable-sized
	block, we go on to mark all blocks referenced by pointers in that
	block.  The second overhead bin tells us which bins in the block are
	pointers.  We follow those.


34.3.2
Phase 2 - Planning New Residences

	We now pretend to move the blocks.  Actually, we merely plan where
	each block will reside after compaction (moving).

	

	Now that we know the subset of blocks in use (figure 34.3(a)), we can
	imagine their compaction (figure 34.3(b)).  One simply scans the set
	of used blocks, from the bottom to top of memory.  We maintain a
	special pointer that starts off at the bottom of memory.  Upon seeing
	each used block, we increment that special pointer by the size of the
	block.  That pointer indicates the new residence for the ~next used
	block.

	We make each block's forwarding pointer point to its new residence
	  (figure 34.3(c)).  This will assist us later, when we translate old
	  pointers into new pointers.	 One simply takes in a (old) pointer to a
	  block, fetches that block's forwarding pointer, and voila, we have the
	new pointer, pointing to where the this block ~will reside.


34.3.3
Phase 3 - Pointer Translation

	We scan once again all used blocks, both fixed-sized and variable-
	sized used blocks.  We scan linearly from the bottom of memory to the
	top.  We translate each pointer contained in each block.  This is done
	by fetching the forwarding pointer, which is found where the given
	pointer points.  We stuff this new pointer back into the same bin from
	which we got the old pointer.  This translates all pointers so as to
	point now to the referenced block's new address of residence.

	  For pointers to fixed-sized blocks, which reside in a different space
	  from variable sized blocks, no translation occurs.

	  During this scanning of fixed-sized block space, one can implement at
	  no expense the sweep phase.	 In common to both the translation of
	  pointers and the sweep phase is the need to scan fixed-block space
	  linearly from bottom to top.  Encountered unused fixed-sized blocks go
	  onto the free list, and encountered used blocks get their pointers (to
	  variable-sized blocks) translated.


34.3.4
Phase 4 - The Moving

	  Now that the pointers in each block are consistent with where the
	  blocks ~will reside, we can now move all blocks to their new
	  residences.  We've had to wait this long before moving the blocks
	because the previous phase relied on forwarding pointers.  Those
	forwarding pointers, which reside where old pointers point, had to be
	kept intact.  Once we move the blocks downward in memory, we will be
	writing over those forwarding pointers.

	The separate, translation phase had to finish before performing any
	block movement.


34.4
The Cost

	Of the four phases, the first and third correspond to the mark and
	sweep phases of fixed-sized garbage collection.  (As indicated earlier,
	the sweep for fixed-sized blocks can be implemented at no expense as
	part of the translation phase).  The new costs are the pointer
	translation and the final moving of blocks.

	

	An interesting phenomenon occurs.  Since blocks move only downward in
	memory, blocks that have been in use for a long time tend to reside
	at the bottom of memory.  It is often the case the very old blocks
	tend to stay in use, whereas a majority of young blocks tend to be
	freed soon after their creation.  Thus, the bottom of memory tends
	to hold only a few free blocks (figure 34.4).

	To reduce the cost of moving, one could leave unmoved blocks lower
	in memory.  As one scans memory from bottom to top, one can keep track
	of the percentage of that memory that is free.  As long as free memory
	makes up only a small fraction, moving can be avoided, at a cost of
	~losing only a small part of free memory.  Of course, during the
	scan from bottom to top, when the percentage free space scanned so
	far becomes significant, then it pays to start moving blocks.

	This optimization occurs during the planning phase (phase 2).


		BOX:	What are the four phases of variable-sized GC?
		BOX:
		BOX:	Why must all moving be delayed until after all pointer
		BOX:	translation is done?
		BOX:
		BOX:	Where in memory do older block congregate?