 We discuss the following optimizations in this chapter:
 Constant expression evaluation (constant folding).
 Scalar replacement of aggregates.
 Algebraic simplifications and reassociation.
 Value numbering.
 Copy propagation.
 Sparse conditional constant propagation.
 The first three of these can be done without regard to data flow
analysis while the last three need data flow information.
 Constant folding is computing, at compile time, the value of expressions
involving only constants.
 This transformation is always applicable for booleans and almost
always for integers.
 When wouldn't it be applicable for integers? (When evaluating the
expression would raise some sort of exception, e.g., divide by zero or
overflow.)
 In the case of addressing (pointer) arithmetic, constant folding is
always worthwhile and safe; overflow doesn't matter.
 The algorithm is straightforward: look at an instruction and see if its
operands are both constants.
 As for all transformations independent of data flow, the effectiveness
can be increased by combining them with data flow dependent transformations,
especially constant propagation.
 Constant expression evaluation (constant folding) is best structured
as a subroutine that can be invoked whenever needed in an optimizer.
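As a concrete illustration, here is a minimal folding subroutine in Python, operating on a hypothetical IR where the operands are already known to be integer constants; the operator names and the None-for-unsafe convention are assumptions for this sketch, not from the book:

```python
def fold(op, a, b):
    """Return the folded constant, or None when folding is unsafe."""
    if op == '+':
        return a + b
    if op == '-':
        return a - b
    if op == '*':
        return a * b
    if op == '/':
        if b == 0:           # folding would raise the exception at
            return None      # compile time instead of run time
        return a // b
    return None              # unknown operator: leave the instruction alone
```

A real folder for a fixed-width target would also reject results outside the representable integer range (overflow); for addressing arithmetic, as noted above, that check can be skipped.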
 Scalar replacement of aggregates works by determining which aggregate
components in a procedure have simple scalar values, such that both the
components and the overall aggregates are provably not aliased, and then
assigning them to temporaries whose types match those of the components.
 This makes other optimizations applicable to components of aggregates.
 This optimization is done in relatively few compilers.
 Components thus become candidates for register allocation, constant
and copy propagation, and other optimizations that apply to scalars.
 To perform the optimization we do the following:
 Divide each structure into a series of distinct variables.
 Perform the usual optimizations, particularly constant and copy
propagation.
 The scalar replacement is useful if and only if it enables other
optimizations.
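As a toy sketch of the first step, consider straight-line three-address code where instructions are (dest, op, arg1, arg2) tuples and aggregate components are written "record.field"; the IR shape and all names here are made up for illustration:

```python
def scalar_replace(code):
    """Split each aggregate component into its own scalar temporary."""
    temps = {}                      # "c.re" -> "c_re", etc.
    def rename(x):
        if isinstance(x, str) and '.' in x:
            return temps.setdefault(x, x.replace('.', '_'))
        return x
    return [tuple(rename(x) for x in inst) for inst in code]

# a complex number held as a record with .re and .im components
before = [('c.re', '=', 1.0, None),
          ('c.im', '=', 2.0, None),
          ('t',    '+', 'c.re', 'c.im')]
after = scalar_replace(before)
```

After the renaming, c_re and c_im are ordinary scalars, so constant and copy propagation, register allocation, and the other scalar optimizations apply to them. A real implementation would first prove that the components are not aliased.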
 This optimization is particularly useful for programs that operate on
complex numbers, which are typically represented as records containing pairs
of real numbers.
 As an example, on a particular benchmark that used complex numbers,
adding scalar replacement to the other optimizations resulted in an
additional 15% reduction in execution time.
 Algebraic simplifications use algebraic properties of operators or
particular operator-operand combinations to simplify expressions.
 Reassociation refers to using specific algebraic properties,
namely associativity, commutativity, and distributivity, to divide an
expression into parts that are constant, loop invariant, and variable.
 Like constant folding, algebraic simplifications and reassociation are
best structured in a compiler as a subroutine, called from any other phase
that can make use of it.
 Consider the following examples of patterns that would be considered
by this optimization:

i + 0 = 0 + i = i - 0 = i

0 - i = -i

i * 1 = 1 * i = i / 1 = i

i * 0 = 0 * i = 0

-(-i) = i

i + (-j) = i - j

b ∨ true = true ∨ b = true

b ∨ false = false ∨ b = b

i shl w = i shr w = 0, assuming the length of i in bits is at most w
 Some simplifications can be viewed as strength reduction.


2 * i = i + i
 Multiplications by a small constant can frequently be done faster by
sequences of shifts and adds. For example, i * 5 is computed by
t <- i shl 2
t <- t + i
and i * 7 by
t <- i shl 3
t <- t - i
 The above techniques are most effective if applied during code
generation.
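One simple way to derive such a sequence is from the binary representation of the constant: each set bit contributes one shifted copy of i. The sketch below (names hypothetical) uses adds only; a generator that also uses subtracts, as in the i * 7 example above, would handle runs of one-bits with fewer instructions:

```python
def mul_by_const(n):
    """Decompose multiplication by a positive constant n into shift amounts:
    i * n equals the sum of (i << s) over the returned list of shifts s."""
    shifts = []
    bit = 0
    while n:
        if n & 1:            # this bit of n contributes i shifted left by bit
            shifts.append(bit)
        n >>= 1
        bit += 1
    return shifts

def apply_shifts(i, shifts):
    # what the generated shift-and-add sequence would compute
    return sum(i << s for s in shifts)
```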
 Another class of simplification involves the use of commutativity and
associativity.
 Recognizing applicable algebraic simplifications is itself simplified
by canonicalization: putting each expression into a canonical form, with the
constant operand first and the variable second (when possible).
 This nearly halves the number of cases that need to be checked.
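A small sketch of the canonicalize-then-match idea: for a commutative operator the constant is moved to the left first, so each identity above needs only one pattern. The expression representation is hypothetical:

```python
def simplify(op, a, b):
    """Apply a few of the identities above after canonicalization."""
    # canonical order for commutative operators: constant operand first
    if op in ('+', '*') and not isinstance(a, int) and isinstance(b, int):
        a, b = b, a
    if op == '+' and a == 0:
        return b                    # 0 + i = i + 0 = i
    if op == '*' and a == 1:
        return b                    # 1 * i = i * 1 = i
    if op == '*' and a == 0:
        return 0                    # 0 * i = i * 0 = 0
    return (op, a, b)               # no simplification applies
```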
 Algebraic simplification and reassociation of addressing
expressions is a special case in that overflow makes no difference in
address computation.
 It may enable constant valued expressions that occur in addressing
computations to be evaluated at compile time, loop invariant expressions to
be enlarged and simplified, and strength reduction to be applied to larger
components of address computations.
 The general strategy of simplifying addressing expressions is
canonicalization, i.e., turning them into sums of products and then
applying commutativity to collect the constant valued and loop invariant
parts together.
 For example, let's assume we have a two-dimensional array a with
subscript ranges lo1..hi1 and lo2..hi2. The address of a[i][j], which is
contained inside a loop with index j, is:
base_a + ((i - lo1) * (hi2 - lo2 + 1) + j - lo2) * w
where:
 base_a is the address of the first element of the array.
 w is the size, in bytes, of objects of the base type of the
array.
 Reassociating the addressing expression to cluster the constant parts
at the left end gives us
-(lo1 * (hi2 - lo2 + 1) + lo2) * w + base_a
+ (hi2 - lo2 + 1) * i * w + j * w
and all of
-(lo1 * (hi2 - lo2 + 1) + lo2) * w
can be computed at compile time, while most of the rest,
base_a + (hi2 - lo2 + 1) * i * w
is loop invariant and can be computed once before the loop is entered,
leaving only the j * w part to be computed and added during each
iteration.
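A quick numeric sanity check, with arbitrary example values for the bounds and element size, that the reassociated form computes the same address as the original expression:

```python
# example values (all made up for illustration)
base_a, w = 1000, 4
lo1, lo2, hi2 = 1, 1, 10
i, j = 3, 7

original = base_a + ((i - lo1) * (hi2 - lo2 + 1) + j - lo2) * w

constant_part  = -(lo1 * (hi2 - lo2 + 1) + lo2) * w   # compile time
loop_invariant = base_a + (hi2 - lo2 + 1) * i * w     # once per loop entry
per_iteration  = j * w                                # each iteration
reassociated = constant_part + loop_invariant + per_iteration
```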
 The book shows a number of transformations that can be done on
addressing expressions. We won't look at them.
 Using these transformations, it is possible to greatly simplify
addressing expressions.
 Floating-point expressions are a problem. Why? (Some of the
properties of integer arithmetic don't hold for floating point.)
 Many numerical algorithms apply operations in a careful order. For
example, the smallest numbers are usually added to a sum first, so that the
large values don't swamp the significance of the small ones.
 Moving things around could cause problems with the results.
 Thus, little reassociation is done for floating point expressions.
 Value numbering is one of several methods for determining that
two computations are equivalent and eliminating one of them.
 It associates a symbolic value with each computation without
interpreting the operation performed, but in such a way that any two
computations with the same symbolic value always compute the same value.
 Even though they may appear to be, the following three transformations
are not equivalent:
 Value numbering.
 Copy propagation.
 Common subexpression elimination.
 We look at value numbering on basic blocks, but it has been extended
to work on extended basic blocks and even to entire procedures.
 We use hashing to partition the expressions that are computed into
classes. Each element in the hash table is a class.
 When we find an expression, we do the following:
 Compute its hash value.
 If it is not already in the sequence of expressions with that hash
value, add it to the sequence.
 If the expression computation occurs in an instruction that is not
an assignment, we split it into two instructions, the first of which
computes the expression and stores its value in a new temporary and the
second of which uses the temporary in place of the expression.
 If the expression is already in the sequence, we replace the current
computation by a use of the left hand variable in the instruction
represented in the sequence.
 The hash function and the expression matching function are defined to
take commutativity of the operator into account.
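A sketch of hash-based value numbering over a basic block of (dest, op, arg1, arg2) tuples. For brevity it assumes each variable is assigned at most once in the block, so kills need not be handled; the IR shape and names are made up:

```python
def value_number(block):
    table = {}                 # (op, vn1, vn2) -> variable holding that value
    vn = {}                    # variable -> its value-number key
    out = []
    for dest, op, a, b in block:
        ka = vn.get(a, a)      # a constant or unseen name numbers itself
        kb = vn.get(b, b)
        if op in ('+', '*') and str(kb) < str(ka):
            ka, kb = kb, ka    # commutative operator: canonical operand order
        key = (op, ka, kb)
        if key in table:
            # same symbolic value computed before: turn this into a copy
            out.append((dest, '=', table[key], None))
            vn[dest] = vn[table[key]]
        else:
            table[key] = dest
            vn[dest] = key
            out.append((dest, op, a, b))
    return out
```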
 Let us consider the code sequence shown in Figure 8.1(a).
Figure 8.1:
Example of value numbering

Figure 8.1(b) shows the value numbers after the first three
instructions. Figure 8.1(c) is the final result after value
numbering.
 It is left to the reader to consider the algorithm and also global
value numbering.
 Copy propagation is a transformation that, given an assignment
x <- y, for variables x and y, replaces later uses of
x with uses of y, as long as intervening instructions have not
changed the value of either x or y.
 The structure of a procedure is represented as an array of basic
blocks, each of which is an array of instructions.
 We must consider the relationship between copy propagation and
register coalescing.
 They are identical in their effect, as long as optimization is done
on the low level intermediate code with registers in place of identifiers.
 The method used to determine if the optimization applies is
different:
 Data flow analysis is used for copy propagation.
 The interference graph is used for register coalescing.
 Copy propagation can be performed on intermediate code at any level
from high to low. This is not true of register coalescing.
 For example, consider the flow graph in Figure 8.2(a).
Figure 8.2:
Example of copy propagation

 Consider how we get from Figure 8.2(a) to
Figure 8.2(b).
 This might not seem like much, but b is useless and the
assignment b <- a can be removed by dead code elimination.
 Within a basic block, you need to maintain an ACP (active copy
instructions) table.
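A minimal sketch of that local algorithm on a toy three-address IR, where instructions are (dest, op, args...) tuples and op '=' marks a copy; everything here is illustrative, not the book's code:

```python
def local_copy_prop(block):
    acp = {}                   # ACP table: copy target -> copy source
    out = []
    for dest, op, *args in block:
        args = [acp.get(a, a) for a in args]      # propagate available copies
        # an assignment to dest kills any copy that involves dest
        acp = {u: v for u, v in acp.items() if dest not in (u, v)}
        if op == '=' and isinstance(args[0], str):
            acp[dest] = args[0]                   # record the new copy
        out.append((dest, op, *args))
    return out

before = [('b', '=', 'a'),
          ('c', '+', 'b', 1),
          ('d', '=', 'b'),
          ('b', '+', 'd', 'c'),
          ('b', '=', 'd')]
after = local_copy_prop(before)
```

Running it on this five-instruction block reproduces the transformed code worked out in the example that follows.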
 Consider the following basic block and perform copy propagation. The
ACP column shows the table's contents just before each instruction is
processed.

Position | Code Before | ACP            | Code After
---------+-------------+----------------+-----------
   1     | b <- a      | {}             | b <- a
   2     | c <- b + 1  | {<b,a>}        | c <- a + 1
   3     | d <- b      | {<b,a>}        | d <- a
   4     | b <- d + c  | {<b,a>, <d,a>} | b <- a + c
   5     | b <- d      | {<d,a>}        | b <- a

 To perform global copy propagation, we first do a data flow analysis
to determine which copy assignments reach uses of their left hand variables
unimpaired (i.e., without having either variable redefined in between).
 We need two local sets (one for each basic block):
 COPY(i) - the instances of copy assignments occurring in block
i that reach the end of block i, i.e., COPY(i) is the set of quadruples
<u, v, i, pos> such that u <- v is a copy assignment, pos is the position
in block i where the assignment occurs, and neither u nor v is assigned
to later in the block.
 KILL(i) - the set of copy assignment instances killed by
block i, i.e., KILL(i) is the set of quadruples <u, v, blk, pos> such
that u <- v is a copy assignment occurring at position pos in a block
blk != i, and u or v is assigned to in block i.
 Next, we define data flow equations for CPin(i) and
CPout(i) that represent the set of copy assignments that are available for
copy propagation on entry to and exit from block i, respectively.
 A copy assignment is available on entry to block i if it is
available on exit from all predecessors of block i.
 The data flow equations are:
CPin(i) = the intersection of CPout(j) over all predecessors j of block i
CPout(i) = COPY(i) ∪ (CPin(i) - KILL(i))
and the proper initialization is CPin(entry) = ∅ and
CPin(i) = U for all i != entry, where U is the universal set of
quadruples, or at least the union of COPY(j) over all blocks j.
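These equations can be solved by a straightforward iterative fixed-point pass. In the Python sketch below, quadruples are tuples, and the block list, predecessor map, and precomputed COPY/KILL sets are supplied by the caller; the interface is illustrative only:

```python
def solve_cp(blocks, pred, COPY, KILL, entry):
    """Iterate the CPin/CPout equations to a fixed point; return CPin."""
    U = set().union(*COPY.values())       # the "at least" universal set
    cpin = {b: (set() if b == entry else set(U)) for b in blocks}
    cpout = {b: COPY[b] | (cpin[b] - KILL[b]) for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == entry:
                continue
            new_in = set(U)
            for p in pred[b]:
                new_in &= cpout[p]        # intersection over predecessors
            new_out = COPY[b] | (new_in - KILL[b])
            if new_in != cpin[b] or new_out != cpout[b]:
                cpin[b], cpout[b], changed = new_in, new_out, True
    return cpin
```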
 As an example, consider the control flow graph shown in
Figure 8.3.
Figure 8.3:
Another example of copy propagation

 The initial values of COPY and KILL for the blocks entry, B1-B6,
and exit are determined by the code in Figure 8.3. (Table of COPY and
KILL sets not reproduced here.)
 Performing the analysis, we get the resulting CPin set for each of the
blocks entry, B1-B6, and exit. (Table of CPin values not reproduced here.)
 Given the data flow information CPin() and assuming that we have
already done local copy propagation, we perform global copy propagation as
follows:
 1.
 For each basic block B, set ACP = { <u, v> | <u, v, blk, pos> ∈ CPin(B) }.
 2.
 For each basic block B, perform the local copy propagation
algorithm on block B (omitting the initialization ACP = ∅).
 Doing local and global copy propagation in Figure 8.3 we
get Figure 8.4.
Figure 8.4:
Flow graph after copy propagation

 Constant propagation is a transformation that, given an
assignment x <- c for a variable x and a constant c,
replaces later uses of x with c as long as intervening
assignments have not changed the value of x.
 Consider the control flow graph shown in Figure 8.5(a).
After constant propagation is performed we get the control flow graph shown
in Figure 8.5(b).
Figure 8.5:
Constant propagation example

 Note that all occurrences of b have been replaced by 3, but
neither of the resulting constant-valued expressions has been evaluated.
That is done by constant expression evaluation.
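A sketch of the straight-line case, on a toy three-address IR of (dest, op, args...) tuples (illustrative only). Note that it deliberately leaves the resulting constant operands unevaluated, matching the division of labor described above:

```python
def const_prop(block):
    env = {}                    # variable -> known constant value
    out = []
    for dest, op, *args in block:
        args = [env.get(a, a) for a in args]   # substitute known constants
        if op == '=' and isinstance(args[0], int):
            env[dest] = args[0]                # x <- c: remember c
        else:
            env.pop(dest, None)                # dest no longer a known constant
        out.append((dest, op, *args))
    return out
```

Constant expression evaluation (folding) would then run over the result and collapse the now-constant operations.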
 Advantages of constant propagation:
 Constant propagation is particularly important for RISC
architectures because it moves small integer constants to the places where
they are used.
 Since all RISCs provide instructions that take a small
integer constant as an operand, knowing that an operand is such a constant
allows more efficient code to be generated.
 Some RISCs have an addressing mode that uses the sum of a
register and a small constant but not one that uses the sum of two
registers; propagating a small constant value to such an address
construction saves both registers and instructions.
 Constant propagation reduces the number of registers needed by a
procedure and increases the effectiveness of several other optimizations,
such as constant expression evaluation, induction variable optimizations,
and the dependence analysis based transformations.
 We are not going to look at the specific algorithm used for this
transformation.
Steve Allan
2000-02-11