Phases of typical compiler and position of code generation.
<fig: 9.1 - page 513>
Since code generation is an "undecidable problem (mathematically speaking), we must be content with heuristic technique that generate "good" code (not necessarily optimal code).
Code generation must do following things:
Code generator concern with:
1. Memory Management
Mapping names in the source program to address of data object is cooperating
done in pass 1 (Front end) and pass 2 (code generator).
Quadruples → address Instruction.
Local variables (local to functions or procedures ) are stack-allocated in the activation record while global variables are in a static area.
2. Instruction Selection
The nature of instruction set of the target machine determines selection.
-"Easy" if instruction set is regular that is uniform and
complete.
Uniform: all triple addresses
all stack single addresses.
Complete: use all register for any operation.
If we don't care about efficiency of target program, instruction selection is straight forward.
For example, the address code is:
a := b + c
d := a + e
Inefficient assembly code is:
Here the fourth statement is redundant, and so is the third statement if 'a' is not subsequently used.
3. Register Allocation
Register can be accessed faster than memory words. Frequently accessed
variables should reside in registers (register allocation). Register assignment
is picking a specific register for each such variable.
Formally, there are two steps in register allocation:
Some of the issues that complicate register allocation (problem).
1. Special use of hardware for example, some instructions require specific register.
2. Convention for Software:
For example
3. Choice of Evaluation order
Changing the order of evaluation may produce more efficient code.
This is NP-complete problem but we can bypass this hindrance by generating code
for quadruples in the order in which they have been produced by intermediate
code generator.
ADD x, Y, T1
ADD a, b, T2
is legal because X, Y and a, b are different (not dependent).
Familiarity with the target machine and its instruction set is a prerequisite for designing a good code generator.
Typical Architecture
Target machine is:
Typical Architecture:
MODE | FORM | ADDRESS | EXAMPLE | ADDED-COST |
Absolute | M | M | ADD R0, R1 | 1 |
register | R | R | ADD temp, R1 | 0 |
Index | c (R) | c + contents (R) | ADD 100(R2), R1 | 1 |
Indirect register | *R | contents (R) | ADD * R2, R1 | 0 |
Indirect Index | *c (R) | contents (c + contents (R) | ADD * 100(R2), R1 | 1 |
Literal | # c | constant c | ADD # 3, R1 | 1 |
Instruction costs:
Each instruction has a cost of 1 plus added costs for the source and destination.
=> cost of instruction = 1 + cost associated the source and destination address mode.
This cost corresponds to the length (in words ) of instruction.
Examples