Work Note 6, DM516, Spring 2012
Lecture February 16
-
Code generation.
-
Introduction to optimization.
-
Peep-hole optimization.
Background material: Appel Chapters 7, 8, 9.
In this course, we have a slightly different focus from that in
the textbook. As a consequence, we will take a simpler and more direct
approach to code generation. Thus, it is recommended that you
merely skim over the material in the book and read later
to the extent that you need the material.
Exercises February 28
-
Consider code generation templates for the following constructions:
-
for-loops (old Algol/Pascal style):
for i:=7 to 42 do SOMETHING
.
-
for-loops (C style).
-
loops with break/continue.
The loop is as a starting point infinite and starts with the keyword
loop without any condition.
Inside the body of the loop,
break exits the loop and continue starts from the beginning.
For both break and continue,
this is independent of where in the loop code you are.
-
switch (C) and case (Pascal) constructions.
-
records.
-
arrays.
-
multi-dimensionel arrays; this is different from arrays of arrays.
You must find a layout such that you can efficiently compute
the address of A[i,j,k], for instance. Thus, it must be different
from following three pointers as you would logically do if you used
A[i][j][k].
-
conditional expressions in C style: ( exp ? exp : exp ).
-
What is the semantics of
i:= 7; for i:=i+1 to i+2 do {i:=i+3; print i}
according to your template above, i.e., what is printed?
What are the reasonable behaviors?
-
What should be the behavior of the following three pieces of C-code:
-
x = 3; r = x++ * x++ * x++;
-
x = 3; r = ++x * ++x * ++x;
- y = 0; r = y++ + 2*y++ + 3*y++ + 4*y++;
Try it! You might be surprised...
-
Are there efficiency reasons to restrict switch/case expressions
to be simple types and only use values from a small domain?
-
Some programming languages offer lists with random access
(
A[i]
), append (A.append(42)
), and insert
(A.insert(pos,val)
) which inserts val
in between the positions pos
and pos+1
.
Can this be implemented efficiently?
-
Consider constructions which are typically allocated in the heap,
e.g., C structs.
How can they give rise to (enormous) vaste of space?
How is it possible in principle to check if allocated space
is still used and thereby "collect" unused space for reuse?
-
Consider possible targets for optimization other than time and space
and describe plausible goals.
-
Try to find patterns which could be used in a high-level
peep-hole optimization of for instance C-programs.
Are there any possible dangers in writing a high-level
optimizer for C-programs where simple but inefficient code
is replaced by more complicated but also more efficient code?
-
Take a look at code generated by gcc (compile with gcc -S)
for a few well-known and relatively short source files,
such as factorial, for instance.
Try to investigate the effect of the three options
-O1
, -O2
, and -O3
with regards to code optimization.
-
Assume that you have (structured) Intel Pentium code
in a linked list of structs. Thus, entries are typically op-fields,
parameter-fields, and possibly a pointer in connection with
jump
-instructions (therefore the term "structured").
Write the C-code which recognizes assembler code
of the nature "x = x + 1
" and replaces it by the
appropriate "inc x
" equivalent code.
-
Find more patterns in the categories from the lecture
or from new catagories that you come up with.
Last modified: Thu Feb 9 11:22:43 CET 2012
Kim Skak Larsen
(kslarsen@imada.sdu.dk)