Work Note 6, DM516, Spring 2012

Lecture February 16

Code generation.
Introduction to optimization.
Peep-hole optimization.

Background material: Appel Chapters 7, 8, 9. In this course, we have a slightly different focus from that in the textbook. As a consequence, we will take a simpler and more direct approach to code generation. Thus, it is recommended that you merely skim over the material in the book and read later to the extent that you need the material.

Exercises February 28

Consider code generation templates for the following constructions:
- for-loops (old Algol/Pascal style): for i:=7 to 42 do SOMETHING.
- for-loops (C style).
- loops with break/continue. The loop is as a starting point infinite and starts with the keyword loop without any condition. Inside the body of the loop, break exits the loop and continue starts from the beginning. For both break and continue, this is independent of where in the loop code you are.
- switch (C) and case (Pascal) constructions.
- records.
- arrays.
- multi-dimensionel arrays; this is different from arrays of arrays. You must find a layout such that you can efficiently compute the address of A[i,j,k], for instance. Thus, it must be different from following three pointers as you would logically do if you used A[i][j][k].
- conditional expressions in C style: ( exp ? exp : exp ).
What is the semantics of i:= 7; for i:=i+1 to i+2 do {i:=i+3; print i} according to your template above, i.e., what is printed? What are the reasonable behaviors?
What should be the behavior of the following three pieces of C-code:
- x = 3; r = x++ * x++ * x++;
- x = 3; r = ++x * ++x * ++x;
- y = 0; r = y++ + 2*y++ + 3*y++ + 4*y++;
Try it! You might be surprised...
Are there efficiency reasons to restrict switch/case expressions to be simple types and only use values from a small domain?
Some programming languages offer lists with random access (A[i]), append (A.append(42)), and insert (A.insert(pos,val)) which inserts val in between the positions pos and pos+1. Can this be implemented efficiently?
Consider constructions which are typically allocated in the heap, e.g., C structs. How can they give rise to (enormous) vaste of space? How is it possible in principle to check if allocated space is still used and thereby "collect" unused space for reuse?
Consider possible targets for optimization other than time and space and describe plausible goals.
Try to find patterns which could be used in a high-level peep-hole optimization of for instance C-programs. Are there any possible dangers in writing a high-level optimizer for C-programs where simple but inefficient code is replaced by more complicated but also more efficient code?
Take a look at code generated by gcc (compile with gcc -S) for a few well-known and relatively short source files, such as factorial, for instance. Try to investigate the effect of the three options -O1, -O2, and -O3 with regards to code optimization.
Assume that you have (structured) Intel Pentium code in a linked list of structs. Thus, entries are typically op-fields, parameter-fields, and possibly a pointer in connection with jump-instructions (therefore the term "structured"). Write the C-code which recognizes assembler code of the nature "x = x + 1" and replaces it by the appropriate "inc x" equivalent code.
Find more patterns in the categories from the lecture or from new catagories that you come up with.

Last modified: Thu Feb 9 11:22:43 CET 2012

Kim Skak Larsen (kslarsen@imada.sdu.dk)