## The Science of Code Optimization

The term “optimization” in compiler design refers to the attempts that a compiler makes to produce code that is more efficient than the obvious code. “Optimization” is thus a misnomer, since there is no way that the code produced by a compiler can be guaranteed to be as fast or faster than any other code that performs the same task.

In modern times, the optimization of code that a compiler performs has become both more important and more complex. It is more complex because processor architectures have become more complex, yielding more opportunities to improve the way code executes. It is more important because massively parallel computers require substantial optimization, or their performance suffers by orders of magnitude. With the likely prevalence of multicore machines (computers with chips that have large numbers of processors on them), all compilers will have to face the problem of taking advantage of multiprocessor machines. Thus, an extensive and useful theory has been built up around the problem of optimizing code. The use of a rigorous mathematical foundation allows us to show that an optimization is correct and that it produces the desirable effect for all possible inputs. We shall see, starting in Chapter 9, how models such as graphs, matrices, and linear programs are necessary if the compiler is to produce well-optimized code.

On the other hand, pure theory alone is insufficient. As with many real-world problems, there are no perfect answers. In fact, most of the questions that we ask in compiler optimization are undecidable. One of the most important skills in compiler design is the ability to formulate the right problem to solve. We need a good understanding of the behavior of programs to start with, and thorough experimentation and evaluation to validate our intuitions.

Compiler optimizations must meet the following design objectives:

• The optimization must be correct, that is, preserve the meaning of the compiled program,
• The optimization must improve the performance of many programs,
• The compilation time must be kept reasonable, and
• The engineering effort required must be manageable.
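
The first objective, preserving the meaning of the program, can be illustrated with a minimal sketch of one simple optimization, constant folding, over a tiny expression tree. The node types (`Num`, `Var`, `BinOp`) and the helpers `evaluate`, `fold`, and `size` are hypothetical names introduced only for this illustration. Comparing results on a handful of inputs is a spot check, not the kind of proof the theory above aims for.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str  # only '+' and '*' are supported in this sketch
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Var, BinOp]


def evaluate(e, env):
    """Reference semantics: compute the value of e given variable bindings."""
    if isinstance(e, Num):
        return e.value
    if isinstance(e, Var):
        return env[e.name]
    left, right = evaluate(e.left, env), evaluate(e.right, env)
    if e.op == "+":
        return left + right
    if e.op == "*":
        return left * right
    raise ValueError(f"unsupported operator: {e.op}")


def fold(e):
    """Constant folding: replace operator nodes whose operands are both constants."""
    if isinstance(e, BinOp):
        left, right = fold(e.left), fold(e.right)
        if isinstance(left, Num) and isinstance(right, Num):
            return Num(evaluate(BinOp(e.op, left, right), {}))
        return BinOp(e.op, left, right)
    return e


def size(e):
    """Number of nodes in the tree, used here as a crude cost measure."""
    if isinstance(e, BinOp):
        return 1 + size(e.left) + size(e.right)
    return 1


# x * (2 + 3) becomes x * 5
original = BinOp("*", Var("x"), BinOp("+", Num(2), Num(3)))
optimized = fold(original)
```

Here `optimized` has fewer nodes than `original` and evaluates to the same value for every `x` tried, which mirrors the first two objectives on a very small scale.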

## Memory Hierarchies

A memory hierarchy consists of several levels of storage with different speeds and sizes, with the level closest to the processor being the fastest but smallest. The average memory-access time of a program is reduced if most of its accesses are satisfied by the faster levels of the hierarchy. Both parallelism and the existence of a memory hierarchy improve the potential performance of a machine, but they must be harnessed effectively by the compiler to deliver real performance on an application.
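
The claim about average memory-access time can be made concrete with the standard two-level model, given here as a worked example rather than as a formula from this text. The symbols and all numeric values below are illustrative assumptions: $T_{\text{hit}}$ is the access time of the fast level, $m$ is the fraction of accesses that miss it, and $T_{\text{penalty}}$ is the extra time needed to reach the slower level.

$$
T_{\text{avg}} = T_{\text{hit}} + m \cdot T_{\text{penalty}}
$$

With an assumed $T_{\text{hit}} = 1$ cycle, $T_{\text{penalty}} = 100$ cycles, and $m = 0.05$:

$$
T_{\text{avg}} = 1 + 0.05 \cdot 100 = 6 \text{ cycles}
$$

If more accesses are satisfied by the fast level, so that $m = 0.01$:

$$
T_{\text{avg}} = 1 + 0.01 \cdot 100 = 2 \text{ cycles}
$$

Under these assumptions, cutting the miss rate from 5% to 1% makes the average access three times faster, even though neither level of the hierarchy has changed.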

Memory hierarchies are found in all machines. A processor usually has a small number of registers consisting of hundreds of bytes, several levels of caches containing kilobytes to megabytes, physical memory containing megabytes to gigabytes, and finally secondary storage that contains gigabytes and beyond. Correspondingly, the speed of accesses between adjacent levels of the hierarchy can differ by two or three orders of magnitude. The performance of a system is often limited not by the speed of the processor but by the performance of the memory subsystem. While compilers traditionally focus on optimizing the processor execution, more emphasis is now placed on making the memory hierarchy more effective.

Using registers effectively is probably the single most important problem in optimizing a program. Unlike registers that have to be managed explicitly in software, caches and physical memories are hidden from the instruction set and are managed by hardware. It has been found that cache-management policies implemented by hardware are not effective in some cases, especially in scientific code that has large data structures (arrays, typically). It is possible to improve the effectiveness of the memory hierarchy by changing the layout of the data, or changing the order of instructions accessing the data. We can also change the layout of code to improve the effectiveness of instruction caches.
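
The effect of changing the order of accesses can be sketched with a toy cache model. This is a deliberately simplified direct-mapped cache, not a description of any real processor. The function name `count_misses`, the cache geometry (8 lines of 4 elements each), and the 16 × 16 array size are all assumptions chosen to keep the example small.

```python
def count_misses(addresses, num_lines=8, line_size=4):
    """Count misses in a toy direct-mapped cache (an illustrative model only)."""
    tags = [None] * num_lines  # memory block currently held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // line_size   # memory block containing this element
        index = block % num_lines   # the one cache line this block can occupy
        if tags[index] != block:
            misses += 1
            tags[index] = block     # evict whatever was there before
    return misses


ROWS, COLS = 16, 16  # assumed row-major layout: element (i, j) lives at i * COLS + j

# Same set of elements, visited in two different orders.
row_order = [i * COLS + j for i in range(ROWS) for j in range(COLS)]
col_order = [i * COLS + j for j in range(COLS) for i in range(ROWS)]

row_misses = count_misses(row_order)
col_misses = count_misses(col_order)

print(f"row-major traversal:    {row_misses} misses out of {len(row_order)} accesses")
print(f"column-major traversal: {col_misses} misses out of {len(col_order)} accesses")
```

In this model the row-by-row traversal misses once per cache line (64 of 256 accesses), while the column-by-column traversal misses on every access, because consecutive accesses are a full row apart and keep evicting each other's lines. A loop transformation that swaps the two loops does the same work but turns the second pattern into the first.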

