This dissertation demonstrates that multiple-instruction-issue architectures, supported by
an optimizing compiler, can achieve substantial speedup over conventional
single-instruction-issue architectures. We have constructed a full-scale
C compiler that can learn the dynamic behavior of user programs by profiling, apply
the profile information to guide various code-improving techniques, and map the program
parallelism onto the parallel architecture. Our base code optimization technology
is comparable to that of today's best commercial C compilers. In addition, we have developed
aggressive code generation techniques that are tailored to multiple-instruction-issue architectures.
Using our compiler, we have characterized the performance of a large class of
multiple-instruction-issue architectures with many important application programs and
realistic input data.