Unrolling-Based Optimizations for Modulo Scheduling
   
Publication Year:
  1995
Authors:
  Daniel M. Lavery, Wen-mei Hwu
   
Published:
  Proceedings of the 28th Annual International Symposium on Microarchitecture, pp. 327-337, Dec. 1995
   
Abstract:

Modulo scheduling is a method for overlapping successive iterations of a loop in order to find sufficient instruction-level parallelism to fully utilize high-issue-rate processors. The achieved throughput of a modulo scheduled loop depends on the resource requirements, the dependence pattern, and the register requirements of the computation in the loop body. Traditionally, unrolling followed by acyclic scheduling of the unrolled body has been an alternative to modulo scheduling. However, there are benefits to unrolling even if the loop is to be modulo scheduled. Unrolling can improve the throughput by allowing a smaller, non-integral effective initiation interval to be achieved. After unrolling, optimizations can be applied to the loop that reduce both the resource requirements and the height of the critical paths. Together, unrolling and unrolling-based optimizations can enable the completion of multiple iterations per cycle in some cases. This paper describes the benefits of unrolling and a set of optimizations for unrolled loops which have been implemented in the IMPACT compiler. The performance benefits of unrolling for five of the SPEC CFP92 programs are reported.
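
The point about a non-integral effective initiation interval can be illustrated with a small arithmetic sketch. It is not taken from the paper: it assumes an idealized model in which the loop has a fractional lower bound on cycles per iteration (for example, 3 memory operations competing for 2 memory ports gives 3/2), the scheduler must pick an integer initiation interval for whatever body it is given, and the bound scales linearly with the unroll factor. The function name `effective_ii` and the sample bounds are hypothetical.

```python
from fractions import Fraction
from math import ceil


def effective_ii(min_ii_bound: Fraction, unroll: int) -> Fraction:
    """Cycles per ORIGINAL iteration after unrolling `unroll` times.

    Assumption (idealized): the unrolled body needs unroll * min_ii_bound
    cycles, and the scheduler achieves the smallest integer II that
    satisfies this bound.
    """
    # The initiation interval of the unrolled loop must be a whole number.
    unrolled_ii = ceil(min_ii_bound * unroll)
    # Each pass through the unrolled body completes `unroll` iterations.
    return Fraction(unrolled_ii, unroll)


# Hypothetical bound: 3 memory operations per iteration, 2 memory ports.
bound = Fraction(3, 2)
for u in (1, 2, 3, 4):
    print(f"unroll={u}: effective II = {effective_ii(bound, u)}")

# Hypothetical bound below one cycle: 2 operations on 4 identical units.
print(effective_ii(Fraction(1, 2), 2))
```

Under these assumptions, the loop without unrolling is rounded up to 2 cycles per iteration, while unrolling twice reaches the fractional bound of 3/2. In the second case, unrolling twice gives an effective interval of 1/2, that is, two iterations completing per cycle, which matches the abstract's remark about multiple iterations per cycle. An unroll factor that does not divide the bound evenly (3 in the first case) still leaves some rounding loss.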