Enhancing Loop Buffering of Media and Telecommunications Applications Using Low-overhead Predication.
Publication Year: 2001
  John W. Sias, Matthew C. Merten, Erik M. Nystrom, Ronald D. Barnes, Christopher J. Shannon, Joseph D. Matarazzo, Shane Ryoo, Jeff V. Olivier, Wen-mei Hwu
  Proceedings of the 34th International Symposium on Microarchitecture, December, 2001

Loops containing control flow are problematic for VLIWs relying on loop buffers. Full predication increases code size, while partial predication does not support general if-conversion. Here a compromise approach is proposed and evaluated using media applications. Compiler techniques are demonstrated which arrange for 70-99% of fetched operations to come from a statically managed 256-instruction loop buffer, allowing instruction fetch power savings and eliminating branch penalties. Also introduced is a form of predication specialized to permit if-conversion with one bit in each operation and to eliminate much of the hardware overhead of a predicate register-based approach.
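
For readers unfamiliar with if-conversion, the following is a minimal source-level sketch of the general idea, not the one-bit predication scheme the paper introduces. It assumes a hypothetical saturating loop of the kind common in media code; the function names and the predicate variables `p_hi` and `p_lo` are illustrative only. The first version has control flow inside the loop body; the second computes predicates and guards each assignment with one, so the body becomes straight-line code of the kind that could sit in a loop buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Branching form: the loop body contains control flow. */
void saturate_branchy(int16_t *out, const int32_t *in, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int32_t v = in[i];
        if (v > INT16_MAX)
            v = INT16_MAX;
        else if (v < INT16_MIN)
            v = INT16_MIN;
        out[i] = (int16_t)v;
    }
}

/* If-converted form (illustrative): both predicates are computed from the
 * original value, and each assignment is guarded by a predicate rather
 * than reached through a branch. */
void saturate_ifconverted(int16_t *out, const int32_t *in, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int32_t v = in[i];
        int p_hi = v > INT16_MAX;   /* predicate: value overflows */
        int p_lo = v < INT16_MIN;   /* predicate: value underflows */
        v = p_hi ? INT16_MAX : v;   /* takes effect only when p_hi is set */
        v = p_lo ? INT16_MIN : v;   /* takes effect only when p_lo is set */
        out[i] = (int16_t)v;
    }
}
```

Both functions produce the same output for any input. Whether a C compiler actually emits branch-free code for the second form depends on the target and optimization settings; the sketch only shows the transformation at the source level.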