Data Relocation and Prefetching for Large Data Sets.
Publication Year: 1994
  Yoji Yamada, John C. Gyllenhaal, Grant E. Haab, Wen-mei Hwu
  Proceedings of the 27th Annual ACM/IEEE International Symposium on Microarchitecture, pp. 118-127, December, 1994

Numerical applications frequently contain nested loop structures that process large arrays of data. The execution of these loop structures often produces memory reference patterns that poorly utilize data caches. Limited associativity and cache capacity result in cache conflict misses. Also, non-unit stride access patterns can cause low utilization of cache lines. Data copying has been proposed and investigated in order to reduce the cache conflict misses, but this technique has a high execution overhead since it does the copy operations entirely in software.
We propose a combined hardware and software technique called data relocation and prefetching, which eliminates much of the overhead of data copying through the use of special hardware. Furthermore, by relocating the data while performing software prefetching, the overhead of copying the data can be reduced further. Experimental results for data relocation and prefetching are encouraging and show a large improvement in cache performance.
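
For readers unfamiliar with the software-only data copying that the abstract takes as its baseline, the following is a minimal sketch of the idea, not the authors' implementation. It gathers a tile of a large row-major array into a small contiguous buffer so that later accesses use a short stride instead of the full row length. The function names, the flat-list layout, and the `ld` (leading dimension) parameter are assumptions made for illustration. Python is used only to show the index mapping; the cache effects themselves would be observable only in a compiled language, and the hardware-assisted relocation proposed in the paper is not modeled here.

```python
def copy_tile(matrix, ld, row0, col0, rows, cols):
    """Copy a rows x cols tile of a flat row-major matrix into a contiguous buffer.

    `ld` is the number of elements per row of the full matrix (hypothetical name).
    This copy loop is the software overhead the abstract refers to.
    """
    buf = [0.0] * (rows * cols)
    for r in range(rows):
        src = (row0 + r) * ld + col0          # start of this tile row in the big array
        buf[r * cols:(r + 1) * cols] = matrix[src:src + cols]
    return buf


def column_sums_direct(matrix, ld, row0, col0, rows, cols):
    """Sum each tile column in place: consecutive accesses are `ld` elements apart."""
    sums = [0.0] * cols
    for c in range(cols):
        for r in range(rows):
            sums[c] += matrix[(row0 + r) * ld + col0 + c]
    return sums


def column_sums_copied(matrix, ld, row0, col0, rows, cols):
    """Same computation on the relocated tile: consecutive accesses are `cols` apart."""
    buf = copy_tile(matrix, ld, row0, col0, rows, cols)
    sums = [0.0] * cols
    for c in range(cols):
        for r in range(rows):
            sums[c] += buf[r * cols + c]
    return sums
```

Both column-sum functions return the same values; the difference is only in the addresses touched. In the copied version the tile occupies one dense region, which in a real cache would avoid conflicts between rows that map to the same sets, at the price of the explicit copy loop.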