The growing disparity between processor and memory performance has made
cache misses increasingly expensive. Additionally, data and instruction
caches are not always used efficiently, resulting in large numbers of cache
misses. Therefore, the importance of cache performance improvements at each
level of the memory hierarchy will continue to grow. For numeric programs,
several compiler techniques for optimizing data cache performance are
known. Integer (non-numeric) programs, however, often exhibit irregular
access patterns that are more difficult for a compiler to optimize.
In the past, cache management techniques such as cache bypassing were
implemented manually at the machine-language level.
As the available chip area grows, it makes sense to spend more resources
to allow intelligent control over cache management.
In this paper we present an approach to improving cache effectiveness
that exploits the growing chip area through run-time adaptive cache
management techniques, optimizing both performance and implementation
cost. Specifically, we aim to increase
data cache effectiveness for integer programs. We propose
a microarchitecture scheme where the hardware determines data
placement within the cache hierarchy based on dynamic referencing behavior.
This scheme is fully compatible with existing Instruction Set Architectures.
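To make the idea of hardware-driven placement concrete, the following is a minimal, hypothetical sketch of one way dynamic referencing behavior could steer bypass decisions: a small table, indexed by the PC of the missing load, holds a saturating counter that tracks whether the blocks that load fetches are reused before eviction. All names, the table organization, and the counter policy are illustrative assumptions, not the design evaluated in this paper.

```python
# Illustrative sketch (not the paper's mechanism): a PC-indexed table of
# saturating counters decides, on a miss, whether the fetched block should
# be allocated in the cache or should bypass it.

class BypassPredictor:
    def __init__(self, bits=2, threshold=0):
        self.max = (1 << bits) - 1     # saturating-counter ceiling
        self.threshold = threshold     # counter <= threshold => bypass
        self.table = {}                # load PC -> reuse counter

    def should_bypass(self, pc):
        # Unseen loads default to the maximum counter, i.e. normal caching.
        return self.table.get(pc, self.max) <= self.threshold

    def on_reuse(self, pc):
        # A block fetched by `pc` was reused: bias toward caching.
        c = self.table.get(pc, self.max)
        self.table[pc] = min(c + 1, self.max)

    def on_dead_eviction(self, pc):
        # A block fetched by `pc` was evicted without reuse: bias toward bypass.
        c = self.table.get(pc, self.max)
        self.table[pc] = max(c - 1, 0)
```

Because the predictor observes only addresses and reuse events already visible to the cache controller, a mechanism of this general shape needs no new instructions, which is consistent with the ISA-compatibility claim above.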
This paper examines the theoretical upper bounds on the cache hit
ratio that cache bypassing can provide for integer applications,
including several Windows applications with OS activity.
Then, detailed trace-driven simulations of the integer applications
are used to show that the implementation described in this paper
can achieve performance close to that of the upper bound.
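The benefit that bypassing bounds from above can be seen even in a toy trace-driven simulation. The sketch below, with an invented trace and an oracle-style bypass predicate (both assumptions for illustration, not the paper's simulator or workloads), shows a hot block being repeatedly displaced by a conflicting single-use stream; bypassing the stream preserves the hot block and raises the hit ratio.

```python
# Toy trace-driven simulation: a direct-mapped cache with and without
# bypassing of single-use ("streaming") blocks. Illustrative only.

def simulate(trace, num_sets, bypass=None):
    """Return the hit ratio of a direct-mapped cache of one-word lines.
    Blocks flagged by `bypass` are served without being allocated."""
    bypass = bypass or (lambda addr: False)
    cache = [None] * num_sets
    hits = 0
    for addr in trace:
        s = addr % num_sets
        if cache[s] == addr:
            hits += 1
        elif not bypass(addr):
            cache[s] = addr        # normal allocation on miss
        # bypassed blocks never displace resident data
    return hits / len(trace)

# A hot block (address 0) interleaved with a conflicting one-use stream.
hot, stream = 0, [4 * (i + 1) for i in range(16)]
trace = [x for s in stream for x in (hot, s)]

base = simulate(trace, num_sets=4)
improved = simulate(trace, num_sets=4, bypass=lambda a: a != hot)
# Bypassing the stream raises the hit ratio from 0.0 to about 0.47 here.
```

An oracle predicate like this one corresponds to the theoretical upper bound; a realizable implementation must approximate it from dynamic referencing behavior, which is why its achievable hit ratio can only approach, not exceed, the bound.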