Improvements in main memory speeds have not kept pace with increasing 
processor clock frequency and improved exploitation of instruction-level
parallelism. Consequently, the gap between processor and main memory 
performance is expected to grow, increasing the number of execution 
cycles spent waiting for memory accesses to complete. One solution to 
this growing problem is to reduce the number of cache misses by 
increasing the effectiveness of the cache hierarchy. In this paper we 
present a technique for dynamic analysis of program data access 
behavior, which is then used to proactively guide the placement of data 
within the cache hierarchy in a location-sensitive manner. We introduce 
the concept of a macroblock, which makes it feasible to characterize 
the memory locations accessed by a program, and a Memory Address Table, 
which performs the dynamic reference analysis. Our technique is fully 
compatible with existing Instruction Set Architectures. Results from 
detailed simulations of several integer programs show significant 
speedups.
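
To make the two concepts above concrete, the following is a minimal sketch of how a macroblock mapping and a Memory Address Table might fit together. It is an illustration under stated assumptions, not the design evaluated in this paper: the macroblock size (1 KiB), the direct-mapped table organization, the 8-bit saturating counters, and the names `macroblock_of`, `record_access`, and `count_for` are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field

# Assumed parameters (not taken from the paper).
MACROBLOCK_BITS = 10   # 2**10 bytes = 1 KiB per macroblock
COUNTER_MAX = 255      # 8-bit saturating access counter


def macroblock_of(address: int) -> int:
    """Map a byte address to its macroblock index by dropping the low bits."""
    return address >> MACROBLOCK_BITS


@dataclass
class MemoryAddressTable:
    """Toy direct-mapped table of per-macroblock access counters.

    The layout is hypothetical: one tag and one counter per entry.
    """
    num_entries: int = 1024
    tags: list = field(default_factory=list)
    counters: list = field(default_factory=list)

    def __post_init__(self):
        self.tags = [None] * self.num_entries
        self.counters = [0] * self.num_entries

    def record_access(self, address: int) -> int:
        """Count one reference to the macroblock containing `address`."""
        mb = macroblock_of(address)
        idx = mb % self.num_entries
        if self.tags[idx] != mb:
            # A different macroblock owned this entry; restart its count.
            self.tags[idx] = mb
            self.counters[idx] = 0
        self.counters[idx] = min(self.counters[idx] + 1, COUNTER_MAX)
        return self.counters[idx]

    def count_for(self, address: int) -> int:
        """Return the recorded count for the macroblock, or 0 if not tracked."""
        mb = macroblock_of(address)
        idx = mb % self.num_entries
        return self.counters[idx] if self.tags[idx] == mb else 0


if __name__ == "__main__":
    mat = MemoryAddressTable(num_entries=4)
    for addr in (0x0000, 0x0004, 0x03FF, 0x0400):
        mat.record_access(addr)
    print(mat.count_for(0x0000), mat.count_for(0x0400))
```

Tracking references per macroblock rather than per cache block keeps the table small, which is the sense in which such a grouping can make characterization feasible. A placement policy could then compare the counts of two conflicting macroblocks when deciding where to keep data; that policy is not sketched here.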