Motivation

It has been reported that the instruction cache consumes a significant fraction of total processor power; for example, 27% of the processor power in the StrongARM SA-110 is dissipated in the L1 instruction cache. Cache partitioning is commonly used to reduce the dynamic power dissipation of caches, since a smaller cache has a lower load capacitance. Block buffering buffers the last accessed cache line; if data in that same line is requested next, only the buffer needs to be accessed. The filter cache extends this idea with a larger buffer (an L0 cache) that holds recently accessed cache blocks. On each access, the L0 cache is probed first; the L1 cache is accessed only when an L0 miss occurs. This approach achieves greater energy reduction than block buffering; however, it can cause significant performance degradation, as high as 20%, when an application's working set cannot be captured in the small L0 cache.
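The sequential L0-then-L1 probing described above can be sketched in a few lines. This is a minimal illustration, not the actual hardware: it assumes direct-mapped caches, line-granularity addresses, and a simple cycle counter for the L0-miss penalty; all class and variable names are hypothetical.

```python
class FilterCache:
    """Toy model of a filter cache: probe the tiny L0 first, then L1 on a miss."""

    def __init__(self, l0_lines=8, l1_lines=256):
        self.l0 = [None] * l0_lines   # small L0 (filter) cache
        self.l1 = [None] * l1_lines   # larger L1 cache
        self.extra_cycles = 0         # penalty cycles accumulated on L0 misses

    def access(self, line_addr):
        l0_idx = line_addr % len(self.l0)
        if self.l0[l0_idx] == line_addr:
            return "L0 hit"           # cheap access, no penalty
        # L0 miss: pay an extra cycle, then probe the L1
        self.extra_cycles += 1
        l1_idx = line_addr % len(self.l1)
        result = "L1 hit" if self.l1[l1_idx] == line_addr else "L1 miss"
        self.l1[l1_idx] = line_addr   # fill L1 on a miss
        self.l0[l0_idx] = line_addr   # fill L0 so the next access hits there
        return result

fc = FilterCache()
print(fc.access(42))   # cold access goes all the way to memory: "L1 miss"
print(fc.access(42))   # the line is now resident in L0: "L0 hit"
```

The `extra_cycles` counter makes the performance concern concrete: every L0 miss adds latency on top of the normal L1 access, which is where the up-to-20% degradation comes from when the working set overflows the L0.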

The HotSpot cache is a novel instruction cache architecture in which a dynamic steering mechanism directs each request to either the L0 cache or the L1 cache (unlike a filter cache, where the L0 and L1 caches are accessed sequentially). The design goal is to achieve energy savings comparable to the filter cache without sacrificing performance.
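The key difference from the filter cache is that a request goes to exactly one cache, so cold code never pays the sequential L0-probe penalty. The hotness detector in the real design is not described in this section; the sketch below stands in a hypothetical access-count predicate purely to illustrate the steering decision, with an assumed threshold.

```python
from collections import Counter

HOT_THRESHOLD = 4          # assumed threshold: lines seen this often are "hot"
access_counts = Counter()  # stand-in for the hardware hotness detector

def steer(line_addr):
    """Direct a fetch to the L0 or the L1 cache, never both in sequence."""
    access_counts[line_addr] += 1
    if access_counts[line_addr] >= HOT_THRESHOLD:
        return "L0"        # hot code: serve from the small, low-energy L0
    return "L1"            # cold code: go straight to L1, avoiding the
                           # filter cache's L0-miss penalty

# A tight loop over the same line quickly becomes hot and is steered to L0:
for _ in range(5):
    target = steer(0x100)
print(target)              # "L0" once the line crosses the threshold
```

Because cold fetches are steered directly to the L1, only the frequently executed (hot) code pays for, and benefits from, the small cache, which is how this design aims to keep the filter cache's energy savings without its slowdown.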

An in-depth tutorial on the HotSpot cache is available here: Tutorial

An analysis of different configurations of the HotSpot cache architecture, based on the simulations performed, is available here: Analysis