Experimental approach

  1. For a fixed L1 cache size (64 kB) and L2 cache size (512 kB), the following parameters were varied: block size, replacement policy and associativity. Three SPEC2000 benchmarks were used for experimentation: bzip00, gzip00 and art00*. These PISA binaries (WATTCH can only run binaries in PISA format) were obtained from the SimpleScalar website: http://www.eecs.umich.edu/mirv/benchmarks/benchmarks.html
  2. The power values for the different functional units and the number of simulation cycles were obtained by executing the binaries using WATTCH (in 180 nm technology). HOTSPOT is an architectural-level thermal simulator that takes a power-trace file and a floorplan file as input and outputs the steady-state temperature values for the different functional units (block-level granularity). We used the default Alpha EV6 floorplan and the power traces obtained from WATTCH (for each of the benchmarks) to run HOTSPOT. The corresponding temperature values were stored in a database along with the energy and delay values obtained previously.
  3. The following modules of WATTCH were modified to implement a Filter Cache: power.c, sim-outorder.c and cache.c. The Filter Cache has a size of 128 or 256 bytes and a block size varying from 8 bytes to 32 bytes, and it is either direct-mapped or fully associative. The power values of the Instruction Cache and the total number of simulation cycles were obtained for each of the three benchmarks for all possible configurations of the Filter Cache, and a corresponding database was formed. As in the case of the traditional 2-level cache, the power values obtained from WATTCH were used to generate the power-trace file. The EV6 floorplan was modified to account for the area added by the Filter Cache. HOTSPOT was then run, and temperature values of the I-Cache were obtained for each of the benchmarks (for the different configurations of the Filter Cache).
  4. A web-based tool was developed to compare the energy and temperature of the three cache configurations: traditional 2-level, Filter and HotSpot.
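
The Filter Cache sweep described in step 3 can be enumerated with a short script. This is only a minimal sketch and not the code used in the study: it assumes the block size steps through powers of two (8, 16 and 32 bytes), which the text does not state explicitly, and the function and field names are hypothetical.

```python
from itertools import product

# Values taken from the text: total Filter Cache size and mapping options.
SIZES_BYTES = (128, 256)
MAPPINGS = ("direct-mapped", "fully-associative")
# Assumption: block sizes step in powers of two between the stated 8 and 32 bytes.
BLOCK_SIZES_BYTES = (8, 16, 32)


def filter_cache_configs():
    """Yield one dict per Filter Cache configuration in the sweep."""
    for size, block, mapping in product(SIZES_BYTES, BLOCK_SIZES_BYTES, MAPPINGS):
        blocks = size // block  # number of cache blocks that fit in the cache
        if mapping == "direct-mapped":
            sets, ways = blocks, 1  # one block per set
        else:
            sets, ways = 1, blocks  # a single set holding every block
        yield {"size": size, "block": block, "mapping": mapping,
               "sets": sets, "ways": ways}


configs = list(filter_cache_configs())
print(len(configs), "configurations per benchmark")
for cfg in configs:
    print(cfg)
```

Under the power-of-two assumption, the sweep covers 12 Filter Cache configurations per benchmark, each needing its own WATTCH and HOTSPOT run.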

    * SPEC2000 benchmarks such as gcc00, vpr00 and vortex00 require more than one hour for a single performance simulation using SimpleScalar and hence were not used for this analysis (given the wide variety of configurations that had to be tested).
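
The hand-off from WATTCH to HOTSPOT in steps 2 and 3 goes through a power-trace file. A minimal sketch of writing such a file is given below, assuming the usual HOTSPOT layout: a header line of unit names, which must match the names in the floorplan file, followed by one line of per-unit power values in watts per sample. The helper name, the unit names and the power numbers are hypothetical placeholders, not values from our runs.

```python
import io


def write_ptrace(stream, samples):
    """Write per-unit power samples (in watts) as a tab-separated power trace.

    samples: list of dicts mapping unit name -> power; all dicts share the same keys.
    """
    units = list(samples[0])  # column order follows the first sample
    stream.write("\t".join(units) + "\n")  # header: one name per functional unit
    for sample in samples:
        stream.write("\t".join(f"{sample[u]:.4f}" for u in units) + "\n")


# Hypothetical unit names and power values, for illustration only.
samples = [{"Icache": 1.25, "Dcache": 2.5}]
buf = io.StringIO()
write_ptrace(buf, samples)
print(buf.getvalue(), end="")
```

Writing to an in-memory buffer keeps the sketch self-contained; in practice the stream would be a file opened for writing and passed to HOTSPOT together with the floorplan.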

    Benchmark -->      L0 Cache Size -->