This is a basic Cache Tutorial

Introduction

The goal of an effective memory system is for the access time seen by the processor to be very close to t_o, the access time of the cache. This requires that most of the accesses the processor makes be satisfied within the cache itself. Achieving this goal depends on many factors: the architecture of the processor, the behavioral properties of the programs being executed, and the size and organization of the cache. Caches work on the basis of the locality of program behavior. There are three principles involved:

Spatial Locality - Given an access to a particular location in memory, there is a high probability that other accesses will be made to either that or neighboring locations within the lifetime of the program.
Temporal Locality - This is complementary to spatial locality. Given a sequence of references to n locations, there is a high probability that references following this sequence will be made to locations in the sequence. That is, elements of the sequence will be referenced again during the lifetime of the program.
Sequentiality - Given that a reference has been made to a particular location s, it is likely that a reference to location s + 1 will be made within the next several references. Sequentiality is a restricted type of spatial locality and can be regarded as a subset of it.
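
As a rough illustration of these principles, the sketch below measures two simple properties of an address trace. The function names and the trace are hypothetical and chosen only for this example. The sequential measure is deliberately stricter than the definition above: it counts only a reference to s + 1 that immediately follows a reference to s, not one that occurs within the next several references.

```python
def sequential_fraction(trace):
    """Fraction of references that target the address right after the previous one."""
    if len(trace) < 2:
        return 0.0
    sequential = sum(1 for prev, cur in zip(trace, trace[1:]) if cur == prev + 1)
    return sequential / (len(trace) - 1)


def reuse_fraction(trace):
    """Fraction of references to an address already seen earlier in the trace."""
    if not trace:
        return 0.0
    seen = set()
    reused = 0
    for address in trace:
        if address in seen:
            reused += 1
        seen.add(address)
    return reused / len(trace)


# Hypothetical trace: a loop that walks a 4-word array three times.
trace = [100, 101, 102, 103] * 3
print(sequential_fraction(trace))  # 9 of the 11 consecutive pairs are sequential
print(reuse_fraction(trace))       # 8 of the 12 references revisit an address
```

A trace like this one, with high values for both measures, is the kind of behavior a cache is designed to exploit.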

Some common terms

Processor references that are found in the cache are called cache hits. References not found in the cache are called cache misses. On a cache miss, the cache control mechanism must fetch the missing data from memory and place it in the cache. Usually the cache fetches a group of neighboring words, called the line, from memory. The physical word is the basic unit of access in the memory.
The processor-cache interface can be characterized by a number of parameters. Those that directly affect processor performance include:

Access time for a reference found in the cache (a hit) - a property of the cache size and organization.
Access time for a reference not found in the cache (a miss) - a property of the memory organization.
Time to initially compute a real address given a virtual address (not-in-TLB time) - a property of the address translation facility, which, though strictly speaking not part of the cache, resembles the cache in most aspects and is discussed in this chapter.
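
The first two parameters can be combined into an estimate of the access time the processor sees. The sketch below uses the common textbook model (hit time plus miss rate times miss penalty); this formula is not stated in the tutorial, and the function name, parameter names, and numbers are assumptions made for illustration. Address translation time is left out of the model.

```python
def effective_access_time(hit_time, miss_rate, miss_penalty):
    """Average access time under a simple model (an assumption, not the tutorial's formula).

    hit_time     -- access time for a reference found in the cache
    miss_rate    -- fraction of references not found in the cache, between 0 and 1
    miss_penalty -- extra time needed to fetch the line from memory on a miss
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


# Hypothetical numbers: 1-cycle hit, 5% miss rate, 20-cycle miss penalty.
print(effective_access_time(hit_time=1, miss_rate=0.05, miss_penalty=20))
```

With a low miss rate the result stays close to the hit time, which is the goal stated in the introduction.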

Cache Organization

Within the cache, there are three basic types of organization:

Direct Mapped
Fully Associative
Set Associative

In fully associative mapping, when a request is made to the cache, the requested address is compared against all entries in a directory. If the requested address is found (a directory hit), the corresponding location in the cache is fetched and returned to the processor; otherwise, a miss occurs.


Fully Associative Cache
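
A minimal sketch of the fully associative lookup described above. The class name and parameters are hypothetical. The first-in, first-out replacement of the oldest line when the cache is full is an assumption made to keep the example short; the text does not specify a replacement policy at this point.

```python
class FullyAssociativeCache:
    """Toy model: any line may be held in any directory entry."""

    def __init__(self, num_lines, line_size):
        self.num_lines = num_lines
        self.line_size = line_size
        self.directory = []  # line addresses currently held, oldest first

    def access(self, address):
        """Return True on a directory hit, False on a miss."""
        line_address = address // self.line_size
        # Conceptually, the address is compared against all directory entries.
        if line_address in self.directory:
            return True
        # Miss: fetch the line, evicting the oldest one if full (FIFO, an assumption).
        if len(self.directory) == self.num_lines:
            self.directory.pop(0)
        self.directory.append(line_address)
        return False


cache = FullyAssociativeCache(num_lines=2, line_size=16)
print(cache.access(0))   # False: first reference to line 0
print(cache.access(4))   # True: address 4 lies in the same 16-byte line
```

Real hardware performs all the comparisons in parallel; the list search here only models the outcome.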

In a direct mapped cache, the lower order line address bits are used to access the directory. Since multiple line addresses map into the same location in the cache directory, the upper line address bits (tag bits) must be compared with the tag stored in the directory to ensure a hit. If the comparison fails, the result is a cache miss, or simply a miss. The address given to the cache by the processor is actually subdivided into several pieces, each of which has a different role in accessing data.

Direct Mapped Cache
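
The subdivision of the address and the tag comparison can be sketched as follows. The names split_address and DirectMappedCache are hypothetical, and the field names (tag, index, offset) follow common usage rather than the tutorial. The sketch uses integer division and remainder, which is equivalent to selecting bit fields when the line size and the number of lines are powers of two.

```python
def split_address(address, line_size, num_lines):
    """Split an address into (tag, index, offset) for a direct mapped cache."""
    offset = address % line_size        # position of the word or byte within the line
    line_address = address // line_size
    index = line_address % num_lines    # lower order line address bits select the entry
    tag = line_address // num_lines     # upper line address bits are compared on lookup
    return tag, index, offset


class DirectMappedCache:
    """Toy model: each line address maps to exactly one directory entry."""

    def __init__(self, num_lines, line_size):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines  # None marks an empty entry

    def access(self, address):
        """Return True on a hit, False on a miss (the line is then loaded)."""
        tag, index, _ = split_address(address, self.line_size, self.num_lines)
        if self.tags[index] == tag:
            return True
        self.tags[index] = tag
        return False


print(split_address(0x1234, line_size=16, num_lines=64))  # (4, 35, 4)
```

Two addresses with the same index but different tags evict each other, which is the cost of the simple lookup.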

The set associative cache operates in a fashion somewhat similar to the direct mapped cache. Bits from the line address are used to address a cache directory. However, now there are multiple choices: two, four, or more complete line addresses may be present in the directory. Each of these line addresses corresponds to a location in a sub-cache. The collection of these sub-caches forms the total cache array. In a set associative cache, as in the direct mapped cache, all of these sub-arrays can be accessed simultaneously, together with the cache directory. If any of the entries in the cache directory matches the reference address, there is a hit, and the data from the corresponding sub-cache array is selected and gated out to the processor.

Set Associative Cache
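
A minimal sketch of a set associative lookup. The class name and parameters are hypothetical, and the least recently used (LRU) replacement within each set is an assumption made for illustration. Each set is modeled as an ordered dictionary whose entries stand for the sub-cache locations that the index selects.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy model: the index selects a set, and a line may occupy any way of that set."""

    def __init__(self, num_sets, ways, line_size):
        self.num_sets = num_sets
        self.ways = ways
        self.line_size = line_size
        # One ordered dict of tags per set, least recently used first.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, address):
        """Return True on a hit, False on a miss (the line is then loaded)."""
        line_address = address // self.line_size
        index = line_address % self.num_sets
        tag = line_address // self.num_sets
        entries = self.sets[index]
        if tag in entries:
            entries.move_to_end(tag)     # mark as most recently used
            return True
        if len(entries) == self.ways:
            entries.popitem(last=False)  # evict the least recently used tag
        entries[tag] = True
        return False


cache = SetAssociativeCache(num_sets=2, ways=2, line_size=16)
cache.access(0)          # miss: loads tag 0 into set 0
cache.access(32)         # miss: loads tag 1 into set 0, alongside tag 0
print(cache.access(0))   # True: both lines fit in the same set
```

In a direct mapped cache with the same number of sets, addresses 0 and 32 would map to the same entry and evict each other.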

A number of tools have been included as part of this web-based Cache tutorial. They are:

A tool to help the user visualize the cache structure. The user can input a number of system parameters (main memory size, cache memory size, block size, etc.) and view the address bit pattern and its partitioning.
A tool to help the user determine how blocks (or data) from main memory are placed in the cache. This is also a graphics-based interactive tool, in which the user can input any address pattern and view how it is partitioned and which data is accessed.
A tool to help the user visualize how data is moved to and from the cache. Replacement policies (LRU, FIFO, etc.) are reviewed, and a graphical tool further helps the user understand the operation of cache access.
A tool to help the user compute the average access time, depending on the read or write policies in effect.
