What is Cache Memory in Computer Architecture?

The data or contents of the main memory that are used generally by the CPU are saved in the cache memory so that the processor can simply create that information in a shorter time. Whenever the CPU requires to create memory, it first tests the cache memory. If the data is not established in cache memory, so the CPU transfers into the main memory.

Cache memory is located between the CPU and the main memory. The block diagram for a cache memory can be represented as −

The concept of reducing the size of memory can be optimized by placing an even smaller SRAM between the cache and the processor, thereby creating two levels of cache. This new cache is usually contained inside the processor. As the new cache is put inside the processor, the wires connecting the two become very short, and the interface circuitry becomes more closely integrated with that of the processor.

These two conditions together with the smaller decoder circuit facilitate faster data access. When two caches are present, the cache within the processor is referred to as a level 1 or L1 cache. The cache between the L1 cache and memory is referred to as a level 2 or L2 cache.

The figure shows the placement of L1 and L2 cache in memory.

The split cache, another cache organization, is shown in the figure. Split cache requires two caches. In this case, a processor uses one cache to store code/instructions and a second cache to store data.

This cache organization is typically used to support an advanced type of processor architecture such as pipelining. Here, the mechanisms used by the processor to handle the code are so distinct from those used for data that it does not make sense to put both types of information into the same cache.

The success of caches depends upon the principle of locality. The principle proposes that when one data item is loaded into a cache, the items close to it in memory should be loaded too.

If a program enters a loop, most of the instructions that are part of that loop are executed multiple times. Therefore, when the first instruction of a loop is being loaded into the cache, it loads its bordering instructions simultaneously to save time. In this way, the processor does not have to wait for the main memory to provide subsequent instructions.

As a result of this, caches are organized in such a way that when one piece of data or code is loaded, the block of neighbouring items is loaded too. Each block loaded into the cache is identified with a number known as a tag.

This tag can be used to find the original addresses of the data in the main memory. Therefore, when the processor is in search of a piece of data or code (hereafter referred to as a word), it only needs to check the tags to see if the word is contained in the cache.