Shared memory systems can be designed using bus-based or switch-based interconnection networks. The simplest network for shared memory systems is the bus. The bus/cache architecture alleviates the requirement for expensive multiport memories and interface circuitry and the need to adopt a message-passing paradigm when developing application software.
The bus may get saturated if multiple processors are trying to access the shared memory (via the bus) simultaneously. A typical bus-based design uses caches to solve the bus contention problem. High-speed caches connected to each processor on one side and the bus on the other side mean that local copies of instructions and data can be supplied at the highest possible rate.
If the local processor finds all of its instructions and data in the local cache, we say the hit rate is 100%. The miss rate of a cache is the fraction of the references that cannot be satisfied by the cache, and so must be copied from the global memory, across the bus, into the cache, and then passed on to the local processor.
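As a minimal sketch of these definitions, the hit rate can be computed from counted hits and misses; the numbers below are hypothetical, chosen only for illustration:

```python
def hit_rate(hits, misses):
    # Fraction of memory references satisfied by the local cache.
    return hits / (hits + misses)

# Hypothetical reference trace: 950 references served by the cache,
# 50 that had to cross the bus to global memory.
h = hit_rate(950, 50)
print(f"hit rate = {h:.2%}, miss rate = {1 - h:.2%}")
# prints "hit rate = 95.00%, miss rate = 5.00%"
```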
One of the goals of the cache is to maintain a high hit rate, or low miss rate, under high processor loads. A high hit rate means the processors are not using the bus as much. Hit rates are determined by several factors, ranging from the application programs being run to the manner in which the cache hardware is implemented.
A processor goes through a duty cycle in which it executes instructions at a certain rate per clock cycle. Typically, individual processors execute less than one instruction per cycle, which reduces the number of times they need to access memory.
Subscalar processors execute less than one instruction per cycle, and superscalar processors execute more than one instruction per cycle. In any case, we want to minimize the number of times each local processor tries to use the central bus. Otherwise, processor speed will be limited by bus bandwidth.
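To see how execution rate translates into bus traffic, the following sketch multiplies a processor's fetch rate by its miss rate; all figures (clock speed, fetches per cycle, hit rate) are illustrative assumptions, not values from the text:

```python
# How much of one processor's memory traffic actually reaches the bus.
clock_hz = 1e9           # 1 GHz clock (illustrative)
fetches_per_cycle = 0.8  # subscalar: fewer than one fetch per cycle
hit_rate = 0.95          # 95% of references served by the local cache

fetch_rate = clock_hz * fetches_per_cycle  # fetches/second demanded
bus_traffic = fetch_rate * (1 - hit_rate)  # misses that cross the bus
print(f"{bus_traffic:.2e} fetches/second reach the bus")
```

Even with a 95% hit rate, each processor still places tens of millions of fetches per second on the shared bus, which is why the aggregate miss rate across N processors determines saturation.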
We define the variables for hit rate, number of processors, processor speed, bus speed, and processor duty cycle as follows:
N = number of processors,
h = hit rate of each cache, assumed to be the same for all caches,
(1 - h) = miss rate of all caches,
B = bandwidth of the bus, measured in cycles/second,
I = processor duty cycle, assumed to be identical for all processors, in fetches/cycle, and
V = peak processor speed, in fetches/second.
The effective bandwidth of the bus is BI fetches/second. If each processor runs at a speed of V fetches/second, then each cache generates misses at a rate of (1 - h)V. For an N-processor system, misses are generated at an aggregate rate of N(1 - h)V fetches/second.
The bus saturates when this aggregate miss traffic exceeds the effective bus bandwidth. To avoid saturation, we therefore require N(1 - h)V ≤ BI. The maximum number of processors with cache memories that the bus can support is given by the relation N ≤ BI / ((1 - h)V).
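The bound above can be evaluated directly. The sketch below is a straightforward transcription of the inequality; the example parameters (bus bandwidth, duty cycle, processor speed, hit rate) are illustrative assumptions, not figures from the text:

```python
def max_processors(B, I, V, h):
    """Maximum number of cached processors one bus can support.

    B: bus bandwidth in cycles/second
    I: processor duty cycle in fetches/cycle
    V: peak processor speed in fetches/second
    h: cache hit rate (0..1), assumed identical for all caches
    """
    # Each processor generates misses at (1 - h) * V fetches/second,
    # and the bus delivers at most B * I fetches/second, so we need
    # N * (1 - h) * V <= B * I.
    return int((B * I) / ((1 - h) * V))

# Illustrative: a 100 M-cycle/s bus, 1 fetch/cycle duty cycle,
# 500 M-fetch/s processors, and a 97% hit rate.
print(max_processors(100e6, 1.0, 500e6, 0.97))  # prints 6
```

Note how sensitive the bound is to the hit rate: with the same bus and processors, dropping the hit rate from 97% to 90% cuts the supportable processor count from 6 to 2, since miss traffic more than triples.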