Unlike an individual reservation station, a group or central reservation station, or a DRIS, must be able to dispatch more than one instruction per cycle. In these cases, the design space gains an additional component that determines how many instructions can be dispatched from each reservation station or from the DRIS per cycle. This component is called the dispatch rate.
Ideally, a shelving buffer should be able to dispatch one instruction per cycle to each EU connected to it. This is easier to achieve for group stations serving two or three EUs than for a central station or a DRIS with a considerable number of EUs connected to it. The R10000, for instance, employs group reservation stations. Its FX reservation station can dispatch two instructions per cycle, one to each of the EUs it serves.
In contrast, its FP reservation station serves four FP EUs but can dispatch at most two instructions per cycle: one to the FP adder, and one to either the FP multiplier, the FP divider, or the FP square-root unit. These dispatch rate limitations stem mainly from data path or register port restrictions intended to reduce complexity.
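The FP dispatch constraint described above can be illustrated with a minimal sketch. The `Instr` type, the `FP_PATHS` structure, and the oldest-first selection policy are assumptions made for illustration; they are not taken from the R10000's actual selection logic.

```python
from collections import namedtuple

# Hypothetical model: names and structure are illustrative assumptions,
# not the R10000's real implementation.
Instr = namedtuple("Instr", ["name", "unit"])

# Dispatch paths of the FP reservation station as described in the text:
# one path feeds the adder, the other is shared by three units.
FP_PATHS = [
    {"adder"},
    {"multiplier", "divider", "sqrt"},
]

def dispatch_fp(ready):
    """Pick at most one ready instruction per dispatch path (first in list wins)."""
    dispatched = []
    for path in FP_PATHS:
        for instr in ready:
            if instr.unit in path and instr not in dispatched:
                dispatched.append(instr)
                break  # this path is used up for the current cycle
    return dispatched

ready = [
    Instr("fmul1", "multiplier"),
    Instr("fdiv1", "divider"),
    Instr("fadd1", "adder"),
]
selected = dispatch_fp(ready)
print([i.name for i in selected])  # ['fadd1', 'fmul1']
```

Here `fdiv1` is ready but has to wait a cycle, because the multiplier, divider, and square-root unit share a single dispatch path.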
In the case of a central reservation station or a DRIS, a higher dispatch rate is required than for group stations. For instance, the PentiumPro can dispatch five RISC-like instructions (called µops) per cycle. Its port 0 is shared by six EUs, which is a design trade-off. In the PentiumPro, FP data requires 86 bits internally. Considering that each FP unit needs at least two operands and delivers at least one result, sharing one complex input/output port saves a considerable amount of die area.
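As a rough, back-of-the-envelope illustration of this trade-off, the sketch below counts only the data lines of a port (two operands plus one result at 86 bits each). The function names and the example count of three units are assumptions for illustration; real die-area savings depend on many factors not modeled here.

```python
FP_WIDTH_BITS = 86      # internal FP width, from the text
OPERANDS_PER_UNIT = 2   # minimum inputs per FP unit, from the text
RESULTS_PER_UNIT = 1    # minimum outputs per FP unit, from the text

def port_wires(num_ports):
    """Data lines needed for num_ports full input/output ports."""
    return num_ports * (OPERANDS_PER_UNIT + RESULTS_PER_UNIT) * FP_WIDTH_BITS

def wires_saved(num_fp_units):
    """Lines saved when num_fp_units share one port instead of having one each."""
    return port_wires(num_fp_units) - port_wires(1)

print(port_wires(1))   # 258
print(wires_saved(3))  # 516 (3 is an illustrative count, not a PentiumPro figure)
```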
The table below shows the maximum issue and dispatch rates of superscalar processors with shelving.
Maximum issue and dispatch rates of superscalar processors with shelving
| Processor (year of volume shipment) | Maximum issue rate (instr./cycle) | Maximum dispatch rate (instr./cycle) |
|---|---|---|
| PowerPC 603 (1993) | 3 | 3 |
| PowerPC 604 (1995) | 4 | 6 |
| PowerPC 620 (1995) | 4 | 6 |
| PM1 (Sparc 64) (1995) | 4 | 8 |
As the table shows, in some cases both rates are the same, for instance in the PowerPC 603 and PA 8000 processors. In most cases, however, superscalar processors with shelving can dispatch more instructions for execution than they can issue for shelving. For example, the PowerPC 604 and PowerPC 620 issue up to four instructions but can start the execution of up to six operations in each cycle.
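The comparison can be made explicit with a short sketch that uses only the rows listed in the table above. The `RATES` list and the helper name are illustrative choices.

```python
# Rows copied from the table: (processor, max issue rate, max dispatch rate)
RATES = [
    ("PowerPC 603", 3, 3),
    ("PowerPC 604", 4, 6),
    ("PowerPC 620", 4, 6),
    ("PM1 (Sparc 64)", 4, 8),
]

def dispatch_to_issue_ratio(issue, dispatch):
    """How many instructions can be dispatched per instruction issued."""
    return dispatch / issue

# Processors whose dispatch rate equals their issue rate
equal = [name for name, i, d in RATES if d == i]
# Processors that can dispatch more than they can issue, with the ratio
wider = {name: dispatch_to_issue_ratio(i, d) for name, i, d in RATES if d > i}

print(equal)  # ['PowerPC 603']
print(wider)  # {'PowerPC 604': 1.5, 'PowerPC 620': 1.5, 'PM1 (Sparc 64)': 2.0}
```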