FX pipelines can be implemented as either universal or dedicated FX units. Furthermore, a processor can incorporate either a single universal unit or multiple units, which may be universal or dedicated.
All earlier and some current designs employ a single universal FX pipeline, that is, a single FX unit, as shown in the figure. Here the adjective universal refers to the capability of executing all integer and Boolean operations of the processor. Besides the earlier pipelined processors of the 1980s, the i486, IBM Power1 (RS/6000), R4000, HP PA 7100, DEC α 21064, PowerPC 601, and PowerPC 603 have a single universal FX pipeline and thus a single FX unit.
All earlier designs and several current designs, such as the i486, also use the same universal FX pipeline to execute load/store and branch instructions. This seems natural, since load/store and control transfer instructions require address calculations, which can easily be carried out in the integer pipeline.
The disadvantage is that all loads/stores and branches are restricted to being performed sequentially with the integer and Boolean operations, which considerably impedes performance.
Performance can be boosted further by providing multiple FX pipelines, that is, multiple pipelined FX units. To judge how many FX units are worthwhile, consider dynamic instruction distributions, which show that 30-40% of all executed instructions are integer and Boolean operations. To exploit more parallelism, providing more than one FX unit therefore seems inevitable.
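The reasoning above can be made concrete with a back-of-the-envelope estimate. The sketch below is only an illustration under simplifying assumptions that are not in the text: every FX operation occupies a unit for exactly one cycle, and the average demand is what matters. The function name `fx_units_needed` and the issue widths are hypothetical.

```python
def fx_units_needed(issue_width: int, fx_percent: int) -> int:
    """Smallest number of single-cycle FX units that covers the
    average FX demand per cycle.

    issue_width: instructions the processor tries to issue per cycle
                 (assumed value, for illustration only).
    fx_percent:  share of integer and Boolean instructions, in percent.
    """
    # Ceiling division on integers avoids floating-point rounding surprises.
    return -(-issue_width * fx_percent // 100)


# Hypothetical issue widths combined with the 30-40% range quoted above.
for width in (2, 4, 8):
    for percent in (30, 40):
        units = fx_units_needed(width, percent)
        print(f"issue width {width}, {percent}% FX -> {units} FX unit(s)")
```

Under these assumptions a 4-wide processor already needs two FX units to keep up with the average integer and Boolean load, which matches the argument that a single unit becomes a bottleneck once more instructions are issued per cycle.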
It must be pointed out, however, that integer division is usually not pipelined. In every case, division requires a considerable number of cycles (on the order of 10-100).
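To illustrate why a non-pipelined divider matters, the following sketch counts how long a single FX unit stays occupied by a short instruction stream. It is a simplified model, not a description of any particular processor: simple operations are assumed to take one cycle each, a division is assumed to block the unit for its full latency, and the operation names, the instruction mix, and the 35-cycle latency are hypothetical values chosen from within the 10-100 range.

```python
def busy_cycles(ops: list[str], div_latency: int) -> int:
    """Cycles a single FX unit is occupied by the given operations.

    Assumes one cycle per simple operation and a non-pipelined divider
    that blocks the unit for div_latency cycles per division.
    """
    return sum(div_latency if op == "div" else 1 for op in ops)


# Hypothetical stream: 98 simple operations and 2 divisions.
stream = ["add"] * 98 + ["div"] * 2

print(busy_cycles(stream, div_latency=1))   # idealized single-cycle divide
print(busy_cycles(stream, div_latency=35))  # non-pipelined 35-cycle divide
```

Even with only 2% divisions in this made-up mix, the unit is busy for 168 cycles instead of 100, which suggests why multicycle operations are often moved to a unit of their own.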
There are two possible approaches to multiplying the number of FX pipelines. The first is to use multiple universal pipelines, and thus multiple universal FX units, each supporting all integer and Boolean instructions. The other possibility is to employ multiple dedicated pipelines for different classes of integer and Boolean instructions, implemented as multiple dedicated FX units.
The second approach uses a set of dedicated units, such as simple FX units, multipliers/dividers or separate multipliers and dividers, shifters, and so on. As shown in the figure, the early i960 CA (the first superscalar processor) and several high-end superscalar processors, such as the PowerPC 604, PowerPC 620, and R8000, serve as examples of this approach.
Except for the early i960 CA, these processors usually contain two simple FX units (without multiply/divide capabilities) and implement the multicycle integer operations either in a common dedicated multiplier/divider unit (as in the PowerPC 604 or the R8000) or in separate multiplier and divider units (as in the MC 88110).
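The arrangement of two simple FX units plus a shared multiplier/divider can be sketched as a small routing function. This is a minimal illustration only: the unit names `FX0`, `FX1`, and `MULDIV`, the operation mnemonics, and the round-robin policy are assumptions made for the example and do not describe the dispatch logic of any of the processors named above.

```python
from itertools import cycle

# Hypothetical operation classes for the example.
SIMPLE_OPS = {"add", "sub", "and", "or", "xor", "shift", "cmp"}
MULTICYCLE_OPS = {"mul", "div"}


def assign_units(ops: list[str]) -> list[tuple[str, str]]:
    """Route each operation to a dedicated FX unit.

    Simple operations alternate between two simple FX units;
    multiply and divide go to one shared multicycle unit.
    """
    simple_units = cycle(["FX0", "FX1"])  # round-robin over the simple units
    assignment = []
    for op in ops:
        if op in MULTICYCLE_OPS:
            assignment.append((op, "MULDIV"))
        elif op in SIMPLE_OPS:
            assignment.append((op, next(simple_units)))
        else:
            raise ValueError(f"unknown FX operation: {op}")
    return assignment


for op, unit in assign_units(["add", "mul", "sub", "or", "div"]):
    print(f"{op:>5} -> {unit}")
```

A real dispatcher would also track unit availability and operand readiness; the sketch only shows the class-based routing that distinguishes dedicated units from universal ones.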