What are the architectures of Parallel Processing?

There are three basic parallel processing hardware architectures in the server market: symmetric multiprocessing (SMP), massively parallel processing (MPP), and non-uniform memory architecture (NUMA).

Symmetric Multiprocessing (SMP)

The SMP architecture is a single machine with multiple processors, all managed by one operating system and all accessing the same disk and memory. An SMP machine with 8 to 32 processors, a parallel database, large memory (two or more gigabytes), good disks, and a good design should perform well with a medium-sized warehouse.

The database needs to be able to run its processes in parallel, and the data warehouse processes need to be designed to take advantage of those parallel capabilities. The processors can access shared resources (memory and disk) rapidly, but the path they use to reach those resources, the backplane, can become a bottleneck as the system scales.

Since the SMP machine is a single entity, it also has the weakness of being a single point of failure in the warehouse. To overcome these problems, hardware companies have come up with techniques that allow several SMP machines to be linked to each other, or clustered.

In a cluster, each node is an SMP machine that runs its own operating system, but the cluster includes connections and control software that allow the machines to share disks and provide fail-over backup. If one machine fails, others in the cluster can temporarily take over its processing load. Of course, this benefit comes at a cost: clustering is extremely complex and can be difficult to manage. The database technology needed to span clusters is improving.
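The fail-over behavior described above can be sketched in a few lines. This is only an illustration, not how any particular cluster manager works: the node names, the workload names, and the "move each workload to the least-loaded survivor" rule are all assumptions made for the example.

```python
def fail_over(assignments, failed_node):
    """Return a new assignment with the failed node's workloads moved to survivors.

    `assignments` maps a node name to its list of workloads (hypothetical names).
    The input mapping is left unchanged.
    """
    # Copy the surviving nodes' workload lists so the original is not mutated.
    survivors = {n: list(w) for n, w in assignments.items() if n != failed_node}
    if not survivors:
        raise RuntimeError("no surviving nodes to take over the load")
    for workload in assignments.get(failed_node, []):
        # Assumed policy: hand each workload to the currently least-loaded survivor.
        target = min(survivors, key=lambda n: len(survivors[n]))
        survivors[target].append(workload)
    return survivors


cluster = {
    "node_a": ["etl_load", "sales_queries"],
    "node_b": ["finance_queries"],
    "node_c": ["marketing_queries"],
}
after = fail_over(cluster, "node_a")
print(after)
```

A real cluster also has to detect the failure, move disk ownership, and recover in-flight work, which is where much of the management complexity comes from.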

Massively Parallel Processing (MPP)

MPP systems are a collection of relatively independent computers, each with its own operating system, memory, and disk, all coordinated by passing messages back and forth. The strength of MPP is the ability to connect hundreds of machine nodes and apply them to a problem using a brute-force approach.

For example, if you need to do a full-table scan of a large table, spreading that table across a 100-node MPP system and letting each node scan its 1/100th of the table should be relatively fast. It's the computer equivalent of "many hands make light work."
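The partitioned scan in this example can be mimicked on a single machine. The sketch below is a rough analogy under stated assumptions: threads stand in for MPP nodes, the round-robin placement rule is made up for illustration, and the function names (partition, scan_partition, mpp_scan) are hypothetical rather than part of any database product.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, node_count):
    """Spread rows across nodes round-robin (an assumed placement rule)."""
    return [rows[i::node_count] for i in range(node_count)]


def scan_partition(rows, predicate):
    """Full scan of one node's local slice; returns the matching rows."""
    return [row for row in rows if predicate(row)]


def mpp_scan(rows, node_count, predicate):
    """Scan every partition concurrently, then merge the partial results."""
    slices = partition(rows, node_count)
    # Each worker thread plays the role of one node scanning only its own slice.
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = pool.map(lambda s: scan_partition(s, predicate), slices)
        return [row for part in partials for row in part]


table = [{"id": i, "amount": i % 50} for i in range(10_000)]
matches = mpp_scan(table, node_count=100, predicate=lambda r: r["amount"] == 49)
print(len(matches))
```

Each worker touches only 100 of the 10,000 rows, which is the "1/100th of the table" idea. In a real MPP system the slices live on separate machines and the partial results come back as messages.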

Non-Uniform Memory Architecture (NUMA)

NUMA is a hybrid of SMP and MPP that attempts to combine the shared-disk flexibility of SMP with the parallel speed of MPP. This architecture is a relatively recent innovation, and it may prove viable for data warehousing in the long run.

NUMA is conceptually similar to the idea of clustering SMP machines, but with tighter connections, more bandwidth, and greater coordination among nodes. If you can segment your warehouse into relatively independent usage groups and place each group on its own node, the NUMA architecture may be effective for you.
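One rough way to judge whether usage groups are "relatively independent" is to measure how often a group touches data that lives on its own node. The sketch below assumes a simple access log of (group, table) pairs. The group names, table names, node numbers, and the local_access_ratio function are all hypothetical.

```python
def local_access_ratio(accesses, table_home, group_home):
    """Fraction of accesses where the table lives on the group's own node."""
    local = sum(
        1 for group, table in accesses
        if table_home[table] == group_home[group]
    )
    return local / len(accesses)


# Assumed placement: which node each usage group and each table is assigned to.
group_home = {"sales": 0, "finance": 1}
table_home = {"orders": 0, "customers": 0, "ledger": 1}

# Assumed access log: (usage group, table it read).
accesses = [
    ("sales", "orders"),
    ("sales", "customers"),
    ("finance", "ledger"),
    ("finance", "orders"),  # crosses nodes: finance reads a sales-node table
]

ratio = local_access_ratio(accesses, table_home, group_home)
print(ratio)
```

A ratio close to 1.0 suggests the segmentation keeps most work local to a node. A low ratio suggests the groups are not independent enough to benefit from this placement.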