What is NUMA?


NUMA stands for Non-Uniform Memory Access. It is a multiprocessor model in which each processor is attached to its own dedicated local memory. NUMA machines were designed to avoid the memory access bottleneck of UMA (Uniform Memory Access) machines. The logically shared memory is physically distributed among the processing nodes, which leads to distributed shared memory architectures.

These parallel computers became highly scalable, but they are very sensitive to how data is allocated in the local memories. Accessing a local memory segment of a node is much faster than accessing a remote memory segment.
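The sensitivity to data placement can be illustrated with a simple weighted-average cost model. The sketch below is only an illustration: the function name `average_access_time` and the latency figures are assumptions chosen for the example, not measurements from any real machine.

```python
def average_access_time(local_fraction, t_local_ns, t_remote_ns):
    """Expected latency of one memory access on a NUMA node.

    local_fraction is the share of accesses served by the node's own memory;
    the remainder is served by remote nodes.
    """
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * t_local_ns + (1.0 - local_fraction) * t_remote_ns


# Illustrative latencies only (assumed, not measured on any machine).
T_LOCAL_NS = 100.0
T_REMOTE_NS = 300.0

for fraction in (1.0, 0.75, 0.5):
    latency = average_access_time(fraction, T_LOCAL_NS, T_REMOTE_NS)
    print(f"local fraction {fraction:.2f}: {latency:.1f} ns on average")
```

With these assumed numbers, a program whose accesses are all local pays 100 ns per access, while one that finds only half of its data locally pays 200 ns, which is why data allocation matters so much on NUMA machines.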

The main difference between multiprocessors and multicomputers lies in the organization of the address space. In multiprocessors, a global address space is used that is uniformly visible from each processor. Every processor can access all memory areas.

In a multicomputer, the address space is replicated in the local memories of the processing elements (PEs). No PE is allowed to directly access the local memory of another PE.

This difference in the address space of the memory is also reflected at the software level: distributed memory multicomputers are programmed using the message-passing paradigm, while NUMA machines are programmed using the global address space principle.
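The two programming styles can be contrasted with a small sketch that sums the same numbers both ways. Python threads stand in for processors here, and the names `message_passing_sum` and `global_address_space_sum` are hypothetical; the code only illustrates the two paradigms and is not tied to any particular machine or library.

```python
import queue
import threading


def message_passing_sum(partitions):
    """Multicomputer style: each PE owns its data and sends a result message."""
    mailbox = queue.Queue()

    def pe_worker(local_data):
        # Only this PE reads local_data; the result leaves via an explicit send.
        mailbox.put(sum(local_data))

    workers = [threading.Thread(target=pe_worker, args=(part,)) for part in partitions]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    # Explicit receive: one message per PE.
    return sum(mailbox.get() for _ in partitions)


def global_address_space_sum(memory, num_processors):
    """Shared address space style: every processor loads directly from memory."""
    total = [0]
    lock = threading.Lock()

    def processor(pid):
        # Plain loads from the single logical memory, no messages in the program.
        partial = sum(memory[pid::num_processors])
        with lock:
            total[0] += partial

    workers = [threading.Thread(target=processor, args=(pid,)) for pid in range(num_processors)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return total[0]
```

In the first function, data moves only through explicit messages; in the second, the program simply indexes one shared array and any data movement is left to the machine.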

Distinguishing these machines has become more and more difficult in recent parallel computers, such as the Cray T3D, where both programming paradigms are provided to the user in the form of library packages.

A further aspect that makes the difference even smaller is that the actual mechanism for accessing remote memory modules is the same in both classes of MIMD computers. Remote memory access is realized by messages even in NUMA machines, just as in message-passing multicomputers.
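A minimal single-threaded simulation can make this concrete. It assumes a simple block distribution in which a global address maps to a home node and an offset; the class `NumaMachine`, its methods, and the message counter are hypothetical names introduced for illustration only.

```python
class NumaMachine:
    """Toy model: one global address space, physically split across nodes."""

    def __init__(self, num_nodes, words_per_node):
        self.words_per_node = words_per_node
        self.nodes = [[0] * words_per_node for _ in range(num_nodes)]
        self.messages_sent = 0  # counts request and reply messages

    def home_of(self, address):
        """Translate a global address into (home node, offset in that node)."""
        return divmod(address, self.words_per_node)

    def _remote_request(self, home, kind, offset, value=None):
        # A remote access costs one request message and one reply message.
        self.messages_sent += 2
        if kind == "store":
            self.nodes[home][offset] = value
            return None
        return self.nodes[home][offset]

    def load(self, requester, address):
        home, offset = self.home_of(address)
        if home == requester:
            return self.nodes[home][offset]  # local access, no message
        return self._remote_request(home, "load", offset)

    def store(self, requester, address, value):
        home, offset = self.home_of(address)
        if home == requester:
            self.nodes[home][offset] = value  # local access, no message
        else:
            self._remote_request(home, "store", offset, value)


machine = NumaMachine(num_nodes=2, words_per_node=4)
machine.store(requester=0, address=5, value=42)  # address 5 lives on node 1
print(machine.load(requester=1, address=5), machine.messages_sent)
```

The program on node 0 issues an ordinary store to a global address, yet the machine turns it into a message exchange with node 1, while the later load by node 1 itself is served locally.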

The problem of cache coherency does not arise in distributed memory multicomputers because the message-passing paradigm explicitly handles the different copies of the same data structure in the form of independent messages.

In the shared memory paradigm, multiple accesses to the same global data structure are possible, and they can be accelerated if local copies of the global data structure are maintained in local caches.

Hardware-supported cache consistency schemes are not provided in NUMA machines. These systems can cache read-only code and data, as well as local data, but not shared modifiable data. This is the distinguishing feature between NUMA and CC-NUMA (cache-coherent NUMA) multiprocessors. NUMA machines are therefore closer to multicomputers than to other shared-memory multiprocessors, while CC-NUMA machines behave like real shared memory systems.
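The caching rule for NUMA machines without hardware coherence can be written as a small predicate. This is a sketch under stated assumptions: the `Block` record, its fields, and the function `may_cache` are hypothetical, and "local data" is interpreted here as data that is private to its home node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    home_node: int    # node whose local memory holds the block
    read_only: bool   # code or constant data that is never written
    shared: bool      # True if more than one node may access the block


def may_cache(block, requester):
    """Return True if a non-coherent NUMA node may keep the block in its cache."""
    if block.read_only:
        return True  # copies can never become stale
    if not block.shared and block.home_node == requester:
        return True  # private local data has no other copies
    return False     # shared modifiable data must bypass the cache


code_page = Block(home_node=1, read_only=True, shared=True)
private_stack = Block(home_node=0, read_only=False, shared=False)
shared_counter = Block(home_node=0, read_only=False, shared=True)

for name, block in [("code", code_page), ("stack", private_stack), ("counter", shared_counter)]:
    print(name, may_cache(block, requester=0))
```

A CC-NUMA machine would not need the final `return False` case, because its hardware coherence protocol keeps cached copies of shared modifiable data consistent.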

In NUMA machines, as in multicomputers, the main design issues are the organization of the processor nodes, the interconnection network, and the possible techniques for reducing remote memory accesses. Typical NUMA machines are the Cray T3D and the Hector multiprocessor.

Updated on: 23-Jul-2021
