Hector is a hierarchical NUMA machine consisting of stations connected by a hierarchy of ring networks. Stations are symmetric multiprocessors whose processing modules (nodes) are linked by a single bus. Each node comprises three main units − a processor/cache unit, a memory unit, and the station bus interface, which connects the otherwise separate processor and memory buses.
The separation of the two buses enables other processors to access a node's memory while that node's processor is accessing off-node memory. The processing modules of the machine are grouped into shared-bus symmetric multiprocessors, called stations. The stations are connected by bit-parallel local rings, which are in turn interconnected by a single global ring.
Hector provides a flat, global address space, where each processing module is assigned a range of addresses. The addressing scheme uses r+s+p bits where r indicates the ring, s points to the station, and p addresses the slot inside the station. Although global cache consistency cannot be maintained in Hector, a snoopy protocol provides cache consistency among the nodes inside a station.
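The r+s+p addressing scheme can be illustrated with a minimal sketch. The field widths below and the ordering of the fields (ring in the most significant bits, then station, then slot) are assumptions made for illustration only, and the names `encode_module` and `decode_module` are hypothetical.

```python
from typing import NamedTuple

# Hypothetical field widths; the real r, s and p depend on the configuration.
RING_BITS = 2      # r: selects the local ring
STATION_BITS = 3   # s: selects the station on that ring
SLOT_BITS = 3      # p: selects the slot inside the station


class ModuleId(NamedTuple):
    ring: int
    station: int
    slot: int


def encode_module(ring: int, station: int, slot: int) -> int:
    """Pack ring, station and slot into an (r+s+p)-bit module number."""
    if not 0 <= ring < (1 << RING_BITS):
        raise ValueError("ring out of range")
    if not 0 <= station < (1 << STATION_BITS):
        raise ValueError("station out of range")
    if not 0 <= slot < (1 << SLOT_BITS):
        raise ValueError("slot out of range")
    # Assumed layout: ring bits highest, slot bits lowest.
    return (ring << (STATION_BITS + SLOT_BITS)) | (station << SLOT_BITS) | slot


def decode_module(bits: int) -> ModuleId:
    """Split an (r+s+p)-bit module number back into its three fields."""
    slot = bits & ((1 << SLOT_BITS) - 1)
    station = (bits >> SLOT_BITS) & ((1 << STATION_BITS) - 1)
    ring = (bits >> (STATION_BITS + SLOT_BITS)) & ((1 << RING_BITS) - 1)
    return ModuleId(ring, station, slot)
```

With these assumed widths, `decode_module(encode_module(1, 5, 2))` gives back ring 1, station 5, slot 2.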
Memory accesses take place in a synchronized packet-transfer scheme controlled by a hierarchy of interface circuits. The station bus interface connects processing modules to the station bus by forwarding station bus requests to the station controller.
When a processor requests on-board memory access, the station bus interface connects the processor bus to the memory bus. Off-board memory requests are transformed into request packets and passed by the station bus interface to the station controller.
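The on-board versus off-board decision can be sketched as a comparison of source and destination module coordinates. This is only an illustration under assumed names: the `Route` enum, the `classify` function, and the representation of a module as a `(ring, station, slot)` tuple are hypothetical and not part of Hector itself.

```python
from enum import Enum
from typing import Tuple

Module = Tuple[int, int, int]  # assumed (ring, station, slot) coordinates


class Route(Enum):
    ON_BOARD = "processor bus joined to memory bus by the station bus interface"
    STATION_BUS = "request packet handled within the station"
    LOCAL_RING = "request packet travels on the local ring"
    GLOBAL_RING = "request packet crosses the global ring"


def classify(src: Module, dst: Module) -> Route:
    """Return the highest level of the hierarchy a request has to use."""
    if src == dst:
        return Route.ON_BOARD        # memory on the requesting module
    if src[:2] == dst[:2]:
        return Route.STATION_BUS     # same ring and station, different slot
    if src[0] == dst[0]:
        return Route.LOCAL_RING      # same ring, different station
    return Route.GLOBAL_RING         # different rings
```

For example, `classify((0, 1, 2), (0, 3, 0))` yields `Route.LOCAL_RING`, since both modules sit on ring 0 but in different stations.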
The station controller has a twofold role. First, it controls the allocation of the station bus between on-station requests, and second, it realizes the local ring interface for the station. When a processing module requests the station bus and there is no contention on the bus, the station controller grants the bus at the beginning of the next cycle.
The processing module places the data packet on the bus in the same cycle. If the destination module belongs to the station, it acknowledges the reception of the packet in the next cycle. If the acknowledgment is not given, the source module automatically retransmits the request. An on-station transfer requires three cycles, but only one of them ties up the bus, so when requests are independent the full bus bandwidth can be exploited.
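The three-cycle transfer (request, grant with data, acknowledgment) can be sketched as a small timing model that shows why overlapping independent requests keeps the bus busy every cycle. The function `schedule` is hypothetical, arbitration under contention is assumed to be first come, first served (the text does not specify the policy), and retransmission is not modelled.

```python
from typing import List, Tuple


def schedule(request_cycles: List[int]) -> List[Tuple[int, int, int]]:
    """Return (request, transfer, ack) cycles for each on-station request.

    The bus carries at most one packet per cycle. A request made in cycle t
    is granted at the start of cycle t + 1 if the bus is free, the packet is
    placed on the bus in that same cycle, and the acknowledgment follows one
    cycle later. Only the transfer cycle occupies the bus.
    """
    next_free = 0  # first cycle in which the bus is not yet taken
    timeline = []
    for req in sorted(request_cycles):
        transfer = max(req + 1, next_free)  # wait if the bus is contended
        next_free = transfer + 1
        timeline.append((req, transfer, transfer + 1))
    return timeline
```

Under these assumptions, requests issued in cycles 0, 1 and 2 transfer in cycles 1, 2 and 3, so three transfers complete in five cycles rather than nine.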
The inter-ring interface is realized as a two-deep FIFO buffer that gives priority to packets moving on the global ring. This means that a packet already traveling on the global ring is never held back by the interface and proceeds without delay.
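One way to picture this priority rule is the following sketch. It is a simplified model under stated assumptions, not the actual circuit: the class `InterRingInterface`, its methods, and the slot-by-slot view of the global ring (with `None` standing for an empty slot) are all hypothetical.

```python
from collections import deque
from typing import Deque, Optional


class InterRingInterface:
    """Two-deep FIFO for packets waiting to enter the global ring."""

    DEPTH = 2  # buffer depth stated in the text

    def __init__(self) -> None:
        self._waiting: Deque[str] = deque()

    def offer(self, packet: str) -> bool:
        """Queue a packet from the local ring; refuse it if the FIFO is full."""
        if len(self._waiting) >= self.DEPTH:
            return False
        self._waiting.append(packet)
        return True

    def step(self, global_slot: Optional[str]) -> Optional[str]:
        """Advance one cycle and return the content of the outgoing slot.

        A packet already on the global ring always passes straight through.
        A waiting packet is injected only when the passing slot is empty.
        """
        if global_slot is not None:
            return global_slot
        if self._waiting:
            return self._waiting.popleft()
        return None
```

In this model a busy global ring can stall packets arriving from a local ring, but it never delays a packet that is already circulating on the global ring.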
The three main advantages of the Hector machine are as follows −
The hierarchical structure enables short transmission lines and good scalability.
The cost and the overall bandwidth of the structure increase linearly with the number of nodes.
The cost of a memory access rises incrementally with the distance between the processor and the memory location.
The main drawbacks of Hector are typical of all NUMA machines: the lack of global cache consistency and non-uniform memory access times, both of which require careful software design.