The flow of vector operands between the main memory and vector registers is generally pipelined with various access paths. In this section, we specify vector operands and describe three vector-access schemes for interleaved memory modules that allow overlapped memory accesses.
Vector Operand Specifications − Vector operands can have arbitrary lengths. Vector elements are not necessarily stored in contiguous memory locations. For example, the entries of a matrix may be stored in row-major or column-major order. Each row, column, or diagonal of the matrix can be used as a vector.
When row elements are stored in contiguous locations with a unit stride, the column elements are separated by a stride of n, where n is the matrix order. Similarly, the diagonal elements are separated by a stride of n + 1.
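The stride relationships above can be checked with a small sketch. It assumes a row-major matrix of order n = 4 whose first element sits at word address 0; the helper name vector_addresses and these values are illustrative and do not come from any particular machine.

```python
def vector_addresses(base, stride, length):
    """Return the word addresses touched by a vector with the given base, stride, and length."""
    return [base + i * stride for i in range(length)]

n = 4      # matrix order (assumed for illustration)
base = 0   # assumed word address of element (0, 0) in row-major storage

row0 = vector_addresses(base, 1, n)        # first row: unit stride
col0 = vector_addresses(base, n, n)        # first column: stride n
diag = vector_addresses(base, n + 1, n)    # main diagonal: stride n + 1

print(row0)  # [0, 1, 2, 3]
print(col0)  # [0, 4, 8, 12]
print(diag)  # [0, 5, 10, 15]
```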
To access a vector in memory, one must specify its base address, stride, and length. Since each vector register has a fixed number of component registers, only a segment of the vector can be loaded into the vector register in a fixed number of cycles. Long vectors must be segmented and processed one segment at a time.
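Segmenting a long vector can be sketched as follows. The function segment_vector and the register length of 64 elements are assumptions chosen for illustration; each segment is described by its own base address, the shared stride, and its length.

```python
def segment_vector(base, stride, length, register_length):
    """Split a long vector into segments that each fit in one vector register.

    Returns a list of (segment_base, stride, segment_length) tuples.
    """
    segments = []
    for start in range(0, length, register_length):
        seg_len = min(register_length, length - start)
        segments.append((base + start * stride, stride, seg_len))
    return segments

# A 150-element vector with stride 2, loaded into 64-element vector registers.
segments = segment_vector(base=1000, stride=2, length=150, register_length=64)
for seg in segments:
    print(seg)
# (1000, 2, 64)
# (1128, 2, 64)
# (1256, 2, 22)
```

The last segment is shorter than the register, so a real machine would also need some way to record the active vector length for that final pass.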
C-Access Memory Organization − The m-way low-order interleaved memory structure allows m memory words to be accessed concurrently in an overlapped manner. This concurrent access is known as C-access.
The access cycles in different memory modules are staggered. The low-order a bits select the module, and the high-order b bits select the word within each module, where m = 2^a and a + b = n is the address length.
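As a minimal sketch of low-order interleaved addressing, the function below splits a word address into a module number and a word offset. It assumes m = 2^a modules, with a = 3 chosen purely for illustration; decode_address is a hypothetical helper name.

```python
def decode_address(address, a):
    """Split an address for 2**a-way low-order interleaving.

    The low-order a bits give the module number; the remaining
    high-order bits give the word offset within that module.
    """
    m = 1 << a                    # number of modules, m = 2**a
    module = address & (m - 1)    # low-order a bits
    word = address >> a           # high-order bits
    return module, word

a = 3  # assumed: m = 8 modules
decoded = [decode_address(addr, a) for addr in range(10)]
print(decoded)
# [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0), (7, 0), (0, 1), (1, 1)]
```

Consecutive addresses fall in consecutive modules, which is what lets a unit-stride vector keep all m modules busy in staggered fashion.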
S-Access Memory Organization − The low-order interleaved memory can be rearranged to enable simultaneous access, or S-access. In this scheme, all memory modules are accessed simultaneously in a synchronized manner.
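A toy model can illustrate simultaneous access. It assumes m = 4 low-order interleaved modules of 4 words each, where every word simply stores its own address; under that layout, reading the same word offset from every module in one cycle yields m consecutive words. The names build_modules and s_access_fetch are hypothetical.

```python
def build_modules(m, words_per_module):
    """Build m low-order interleaved modules; each word stores its own address."""
    return [[w * m + j for w in range(words_per_module)] for j in range(m)]

def s_access_fetch(modules, word_offset):
    """Read the same word offset from every module in one synchronized cycle."""
    return [module[word_offset] for module in modules]

modules = build_modules(m=4, words_per_module=4)
fetched = s_access_fetch(modules, word_offset=2)
print(fetched)  # [8, 9, 10, 11]
```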
C/S-Access Memory Organization − A memory organization in which C-access and S-access are combined is called C/S-access. In this scheme, n access buses are used, with m interleaved memory modules attached to each bus (here n denotes the number of buses, not the address length).
The m modules on each bus are m-way interleaved to enable C-access. The n buses work in parallel to enable S-access. In each memory cycle, at most m · n words are fetched if the n buses are fully used with pipelined memory accesses.
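The peak figure is simple arithmetic, shown here as a short sketch. The values m = 8 modules per bus and n = 4 buses are assumptions for illustration, and peak_words_per_cycle is a hypothetical helper.

```python
def peak_words_per_cycle(modules_per_bus, buses_in_use):
    """Upper bound on words fetched per memory cycle in a C/S-access memory."""
    return modules_per_bus * buses_in_use

m, n = 8, 4  # assumed configuration: 8 interleaved modules on each of 4 buses
print(peak_words_per_cycle(m, n))  # 32 words when all 4 buses are busy
print(peak_words_per_cycle(m, 2))  # 16 words when only 2 buses are busy
```

This is an upper bound; stride patterns that map several elements to the same module would reduce the number of words actually delivered.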
The C/S-access memory is suitable for use in vector multiprocessor configurations. It provides parallel pipelined access to a vector data set with high bandwidth. A special vector cache design is needed within each processor to maintain smooth data movement between the memory and multiple vector processors.