Tech Report CS-08-08

Memory Hierarchy Issues in Multicore Architectures

J. Savage and M. Zubair

July 2008


Multicore architectures have introduced a new problem to parallel computing, namely, the management of hierarchical parallel caches. As with other architectures, a cache structure is designed to simulate a fast common memory. To address the challenge of management of these caches we a) introduce the Unified Multicore Model (UMM), a hierarchical arrangement of caches, b) present a general strategy to develop lower bounds on the memory traffic between a cache and its child caches, and c) present highly efficient implementations for two compute-intensive financial option pricing problems.

The UMM seamlessly handles different types of multiple-core processors with varying degrees of sharing of caches at different levels. Our lower bound method is general and applies to a great variety of problems whose algorithms are described by straight-line programs. It extends results obtained previously for serial hierarchical memories \cite{savageio}. The experimental work was done on a system with two Quad-Core Intel Xeon 5310 1.6GHz processors having a total of 8 cores. For option pricing problems we demonstrate a performance of 19.5 GFLOP, which is 38% of the theoretical peak for this system. Our code outperforms the compiler optimized and auto-parallelized code by a factor of up to $7.5$.

(complete text in pdf)