Tech Report CS-89-35

Efficient Parallel Algorithms Can Be Made Robust

Paris C. Kanellakis and Alex A. Shvartsman

June 1989

Abstract:

The efficient parallel algorithms proposed for many fundamental problems, such as list ranking, computing preorder numberings on trees and integer sorting, are very sensitive to processor failures. The requirements of efficiency (commonly formalized using {\em Parallel-time x Processors} as a cost measure) has led to the design of highly tuned PRAM algorithms which, given the additional constraint of simple processor failures, unfortunately become inefficient or even incorrect. We propose a new notion of {\em robustness} that combines efficiency with fault tolerance. For the common case of fail-stop errors, we develop a general and easy-to-implement technique to make robust many efficient parallel algorithms, e.g., algorithms for all the problems listed above. More specifically,{\em for any dynamic pattern of fail-stop errors on} a CRCW PRAM {\em with at least one surviving processor}, our method increases the original algorithm's cost by at most a {\em polylog} multiplicative factor. We also show that at least a {\em log/loglog} multiplicative overhead will be incurred for certain patterns of failures by any algorithm that implements robust solutions to the problems listed.

(complete text in pdf)