Published by MIT Press. Copyright 1997 Massachusetts Institute of Technology.
Your institution may already be a subscriber to CJTCS. If not, please subscribe for legitimate access to all journal articles.
Two aspects of distributed-protocol reliability are the ability to recover from transient faults and the ability to function in a dynamic environment. Approaches for both of these aspects have been separately developed, but have revealed drawbacks when applied to environments with both transient faults and dynamic changes. This paper introduces definitions and methods for addressing both concerns in system design.
A protocol is superstabilizing if it is (1) self-stabilizing, meaning that it is guaranteed to respond to an arbitrary transient fault by eventually satisfying and maintaining a legitimacy predicate, and it is (2) guaranteed to satisfy a passage predicate at all times when the system undergoes topology changes starting from a legitimate state. The passage predicate is typically a safety property that should hold while the protocol makes progress toward reestablishing legitimacy following a topology change.
Specific contributions of the paper include: the definition of the class of superstabilizing protocols; metrics for evaluating superstabilizing protocols; superstabilizing protocols for coloring and spanning tree construction; a general method for converting self-stabilizing protocols into superstabilizing ones; and a generalized form of a self-stabilizing topology update protocol which may have useful applications for other research.