A Hierarchical Formal Framework for Adaptive N-variant Programs in Multi-core Systems

Li Tan and Axel Kring. In Proceedings of the 9th International Workshop on Assurance in Distributed Systems and Networks (ICDCS-ASDN'10). IEEE Press. Genoa, Italy. June 2010.

Abstract

We propose a formal framework for designing and developing adaptive N-variant programs. The framework supports multiple levels of fault detection, masking, and recovery through reconfiguration. Our approach is two-fold. First, we introduce an Adaptive Functional Capability Model (AFCM) to define levels of functional capability for each service provided by the system. The AFCM specifies how, once a fault is detected, a system shall scale back its functional capabilities while still maintaining essential services. Second, we propose a Multi-layered Assured Architecture Design (MAAD) to implement the reconfiguration requirements specified by AFCMs. The layered design improves system resilience in two dimensions: (1) Unlike traditional fault-tolerant architectures that treat functional requirements uniformly, each layer of the assured architecture implements a level of functional capability defined in the AFCM. The architecture design uses lower-layer functionalities (which are simpler and more reliable) as a reference to monitor higher-layer functionalities. The layered design also facilitates an orderly system reconfiguration (graceful degradation) while maintaining essential system services. (2) Each layer of the assured architecture uses N-variant techniques to improve fault detection. The degree of redundancy introduced by the N-variant implementation determines the mix of faults that can be tolerated at each layer. Our hybrid fault model allows us to consider fault types ranging from benign faults to Byzantine faults. Finally, multiple layers combined with N-variant implementations are especially suitable for multi-core systems.
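
The combination of per-layer N-variant voting and graceful degradation described in the abstract can be illustrated with a small sketch. This is not the paper's formal model: the names run_layer and serve, the strict-majority voting rule, and the toy variants are all assumptions made for illustration, and the use of lower layers as a reference to monitor higher layers is not modeled here.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple

# Hypothetical type: a variant is one independent implementation of a service.
Variant = Callable[[int], int]


def run_layer(variants: List[Variant], x: int) -> Optional[int]:
    """Run all variants of one layer and vote on their outputs.

    Returns the value agreed on by a strict majority of the layer's
    variants, or None if no such majority exists (fault not maskable).
    """
    results = []
    for variant in variants:
        try:
            results.append(variant(x))
        except Exception:
            # A crash is a detectable fault; the variant casts no vote.
            pass
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    # The majority is measured against all variants, including crashed ones.
    return value if count > len(variants) / 2 else None


def serve(layers: List[Tuple[str, List[Variant]]], x: int) -> Tuple[str, Optional[int]]:
    """Try layers from highest to lowest capability (graceful degradation)."""
    for name, variants in layers:
        value = run_layer(variants, x)
        if value is not None:
            return name, value
    return "failed", None


def crash(x: int) -> int:
    raise RuntimeError("variant crashed")


# Toy variants (assumed): the full service squares its input,
# the essential service only returns the absolute value.
full_ok = [lambda x: x * x, lambda x: x * x, lambda x: x * x + 1]
full_broken = [lambda x: x * x, lambda x: x + 1, crash]
essential = [lambda x: abs(x), lambda x: abs(x), lambda x: abs(x)]

masked = serve([("full", full_ok), ("essential", essential)], 3)
degraded = serve([("full", full_broken), ("essential", essential)], 3)
print(masked)    # one wrong value is outvoted at the full layer
print(degraded)  # no majority at the full layer, so the essential layer answers
```

In the first call, a single faulty variant is masked by the vote within the layer. In the second, the full-capability layer cannot reach a majority, so the system falls back to the simpler essential layer instead of failing outright. Raising the number of variants per layer changes which mix of faults that layer can mask.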