Performance distribution of a fault-tolerant system in the presence of failure correlation

IIE Transactions, June, 2006 by Gregory Levitin, Min Xie

Nicola and Goyal (1990), Gutjahr (2001), Czarnowski et al. (2003) and Dai et al. (2004) have studied the reliability of multi-version software when correlated failures are present. However, these papers only analyze the probability that the system produces a correct output and do not address the issue of the software performance (running time).

This paper presents an algorithm for fault-tolerant software performance and reliability evaluation that takes into account the Common-Cause Failures (CCFs) of versions. Section 2 presents the model in detail, and Sections 3-6 present the main results and algorithms. In Section 7, some analytical and numerical examples are used to illustrate the approach. Conclusions are drawn in Section 8.

2. The model

2.1. Software system performance

In the model presented in Ashrafi et al. (1994), the software system consists of C components, each of which performs a subtask; the sequential execution of the components allows a major task to be performed. It is assumed that [N.sub.c] functionally equivalent versions are available for each component c. Each version has a fixed execution time. No more than [L.sub.c] versions can be executed in parallel in each component.

The versions of each component c start their execution in accordance with an ordered list. The first [L.sub.c] versions from the list start their execution simultaneously (at time 0). If the number of terminated versions is less than [M.sub.c], then after the termination of each version a new version from the list starts its execution immediately. If the number of terminated versions is at least [M.sub.c], then after the termination of each version the voter compares the outputs. If [M.sub.c] outputs are identical, the component terminates its execution (terminating all the versions that are still being executed); otherwise a new version from the list is executed immediately. If after the termination of [N.sub.c] versions the number of identical outputs is less than [M.sub.c], then the component and the entire system fail.

In the case of component success, the time of the entire component execution is equal to the termination time of the version that produced the [M.sub.c]th correct output (in most cases the time needed by the voter to make the decision can be neglected). It can be seen that the component execution time is a random variable that depends on the outputs of the component versions.
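The start-up and voting rule described above can be sketched as a small event-driven simulation. This is only an illustrative sketch, not the authors' algorithm: the function name `component_time` and its arguments are hypothetical, the outcome of each version (correct or failed) is taken as given rather than random, and it is assumed that only correct outputs can agree with one another and that the voter's decision time is zero.

```python
import heapq
import math


def component_time(exec_times, correct, L, M):
    """Time at which the M-th correct output appears, or math.inf on failure.

    exec_times -- fixed execution times of the versions, in list order
    correct    -- correct[i] is True if version i produces a correct output
    L          -- maximum number of versions running in parallel
    M          -- number of identical (here: correct) outputs required
    All names are hypothetical; the voter time is assumed to be zero.
    """
    n = len(exec_times)
    running = []          # min-heap of (termination_time, version_index)
    next_idx = 0

    # The first L versions from the list start simultaneously at time 0.
    for _ in range(min(L, n)):
        heapq.heappush(running, (exec_times[next_idx], next_idx))
        next_idx += 1

    n_correct = 0
    while running:
        t, i = heapq.heappop(running)   # next version to terminate
        if correct[i]:
            n_correct += 1
            if n_correct == M:
                # The voter finds M identical outputs: the component stops,
                # and the versions still running are abandoned.
                return t
        # Otherwise the next version from the list starts immediately.
        if next_idx < n:
            heapq.heappush(running, (t + exec_times[next_idx], next_idx))
            next_idx += 1

    # All versions terminated with fewer than M identical outputs.
    return math.inf


times = [6, 4, 8, 5, 3]
print(component_time(times, [True, False, True, True, True], L=3, M=2))
print(component_time(times, [True, False, False, False, False], L=3, M=2))
```

In the first call the second correct output arrives when the third version terminates at time 8; in the second call only one version is correct, so the component fails and the function returns infinity.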

The RBS technique is very similar to NVP with sequential execution of the versions. The main difference lies in the use of the acceptance test block after the execution of each version (the acceptance test time is usually much greater than the decision time of the voter and therefore cannot be neglected). By adding the acceptance test time to the execution time of each version, one can consider the RBS to be NVP with [L.sub.c] = 1 and [M.sub.c] = 1, since the component succeeds as soon as the first correct output is obtained.
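As a rough illustration of this reduction, the sketch below computes the completion time of an RBS component as the running sum of execution time plus acceptance test time, up to and including the first version that passes. The function name `rbs_time` is hypothetical, and a single acceptance test time shared by all versions is an assumption made here for brevity; the text does not state it.

```python
import math


def rbs_time(exec_times, test_time, correct):
    """Completion time of an RBS component viewed as NVP with L = 1, M = 1.

    exec_times -- fixed execution times of the versions, in list order
    test_time  -- acceptance test time (assumed identical for every version)
    correct    -- correct[i] is True if version i passes the acceptance test
    """
    elapsed = 0
    for run_time, ok in zip(exec_times, correct):
        # Versions run one at a time; each is followed by its acceptance test.
        elapsed += run_time + test_time
        if ok:
            return elapsed      # first correct output: the component succeeds
    return math.inf             # every version failed


print(rbs_time([6, 4, 8], 2, [False, True, True]))   # (6 + 2) + (4 + 2) = 14
```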

The time [[PSI].sub.c] at which each component c produces a correct output depends on the number of failed versions. If this number is greater than [N.sub.c] - [M.sub.c], then fewer than [M.sub.c] correct outputs remain, so the component fails and [[PSI].sub.c] = [infinity]. The sum of the random execution times of all C components gives the random task-execution time T for the entire system.
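The failure condition and the summation can be written down directly. The sketch below is a minimal illustration for one fixed realization of the component times, not the paper's evaluation procedure; the names `component_fails` and `system_time` are hypothetical. Representing a failed component by an infinite time means that a single failure makes the total infinite as well.

```python
import math


def component_fails(n_versions, n_failed, M):
    """True if more than N - M versions failed, so M correct outputs are impossible."""
    return n_failed > n_versions - M


def system_time(component_times):
    """Task-execution time T: the sum of the component times.

    A failed component contributes math.inf, so T is infinite whenever
    any component fails.
    """
    return sum(component_times)


print(component_fails(n_versions=5, n_failed=3, M=2))   # 3 > 5 - 2 is False
print(system_time([8, 14]))                             # 22
print(system_time([8, math.inf, 3]))                    # inf
```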


 

Content provided in partnership with Thompson Gale