Research papers

Please note

  • These papers are available for download subject to the standard copyright restrictions
  • A copy may be made for personal research use only
  • The document may not be copied or sold to third parties
  • Quoted extracts should acknowledge the source document
  • Use of larger portions of the document requires the permission of the copyright holder
  • Recent publications are listed first

Summaries

Overview of Approaches to the Use and Licensing of COTS Digital Devices in Safety Critical Industries

Authors

Sofia Guerra and Gareth Fletcher

Details

12th International Conference on Nuclear Plant Instrumentation, Control, and Human-Machine Interface Technologies (NPIC & HMIT 2021)

Brief summary

Commercial-Off-The-Shelf (COTS) components are increasingly used in nuclear Instrumentation and Control (I&C) applications. They have several commercial advantages, as nuclear-specific products may not be available and the cost of developing bespoke components may be prohibitive. In addition, commercial components typically benefit from a wider user base, and therefore greater amounts of operating data that increase the chances of detecting (and fixing) systematic faults. While there are several commercial benefits in the use of COTS components, there are also several challenges and concerns with regard to their safety demonstration and justification.

This paper summarises a report that considered the use of COTS components in a range of safety-critical applications.

Download

The Role of Certification in the Safety Demonstration of COTS EDDs

Authors

Sofia Guerra and Luke Hinde

Details

12th International Conference on Nuclear Plant Instrumentation, Control, and Human-Machine Interface Technologies (NPIC & HMIT 2021)

Brief summary

Embedded digital COTS devices are increasingly being used in Nuclear Power Plants. Although these devices are often not developed according to nuclear standards, they still need to be justified to be deployed in nuclear applications. Different countries have been developing their own processes to justify COTS digital devices. In many cases, this justification is based on the assessment of the development process. This is consistent with traditional standard-based approaches to safety justification – compliance to accepted practice was deemed to imply adequate safety. This could be demonstrated either directly through a review of the development artefacts or indirectly through consideration of existing certification, e.g., IEC 61508.

This paper discusses the use of development process-based approaches to the safety justification of COTS EDDs, the link between development processes and reliability, how certification may support the justification, and some of the pitfalls of relying on certification.

Download

A conservative confidence bound for the probability of failure on demand of a software-based system based on failure-free tests of its components

Authors

Peter Bishop and Andrey Povyakalo

Details

Reliability Engineering & System Safety, 203 (2020), 107060

Brief summary

The standard approach to deriving the confidence bound for the probability of failure on demand (pfd) of a software-based system is to perform statistical tests on the whole system as a "black-box". In practice, performing tests on the entire system may be infeasible for logistical reasons, such as lack of availability of all component subsystems at the same time during implementation.

This paper presents a general method for deriving a confidence bound for the overall system from successful independent tests on individual system components. In addition, a strategy is presented for optimising the number of tests allocated to system components for an arbitrary system architecture that minimises the confidence bound for the system pfd.

For some system architectures, we show that an optimum allocation of component tests is as effective as tests on the complete system for demonstrating a given confidence bound. The confidence bound calculation makes use of many of the concepts used in the reliability analysis of hardware structures, but unlike a conventional hardware analysis, the method does not presume statistical independence of failures between software components, so the confidence bound calculation for the software should always be conservative.
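
The paper's component-based method builds on the standard "black-box" result for a single system, which is worth recalling. Below is a minimal Python sketch (function name ours, for illustration only), assuming n independent, failure-free demands and a required confidence level 1 - alpha:

    # After n independent failure-free demands, the (1 - alpha) upper
    # confidence bound on the pfd p is the largest p that still makes the
    # observed outcome plausible, i.e. the largest p with (1 - p)^n >= alpha.
    def pfd_upper_bound(n_tests: int, alpha: float) -> float:
        """Upper confidence bound on pfd after n_tests failure-free demands."""
        return 1.0 - alpha ** (1.0 / n_tests)

    # Example: ~4600 failure-free demands support pfd < 1e-3 at 99% confidence.
    print(pfd_upper_bound(4600, 0.01))   # ~0.001

The paper's contribution is to allocate such tests across the components of an arbitrary architecture so that the resulting system-level confidence bound is minimised.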

Download

Justifying PLC-based applications with limited cooperation from platform supplier - the COGS approach

Authors

Gareth Fletcher and Sofia Guerra

Details

11th International Conference on Nuclear Plant Instrumentation, Control, and Human-Machine Interface Technologies (NPIC & HMIT 2019)

Brief summary

Several control and monitoring applications are implemented using commercial-off-the-shelf (COTS) PLCs that were not necessarily developed according to nuclear standards. The UK nuclear regulatory regime requires that a safety case be developed to justify and communicate their safety. Typically, the assessment of COTS components has been done with a focus on standards compliance – compliance to accepted practice was deemed to imply adequate safety. However, there may be a number of difficulties with justifying COTS products related to limited knowledge of the internal structure of the components or their development processes, especially when the supplier of the PLC platform is not willing to provide the necessary information to complete a compliance case.

This paper describes a claim-based approach to the justification of COTS PLC components using Cogs, developed in a project funded by the UK nuclear industry. The approach:

  • focuses on the behaviour of the system rather than on the process followed to develop the PLC platform
  • structures the justification around behaviour attributes (such as functionality, performance and reliability) and considers them in terms of the application and/or platform
  • uses information about the platform that is likely to be publicly available from the supplier

Download

Emphasis class 1 and class 2 assessment of Rosemount pressure and temperature transmitters

Authors

Emily Saopraseuth, Nicholas Wienhold, Eoin Butler, Sofia Guerra, Heidy Khlaaf

Details

11th International Conference on Nuclear Plant Instrumentation, Control, and Human-Machine Interface Technologies (NPIC & HMIT 2019)

Brief summary

This paper describes the Class 1 assessment of the Rosemount 3051 Pressure Transmitter and the Class 2 assessment of the Rosemount 644 Temperature Transmitter using Emphasis at SIL 3 and 2 respectively.

Emerson has pursued many approvals and certifications on these transmitter platforms. The audit for each of these assessments is unique and probes information at varying levels of detail. Compared with other approval and certification audits, the Emphasis assessment is a much more productive and in-depth review of design and project materials. The assessment focuses on reviewing quality procedures, design and project artefacts that demonstrate practical engineering practices, and processes that would lead to good product design.

This paper describes Emerson’s approach to the assessment. For this assessment, Emerson answered over 300 assessment questions and provided over 150 archived documents as evidence for each individual product. Throughout the assessment, Emerson’s knowledge of IEC 61508, quality standards, product development processes and software engineering practices showed that, as a smart device manufacturer, Emerson is approaching design processes and procedures with the necessary rigour to produce devices capable of meeting the most stringent requirements.

Key words: smart devices, safety demonstration, embedded digital devices

Download

Templates, databases and other harmonised approaches to the safety justification of embedded digital devices

Authors

Gareth Fletcher, Sofia Guerra and Nick Chozos

Details

11th International Conference on Nuclear Plant Instrumentation, Control, and Human-Machine Interface Technologies (NPIC & HMIT 2019)

Brief summary

This paper describes work funded by Energiforsk to consider the feasibility of harmonised component-level safety demonstration and, in particular, of using aspects of the UK approach to the licensing and qualification of smart devices in Finland. We concluded that the use of harmonised component justification is feasible. In shorter timescales, this seems more likely to succeed if such an approach is developed within Finland. Reusing the assessments performed in the UK would have several advantages, but there are a number of technical and commercial issues that would need to be overcome for this to be feasible.

Key words: embedded digital devices, commercial-off-the-shelf components, smart devices

Download

Justification of commercial industrial instrumentation and control equipment for nuclear power plant applications

Authors

Sofia Guerra, Steven Arndt, Janos Eiler, Ron Jarrett, Horst Miedl, Andrew Nack, Paolo Picca

Details

11th International Conference on Nuclear Plant Instrumentation, Control, and Human-Machine Interface Technologies (NPIC & HMIT 2019)

Brief summary

This paper discusses work done by the authors to develop an IAEA Nuclear Energy Series report providing guidance on what would constitute an adequate justification process for a COTS device to be installed in an NPP in applications important to safety, such that there is reasonable assurance of high quality and that the application of the COTS device does not introduce new, unanalysed failure modes.

The publication provides a process for justification of digital COTS devices that may be used to guide the incorporation of these devices into the design of I&C systems important to safety, such that there is sufficient evidence to demonstrate that these products have adequate integrity to meet the requirements for their intended nuclear applications.

Download

Safety Demonstration of a Class 1 Smart Device

Authors

Sofia Guerra, Eoin Butler, Sam George

Details

10th International Conference on Nuclear Plant Instrumentation, Control, and Human-Machine Interface Technologies (NPIC & HMIT 2017), June 11-15, 2017, San Francisco, USA

Brief summary

Horizon Nuclear Power intends to build Advanced Boiling Water Reactors (ABWR) at Wylfa and Oldbury in the UK, based on the Hitachi design. In accordance with UK policy for new nuclear build, Hitachi, as the reactor designer, is the requesting party to the Generic Design Assessment (GDA) during which the reactor design will be reviewed by the Office for Nuclear Regulation (ONR) and the Environment Agency. This paper describes the scope, criteria, process, and approach for the safety class 1 (SC1) pilot study and summarizes the results of the study.

Download

V&V Techniques for FPGA-Based I&C Systems – How Do They Compare with Techniques for Microprocessors?

Authors

Sam George and Sofia Guerra

Details

10th International Conference on Nuclear Plant Instrumentation, Control, and Human-Machine Interface Technologies (NPIC & HMIT 2017), June 11-15, 2017, San Francisco, USA

Brief summary

We compare verification and validation (V&V) techniques for FPGA and microprocessor-based instrumentation and control (I&C) systems from the point of view of standards compliance, an approach based on behavioural properties, and the analysis of vulnerabilities. We found that the non-technology-specific elements of the standards considered are very similar. Differences are more marked when considering behavioural properties and vulnerabilities: the amount of effort required and confidence level obtained depend on a number of properties of the particular design under verification.

Download

Security-Informed Safety: Integrating Security Within the Safety Demonstration of Smart Device

Authors

Robin Bloomfield, Eoin Butler, Sofia Guerra and Kate Netkachova

Details

10th International Conference on Nuclear Plant Instrumentation, Control, and Human-Machine Interface Technologies (NPIC & HMIT 2017), June 11-15, 2017, San Francisco, USA

Brief summary

In this paper we discuss the impact of integrating security when developing a safety demonstration of a smart device. A smart device is an instrument, device or component that contains a microprocessor (and therefore contains both hardware and software) and is programmed to provide specialised capabilities, often measuring or controlling a process variable. Examples of smart devices include radiation monitors, relays, turbine governors, uninterruptible power supplies and heating ventilation, and air conditioning controllers.

Download

Deriving a frequentist conservative confidence bound for probability of failure per demand for systems with different operational and test profiles

Authors

Peter Bishop and Andrey Povyakalo

Details

Reliability Engineering & System Safety, 158 (2017), pp. 246–253

Brief summary

Reliability testing is typically used in demand-based systems (such as protection systems) to derive a confidence bound for a specific operational profile. To be realistic, the number of tests for each class of demand should be proportional to the demand frequency of that class. In practice, however, the actual operational profile may differ from the one used during testing.

This paper provides a means for estimating the confidence bound when the test profile differs from the profile used in actual operation. Based on this analysis, the paper examines what bound can be claimed for different types of profile uncertainty and options for dealing with this uncertainty.
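
As an illustration of why a conservative rescaling is possible (our sketch, not necessarily the paper's exact construction): if f_i is the unknown failure probability for demand class i, the pfd under the test profile t is sum(t_i * f_i) and under the operational profile q is sum(q_i * f_i), so the operational pfd can exceed the demonstrated bound by at most the worst ratio between the two profiles:

    # Conservative rescaling of a tested pfd bound to a different
    # operational profile; assumes every demand class that occurs in
    # operation (q_i > 0) was exercised in test (t_i > 0).
    def rescale_bound(test_profile, op_profile, pfd_test_bound):
        ratio = max(q / t for q, t in zip(op_profile, test_profile) if q > 0)
        return ratio * pfd_test_bound

    # Example: one demand class is twice as frequent in operation as in
    # test, so the claimable bound degrades by a factor of two.
    print(rescale_bound([0.50, 0.25, 0.25], [0.25, 0.50, 0.25], 1e-3))  # 2e-3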

Download

The Risk Assessment of ERTMS-Based Railway Systems from a Cyber Security Perspective: Methodology and Lessons Learned

Authors

R Bloomfield, M Bendele, P Bishop, R Stroud, S Tonks

Details

In Proceedings of First International Conference on Reliability, Safety and Security of Railway Systems, RSSRail 2016, Paris, France, June 28-30, 2016

Brief summary

The impact that cyber issues might have on the safety and resilience of railway systems has been studied for more than five years by industry specialists and government agencies.

This paper presents some of the work done by Adelard in this area, ranging from an analysis of potential vulnerabilities in the ERTMS specifications through to a high-level cyber security risk assessment of a national ERTMS implementation and detailed analysis of particular ERTMS systems on behalf of the GB rail industry. The focus of the paper is on our overall methodology for security-informed safety and hazard analysis. Lessons learned are presented, but our detailed results remain proprietary or sensitive and cannot be published.

Additional information

This paper is published by Springer as a chapter in Volume 9707 of the series Lecture Notes in Computer Science. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-33951-1_1

Download

Modeling the Impact of Testing on Diverse Programs

Authors

Peter Bishop

Details

International Conference on Computer Safety, Reliability, and Security (SAFECOMP 2015), pp. 297-309

Brief summary

This paper presents a model of diverse programs that assumes there are a common set of potential software faults that are more or less likely to exist in a specific program version. Testing is modeled as a specific ordering of the removal of faults from each program version. Different models of testing are examined where common and diverse test strategies are used for the diverse program versions. Under certain assumptions, theory suggests that a common test strategy could leave the proportion of common faults unchanged, while diverse test strategies are likely to reduce the proportion of common faults. A review of the available empirical evidence gives some support to the assumptions made in the fault-based model. We also consider how the proportion of common faults can be related to the expected reliability improvement.
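
The flavour of the model can be conveyed with a toy Monte Carlo simulation (ours, with invented parameters; the paper's treatment is analytical):

    import random

    # A pool of potential faults is sampled independently into two versions;
    # "testing" removes a budget of faults from each version, either in one
    # shared order (common test strategy) or in independent random orders
    # (diverse test strategies).
    random.seed(1)
    N_FAULTS, P_PRESENT, BUDGET, RUNS = 40, 0.3, 8, 2000

    def remaining_overlap(common_order: bool) -> float:
        shared = total = 0
        for _ in range(RUNS):
            a = {f for f in range(N_FAULTS) if random.random() < P_PRESENT}
            b = {f for f in range(N_FAULTS) if random.random() < P_PRESENT}
            order_a = random.sample(range(N_FAULTS), N_FAULTS)
            order_b = order_a if common_order else random.sample(range(N_FAULTS), N_FAULTS)
            for order, version in ((order_a, a), (order_b, b)):
                for f in [g for g in order if g in version][:BUDGET]:
                    version.discard(f)
            shared += len(a & b)
            total += len(a) + len(b)
        return 2 * shared / total   # proportion of remaining faults shared

    print(remaining_overlap(True))    # common strategy: proportion stays ~0.3
    print(remaining_overlap(False))   # diverse strategies: proportion drops

In line with the theory, a shared test order tends to remove the same faults from both versions, leaving the proportion of common faults roughly unchanged, whereas independent orders convert shared faults into version-specific ones.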

Download

Justifying Digital COTS Components when Compliance Cannot be Demonstrated – The Cogs Approach

Authors

S. Guerra, N. Chozos, D. Sheridan

Details

9th International Conference on Nuclear Plant Instrumentation, Control & Human-Machine Interface Technologies (NPIC & HMIT 2015), Charlotte, North Carolina

Brief summary

This paper describes a claim-based approach to the justification of COTS components (called Cogs) that was developed in a project sponsored by the UK nuclear industry. The Cogs approach is based on a set of top-level claims that remain the same for the different components but which allows for different types of evidence to be used to support specific COTS products. This allows greater flexibility in making a justification while ensuring that all safety relevant attributes of the COTS are justified.

Download

Why are I&C Modernisations So Difficult? Experiences with Requirements Engineering and Safety Demonstration in Swedish NPPs

Authors

S. Guerra, C. Menon

Details

9th International Conference on Nuclear Plant Instrumentation, Control & Human-Machine Interface Technologies (NPIC & HMIT 2015), Charlotte, North Carolina

Brief summary

Several I&C modernisation projects have encountered issues and difficulties resulting in delays and overspend. This paper describes the work we have done to identify the main issues experienced in I&C modernisation projects, and the lessons learnt during these projects. For this, we conducted a number of interviews at Swedish nuclear plants, focusing on the demonstration of safety and requirements engineering. The paper discusses the findings from our interviews.

Download

Understanding, assessing and justifying I&C systems using Claims, Arguments and Evidence

Authors

S Guerra

Details

In Nuclear Safety and Simulation, Volume 5, Number 4, December 2014, pages 291-298.

Brief summary

I&C systems important to safety need to be demonstrably safe. Usually this is done by demonstrating compliance with relevant standards. This paper argues that compliance is not necessarily enough, and suggests using a claim-based approach to understand, assess and justify the safety of I&C systems.

Download

Building Blocks for Assurance Cases

Authors

Bloomfield, R.E. and Netkachova, K.

Details

Paper presented at the International Symposium on Software Reliability Engineering (ISSRE), 3-6 November 2014, Naples, Italy.

Brief summary

The paper introduces an approach to structuring assurance cases using specially-designed CAE building blocks. The blocks are derived from an empirical analysis of real case structures and can standardise the presentation of assurance cases by simplifying their architecture. CAE building blocks might also increase the precision and efficiency of the claims and arguments, and can be used as self-contained reusable components of formal and semi-formal assurance cases.

Download

Estimating Worst Case Failure Dependency with Partial Knowledge of the Difficulty Function

Authors

Peter Bishop and Lorenzo Strigini

Details

International Conference on Computer Safety, Reliability, and Security (SAFECOMP 2014), pp. 186-201.

Brief summary

For systems using software diversity, well-established theories show that the expected probability of failure on demand (pfd) for two diverse program versions failing together will generally differ from what it would be if they failed independently. This is explained in terms of a “difficulty function” that varies between demands on the system. This theory gives insight, but no specific prediction unless we have some means to quantify the difficulty function.

This paper presents a theory leading to a worst case measure of “average failure dependency” between diverse software, given only partial knowledge of the difficulty function. It also discusses the possibility of estimating the model parameters, with one approach based on an empirical analysis of previous systems implemented as logic networks, to support pre-development estimates of expected gain from diversity. The approach is illustrated using a realistic safety system example.
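
To see how partial knowledge can bound dependency (a minimal reading of the idea, not the paper's full model): under the Eckhardt-Lee model, a randomly chosen pair of versions fails together on a random demand with probability E[theta^2], where theta(x) is the difficulty function. Since theta(x)^2 <= theta_max * theta(x) for every demand x, knowing only the mean difficulty q = E[theta] and a pointwise cap theta_max already yields a worst case:

    # Worst-case pair failure probability from partial knowledge of the
    # difficulty function: E[theta^2] <= theta_max * E[theta].
    def worst_case_pair_pfd(mean_difficulty: float, theta_max: float) -> float:
        return theta_max * mean_difficulty

    q, theta_max = 1e-4, 1e-2   # illustrative values, not from the paper
    print(worst_case_pair_pfd(q, theta_max))   # 1e-6 worst case
    print(q * q)                               # 1e-8 under independence
    # Worst-case dependency ratio relative to independence: theta_max / q = 100.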

Download

Compliance with Standards or Claim-based Justification? The Interplay and Complementarity of the Approaches for Nuclear Software-based Systems

Authors

S. Guerra and D. Sheridan

Details

Proceedings of the Twenty-second Safety-critical Systems Symposium, Brighton, UK, 4-6 February 2014

Brief summary

In the past, safety justifications tended to be standards-based – compliance to accepted practice was deemed to imply adequate safety. Over the last 20 years, there has been a trend towards an explicit claim-based approach, where specific safety claims are supported by arguments and evidence at progressively more detailed levels.

This paper discusses software-based systems with only a modest integrity requirement, and the interplay of the two approaches. It describes our experience with justifying such systems for the nuclear industry, and it claims that there are a number of benefits of taking both approaches together.

Download

Interpreting ALARP

Authors

C Menon, R Bloomfield, T Clement

Details

In Proceedings of the 8th IET Systems Safety Conference, 2013

Brief summary

This paper explores some of the common difficulties in interpreting the ALARP principle, and traces the potential effects of these difficulties on system risk. We introduce two categories of risk reduction approach which permit us to characterise the risk profile of a system in more detail and discuss their application to Systems of Systems (SoS).

Download

Combining testing and proof to gain high assurance in software: A case study

Authors

P Bishop, R Bloomfield, L Cyra

Details

In Proceedings of the IEEE International Symposium on Software Reliability Engineering (ISSRE 2013), 4-7 Nov 2013, Pasadena, pp. 248-257

Brief summary

There are potential benefits in combining static analysis and testing because the results obtained can be more general than standalone dynamic testing but less resource-intensive than standalone static analysis. This paper presents a specific example of this approach applied to the verification of continuous monotonic functions. This approach combines a monotonicity analysis with a defined set of tests to demonstrate the accuracy of a software function over its entire input range. Unlike “standalone” dynamic methods, our approach provides full coverage, and guarantees a maximal error.

We present a case study of the application of our approach to the analysis and testing of the software-implemented transfer function in a smart sensor. This demonstrated that relatively low levels of effort were needed to apply the approach. We conclude by discussing future developments of this approach.

Download

Security-Informed Safety: If It's Not Secure, It's Not Safe

Authors

R Bloomfield, K Netkachova, R Stroud

Details

In Proceedings of 5th International Workshop on Software Engineering for Resilient Systems (SERENE 2013), Kiev, Ukraine, Oct 2013

Brief summary

Traditionally, safety and security have been treated as separate disciplines, but this position is increasingly becoming untenable and stakeholders are beginning to argue that if it’s not secure, it’s not safe. In this paper we present some of the work we have been doing on “security-informed safety”. Our approach is based on the use of structured safety cases and we discuss the impact that security might have on an existing safety case. We also outline a method we have been developing for assessing the security risks associated with an existing safety system such as a large-scale critical infrastructure. 

Download

Does Software have to be Ultra Reliable in Safety Critical Systems?

Authors

P Bishop

Details

In Proceedings of Safecomp 2013, Toulouse, pp. 118-129, Sept 2013

Brief summary

This paper argues that higher levels of safety performance can be claimed by taking account of: 1) external mitigation to prevent an accident; 2) the fact that software is corrected once failures are detected in operation. A model based on these concepts is developed to derive an upper bound on the number of expected failures and accidents under different assumptions about fault fixing, diagnosis, repair and accident mitigation. A numerical example is used to illustrate the approach. The implications and potential applications of the theory are discussed.

Download

HARMONICS EU FP7 Project on the Reliability Assessment of Modern Nuclear I&C Software

Authors

J Holmberg, S Guerra, N Thuy, J Martz, B Liwang

Details

In Proceedings of the 8th International Topical Meeting on Nuclear Plant Instrumentation, Control and Human-Machine Interface Technologies (NPIC & HMIT), 2012

Brief summary

This paper discusses the HARMONICS EU FP7 project on the reliability assessment of modern nuclear I&C software.

Download

Justification of a FPGA-Based System Performing a Category C Function: Development of the Approach and Application to a Case Study

Authors

S Guerra, D Sheridan

Details

In Proceedings of the 8th International Topical Meeting on Nuclear Plant Instrumentation, Control and Human-Machine Interface Technologies (NPIC & HMIT), 2012

Brief summary

Field Programmable Gate Arrays (FPGAs) have been gaining interest in the nuclear industry for a number of years. Their simplicity compared to microprocessor-based platforms is expected to simplify the licensing approach, and therefore reduce licensing project risks compared to software based solutions. However, few safety-related applications have been licensed in the nuclear industry; those that have are typically safety applications at Category A, and work on standardizing the licensing approach has been focused on this category.

This paper presents work currently being performed on the justification of an FPGA that performs a Category C function, i.e., a function of the lowest safety category. The FPGA is part of the system monitoring vibration of the gags of the fuel assembly in one of the UK nuclear plants. Part of this work involves developing an approach for the justification which is consistent with the UK nuclear regulatory framework and commensurate with the safety category of the function performed. We draw on a number of standards, including those for software performing a function of similar criticality. However, evidence that the design and verification of the system followed a well-structured development process does not provide direct evidence that the system achieves the required behavior. Therefore, the approach also considers behavioral attributes that are important for the system, using a goal-based approach. This is complemented by a risk-informed approach, in which postulated hazards are evaluated to ensure they have been addressed and any remaining vulnerabilities of the system mitigated. 

Download

Safety Justification Frameworks: Integrating Rule-Based, Goal-Based, and Risk-Informed Approaches

Authors

P Bishop, R Bloomfield, S Guerra, N Thuy

Details

In Proceedings of the 8th International Topical Meeting on Nuclear Plant Instrumentation, Control and Human-Machine Interface Technologies (NPIC & HMIT), 2012

Brief summary

The reliability and safety of the digital I&C systems that implement safety functions are critical issues. In particular, software defects could result in common cause failures that defeat redundancy and defence-in-depth mechanisms. Unfortunately, the differences in current safety justification principles and methods for digital I&C restrict international co-operation and hinder the emergence of widely accepted best practices. These differences also prevent cost sharing and reduction, and unnecessarily increase licensing uncertainties, thus creating a very difficult operating environment for utilities, vendors and regulatory bodies. The European project HARMONICS (Harmonised Assessment of Reliability of MOdern Nuclear I&C Software) is seeking to develop a more harmonised approach to the justification of software-based I&C systems important to safety.

This paper outlines the justification framework we intend to develop in HARMONICS. It will integrate three strategies commonly used in safety justifications of an I&C system and its software: rule-based evidence of compliance with accepted standards; goal-based evidence that the intended behaviour and other claimed properties have been achieved; and risk-informed evidence that unintended behaviour is unlikely. The paper presents general forms of safety case that can be adapted to a variety of specific topics.

Download

Toward a Formalism for Conservative Claims about the Dependability of Software-Based Systems

Authors

PG Bishop, RE Bloomfield, B Littlewood, A Povyakalo, DR Wright

Details

IEEE Transactions on Software Engineering, Vol. 37, No. 5, pp. 708-717, Sept/Oct 2011

Brief summary

Here, we consider a simple case where an expert makes a claim about the probability of failure on demand (pfd) of a subsystem of a wider system and is able to express his confidence about that claim probabilistically. An important, but difficult, problem then is how such subsystem (claim, confidence) pairs can be propagated through a dependability case for a wider system, of which the subsystems are components. An informal way forward is to justify, at high confidence, a strong claim, and then, conservatively, only claim something much weaker: e.g., if I am 99 percent confident that the pfd is less than 0.00001, it is reasonable to be 100 percent confident that it is less than 0.001. In this paper, we provide formal support for such reasoning.
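
One elementary instance of such conservative reasoning (the paper develops a much more general formalism) is to assume the worst, pfd = 1, in the residual 1 - c of cases not covered by the confidence c:

    # Conservative use of a (claim, confidence) pair "pfd < bound with
    # confidence c": assume pfd = 1 whenever the claim fails to hold.
    def conservative_expected_pfd(bound: float, confidence: float) -> float:
        return confidence * bound + (1.0 - confidence) * 1.0

    # 99% confidence that pfd < 1e-5 leaves the unconfident 1% dominating:
    print(conservative_expected_pfd(1e-5, 0.99))   # ~0.01

This shows why the weaker claim that can be asserted with near certainty is often orders of magnitude above the strongly-supported one.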

Download

Diversity for Security: a Study with Off-The-Shelf AntiVirus Engines

Authors

P Bishop, R Bloomfield, I Gashi, V Stankovic

Details

In Proceedings of ISSRE 2011, Hiroshima, Japan

Brief summary

In this paper we present an empirical analysis using a known set of software viruses to explore the detection gains that can be achieved from using more diversity (i.e. more than two AntiVirus products), how diversity may help to reduce the “at risk time” of a system, and a preliminary model fitting using the hyper-exponential distribution.

Download

Assessment and Qualification of Smart Sensors

Authors

S Guerra, P Bishop, R Bloomfield, D Sheridan

Details

In Proceedings NPIC/HMIT 2010, Las Vegas, USA, 2010

Brief summary

This paper describes research work done on approaches to justifying smart instruments, and in particular, how some of this research has successfully been applied to the safety substantiation of such instruments. 

Download

Overcoming Non-determinism in Testing Smart Devices: A Case Study

Authors

PG Bishop, L Cyra

Details

In Proceedings SAFECOMP 2010, Vienna, pp. 237-250, 2010

Brief summary

Non-determinism can arise due to inaccuracy in an analogue measurement made by the device when two alternative actions are possible depending on the measured value. This non-determinism makes it difficult to predict the output values that are expected from a test sequence of analogue input values. The paper presents two approaches to dealing with this difficulty: (1) based on avoidance of test values that could have multiple responses, (2) based on consideration of all possible interpretations of input data. 
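
A sketch of approach (2), with an invented trip function and tolerance (the paper's devices and figures differ):

    TOLERANCE = 0.05     # assumed analogue measurement inaccuracy
    TRIP_LEVEL = 10.0

    def spec_action(true_value: float) -> str:
        return "trip" if true_value >= TRIP_LEVEL else "normal"

    def acceptable_outputs(test_input: float) -> set:
        """All outputs consistent with some reading within the tolerance band.
        For a monotonic threshold function the two band edges suffice."""
        return {spec_action(test_input - TOLERANCE),
                spec_action(test_input + TOLERANCE)}

    def oracle(test_input: float, observed: str) -> bool:
        return observed in acceptable_outputs(test_input)

    print(oracle(10.01, "normal"))   # True: a reading of 9.96 is a valid interpretation
    print(oracle(11.00, "normal"))   # False: every interpretation must trip

Approach (1) corresponds to simply excluding test inputs, such as 10.01 above, for which acceptable_outputs returns more than one action.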

Download

An Approach to Using Non Safety-Assured Programmable Components in Modest Integrity Systems

Authors

PG Bishop, N Chozos, K Tourlas

Details

In Proceedings SAFECOMP 2010, Vienna, pp. 377–390, 2010

Brief summary

There is a problem in justifying the use of programmable components if the components have not been safety justified to an appropriate integrity (e.g. to SIL 1 of IEC 61508). This paper outlines an approach (called LowSIL) developed in the UK CINIF nuclear industry research programme to justify the use of non safety-assured programmable components in modest integrity systems. 

Download

Infrastructure interdependency analysis: Introductory research review

Authors

R. Bloomfield, N. Chozos, and P. Nobles

Details

Adelard document reference: D/422/12101/4, 2009.

Brief summary

This paper presents an introductory review of research in infrastructure interdependency modelling and analysis. In particular, it focuses on network models, interdependency analysis, infrastructure models, simulation under federation and visualization. 

Download

Infrastructure interdependency analysis: Requirements, capabilities and strategy

Authors

R. Bloomfield, N. Chozos, and P. Nobles

Details

Adelard document reference: D/418/12101/3, issue 1, 2009.

Brief summary

This paper assesses the technical and commercial feasibility of developing tools and services for analysing interdependencies between infrastructures, particularly information infrastructures, and the associated risks, and of establishing “interdependency analysis” as a distinct and recognisable service supported by tools and data.

Download

Reliability Modeling of a 1-Out-Of-2 System: Research with Diverse Off-The-Shelf SQL Database Servers

Authors

P Bishop, I Gashi, B Littlewood, D Wright

Details

In Proceedings of the 18th IEEE International Symposium on Software Reliability Engineering (ISSRE 2007). 5-9 of November, Trollhättan, Sweden. 2007, pp. 49-58

Brief summary

This paper discusses two methods for modelling the reliability growth of a fault-tolerant database constructed from diverse database servers.

Download

Measuring Hazard Identification

Authors

P R Caseley, Sofia Guerra and Peter Froome

Details

In Proceedings of the 1st IET International Conference on System Safety, pp.23-28, 6-8 June 2006, London, UK.

Brief summary

This paper discusses an experiment that measured the effectiveness of a hazard identification process used to support safety in a Defence Standard 00-56 project. The experimental case study used a Ministry of Defence (MOD) project that simultaneously assessed two potential suppliers competing for an MOD equipment contract. The UK MOD Corporate Research Programme funded the comparison work, and the MOD Integrated Project Team funded the project itself, which included each contractor's project safety processes.

Download

Justification of smart sensors for nuclear applications

Authors

Peter Bishop, Robin Bloomfield, Sofia Guerra and Kostas Tourlas.

Details

In Proceedings SAFECOMP 2005, 28-30 September, Fredrikstad, Norway, 2005 (c) Springer Verlag.

Brief summary

This paper describes the results of a research study sponsored by the UK nuclear industry into methods of justifying smart sensors. Smart sensors are increasingly being used in the nuclear industry; they have potential benefits such as greater accuracy and better noise filtering, and in many cases their analogue counterparts are no longer manufactured. However, smart sensors (as is the case for most COTS) are sold as black boxes, despite the fact that their safety justification might require knowledge of their internal structure and development process. The study covered both the management aspects of interacting with manufacturers to obtain the information needed, and the technical aspects of designing an appropriate safety justification approach and assessing the feasibility of a range of technical analyses. The analyses performed include the methods we presented at Safecomp 2002 and 2003.

Download

Independent Safety Assessment of Safety Arguments

Authors

Peter Froome

Details

In Proceedings Safety-critical Systems Symposium, Southampton, UK, 8-10 February 2005 © Springer-Verlag

Brief summary

The paper describes the role of the Independent Safety Auditor (ISA) as currently carried out in the defence and other sectors in the UK. It outlines the way the ISA role has developed over the past 15–20 years with the changing regulatory environment. The extent to which the role comprises audit, assessment or advice is a source of confusion, and the paper clarifies this by means of some definitions and by elaborating the tasks involved in scrutinising the safety argument for the system. The customers and interfaces for the safety audit are described, and pragmatic means for assessing the competence of ISAs are presented.

Download

Software and SILs

Authors

P.G. Bishop

Details

Safety Critical Systems Club Newsletter, Jan 2005

Brief summary

This short article for the UK Safety Critical Systems Club Newsletter suggests an alternative interpretation of the SIL concept for software. 

Download

Application of a Commercial Assurance Case Tool to Support Software Certification Services

Authors

Luke Emmet, Sofia Guerra

Details

SoftCeMent 05 (Software Certificate Management 2005) workshop at the 20th IEEE/ACM International Conference on Automated Software Engineering

Brief summary

This short paper for the SoftCeMent 05 workshop presents an approach to delivering a range of software certification processes based on the commercial assurance case tool, ASCE. 

Download

An Exploration of Software Faults and Failure Behaviour in a Large Population of Programs

Authors

M.J.P. van der Meulen, P.G. Bishop and M. Revilla

Details

ISSRE 04, St Malo, France, 2-5 Nov 2004

Brief summary

A large part of software engineering research suffers from a major problem: there are insufficient data to test software hypotheses, or to estimate parameters in their models. To obtain statistically significant results, a large set of programs is needed, each set comprising many programs built to the same specification. We have gained access to such a large body of programs (written in C, C++, Java or Pascal), and in this paper we present the results of an exploratory analysis of around 29,000 C programs written to a common specification.

The objectives of this study were to:

  • characterise the types of fault that are present in these programs
  • characterise how programs are debugged during development
  • assess the effectiveness of diverse programming.

The findings are discussed, together with the potential limitations on the realism of the findings.

Download

An Empirical Exploration of the Difficulty Function

Authors

Julian G W Bentley, Peter G Bishop, Meine van der Meulen

Details

SAFECOMP 2004, 21-24 September 2004, Potsdam, Germany, pp. 60-71

Brief summary

The theory developed by Eckhardt and Lee (and later extended by Littlewood and Miller) utilises the concept of a "difficulty function" to estimate the expected gain in reliability of fault tolerant architectures based on diverse programs. The "difficulty function" is the likelihood that a randomly chosen program will fail for any given input value. To date this has been an abstract concept that explains why dependent failures are likely to occur.

This paper presents an empirical measurement of the difficulty function based on an analysis of over six thousand program versions implemented to a common specification. The study derived a "score function" for each version. It was found that several different program versions produced identical score functions, which when analysed, were usually found to be due to common programming faults. The score functions of the individual versions were combined to derive an approximation of the difficulty function. For this particular (relatively simple) problem specification, it was shown that the difficulty function derived from the program versions was fairly flat, and the reliability gain from using multi-version programs would be close to that expected from the independence assumption.
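
The construction is easy to picture with toy data (ours; the study used thousands of real versions). Each row below is one version's score function, with 1 marking a failed demand:

    results = [
        [0, 0, 1, 0],   # version 1 fails only demand 3
        [0, 0, 1, 0],   # version 2: identical score function (common fault)
        [0, 1, 0, 0],
        [0, 0, 0, 0],   # a correct version
    ]
    n_versions, n_demands = len(results), len(results[0])

    # Estimated difficulty function: fraction of versions failing each demand.
    theta = [sum(row[j] for row in results) / n_versions
             for j in range(n_demands)]
    print(theta)   # [0.0, 0.25, 0.5, 0.0]

    # Eckhardt-Lee: a random pair fails together with probability E[theta^2],
    # which equals E[theta]^2 only if the difficulty function is flat.
    mean_theta = sum(theta) / n_demands
    mean_theta_sq = sum(t * t for t in theta) / n_demands
    print(mean_theta ** 2, mean_theta_sq)   # independence vs. actual

A flat difficulty function makes the two printed values coincide, which is why the paper's finding of a fairly flat function implies a reliability gain close to the independence assumption.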

Download

The future of goal-based assurance cases

Authors

P.G. Bishop, Robin Bloomfield, Sofia Guerra

Details

In Proceedings of Workshop on Assurance Cases. Supplemental Volume of the 2004 International Conference on Dependable Systems and Networks, pp. 390-395, Florence, Italy, June 2004.

Brief summary

Most regulations and guidelines for critical systems require a documented case that the system will meet its critical requirements, which we call an assurance case. Increasingly, the case is made using a goal-based approach, where claims are made (or goals are set) about the system and arguments and evidence are presented to support those claims. In this paper we describe Adelard's approach to safety cases in particular, and assurance cases more generally, and discuss some possible future directions to improve frameworks for goal-based assurance cases.

Download

Estimating PLC logic program reliability

Authors

P.G. Bishop

Details

Safety Critical Systems Symposium, Birmingham, 17-19 February 2004

Brief summary

This paper applies earlier theoretical work to an industrial PLC logic example. The study required extensions to the previous theory to estimate the number of residual logic faults (N), and we show that the worst case bound theory is applicable.

Download

MC/DC based estimation and detection of residual faults in PLC logic networks

Authors

P.G. Bishop

Details

In Supplementary Proceedings fourteenth International Symposium on Software Reliability Engineering (ISSRE '03), Fast Abstracts, pp. 297-298, 17-20 November, Denver, Colorado, USA, 2003 (c) IEEE

Brief summary

Coverage measurement has previously been used to estimate residual faults in conventional program code. The basic idea is that the relationship between code covered and faults found is nearly linear, so it is possible to estimate the number of residual faults from the proportion of uncovered code. In this paper we apply the same concept to PLC logic networks rather than conventional program code, combined with a random test strategy designed to maximize coverage growth. This proved to be very efficient in detecting the known faults in an industrial logic example.

Download

Using a Log-normal Failure Rate Distribution for Worst Case Bound Reliability Prediction

Authors

P.G. Bishop, R.E. Bloomfield

Details

In Proceedings fourteenth International Symposium on Software Reliability Engineering (ISSRE '03), pp. 237-245, 17-20 November, Denver, Colorado, USA, 2003 (c) IEEE

Brief summary

Prior research has suggested that the failure rates of faults follow a log normal distribution. We propose a specific model where distributions close to a log normal arise naturally from the program structure. The log normal distribution presents a problem when used in reliability growth models as it is not mathematically tractable. However we demonstrate that a worst case bound can be estimated that is less pessimistic than our earlier worst case bound theory.

Download

Integrity Static Analysis of COTS/SOUP

Authors

P.G. Bishop, R.E. Bloomfield, T.P. Clement, A.S.L. Guerra and C.C.M. Jones

Details

In Proceedings SAFECOMP 2003, pp. 63-76, 21-25 Sep, Edinburgh, UK, 2003, (c) Springer Verlag

Brief summary

This paper describes the integrity static analysis approach developed to support the justification of commercial off-the-shelf software (COTS) used in a safety-related system. The static analysis was part of an overall software qualification programme, which also included the work reported in our paper presented at Safecomp 2002. The analysis addressed two main aspects: the internal integrity of the code (especially for the more critical functions), and the intra-component integrity, checking for covert channels. The analysis process was supported by an aggregation of tools, combined and engineered to support the checks done and to scale as necessary. Integrity static analysis proved feasible for industrial-scale software and did not require unreasonable resources, and we provide data that illustrates its contribution to the software qualification programme.

Download

Learning from incidents involving E/E/PE systems

Authors

P.G. Bishop, R.E. Bloomfield, L.O. Emmet

Details

In Proceedings Thirteenth International Symposium on Software Reliability Engineering (ISSRE '02), November 12-15, Annapolis, Maryland, USA, 2002, (c) IEEE

Brief summary

The UK Health and Safety Executive (HSE) commissioned a research study into methods of learning from incidents involving electrical, electronic and programmable electronic systems (E/E/PES). The approach is designed to comply with the IEC 61508 standard and to be suitable for organisations at different levels of maturity.

The three reports resulting from this work can be downloaded from the HSE web site.

Worst Case Reliability Prediction Based on a Prior Estimate of Residual Defects

Authors

P.G. Bishop, R.E. Bloomfield

Details

In Proceedings Thirteenth International Symposium on Software Reliability Engineering (ISSRE '02), November 12-15, Annapolis, Maryland, USA, 2002, (c) IEEE

Brief summary

In this paper we extend an earlier worst case bound reliability theory to derive a worst case reliability function R(t), which gives the worst case probability of surviving a further time t given an estimate of residual defects in the software and a prior test time T. The earlier theory and its extension are presented and the paper also considers the case where there is a low probability of any defect existing in the program. The implications of the theory are discussed and compared with alternative reliability models.
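
The flavour of the earlier worst-case theory can be shown in a few lines (a sketch under our simplifying assumptions, not the paper's extended derivation). A fault with unknown failure rate lam that has survived a test time T contributes lam * exp(-lam * T) to the expected failure rate afterwards; maximising over lam gives lam = 1/T, so N residual faults yield a worst-case rate of N/(e*T) whatever the individual rates are:

    import math

    def worst_case_rate(n_faults: int, t_test: float) -> float:
        """Worst-case failure rate after t_test units of failure-free use."""
        return n_faults / (math.e * t_test)

    # Confirm the single-fault maximum numerically over a grid of rates:
    T = 10_000.0
    grid = [i / (100 * T) for i in range(1, 1000)]
    grid_max = max(lam * math.exp(-lam * T) for lam in grid)
    print(grid_max, 1 / (math.e * T))   # both ~3.68e-5
    print(worst_case_rate(10, T))       # 10 faults: ~3.7e-4 per unit time

The reliability function R(t) in the paper refines this picture by tracking how the worst case evolves over a further operating period t.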

Download

Estimating Residual Faults from Code Coverage

Authors

P.G. Bishop

Details

In Proceedings SAFECOMP 2002, 10-13 September, Catania, Italy, 2002, (c) Springer Verlag

Brief summary

Many reliability prediction techniques require an estimate for the number of residual faults. In this paper, a new theory is developed for using test coverage to estimate the number of residual faults. This theory is applied to a specific example with known faults and the results agree well with the theory. The theory is used to justify the use of linear extrapolation to estimate residual faults. It is also shown that it is important to establish the amount of unreachable code in order to make a realistic residual fault estimate.
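
The resulting estimate is a simple linear extrapolation over reachable code only (illustrative numbers below, not from the paper):

    # If faults found grow roughly linearly with the fraction of reachable
    # code covered, total fault content can be extrapolated from coverage.
    def estimate_residual_faults(faults_found: int, covered_loc: int,
                                 total_loc: int, unreachable_loc: int) -> float:
        reachable = total_loc - unreachable_loc   # unreachable code excluded
        total_faults = faults_found * reachable / covered_loc
        return total_faults - faults_found

    # 20 faults found with 8,000 of 9,000 reachable lines covered
    # (10,000 lines in total, 1,000 unreachable): ~2.5 faults remain.
    print(estimate_residual_faults(20, 8_000, 10_000, 1_000))

Ignoring the 1,000 unreachable lines would inflate the estimate to 5 residual faults, which is why establishing the amount of unreachable code matters.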

Download

Rescaling Reliability Bounds for a New Operational Profile

Authors

P.G. Bishop

Details

In Proceedings, International Symposium on Software Testing and Analysis (ISSTA 2002), ACM Software Engineering Notes, Vol 27 No. 4, pp 180-190, Rome, Italy, 22-24 July, 2002, (c) ACM

Brief summary

One of the main problems with reliability testing and prediction is that the result is specific to a particular operational profile. This paper extends an earlier reliability theory for computing a worst case reliability bound. The extended theory derives a re-scaled reliability bound based on the change in execution rates of the code segments in the program. In some cases it is possible to derive a maximum failure rate bound that applies to any change in the profile. It also predicts that (in principle) a fair test profile can be derived where the reliability bounds are relatively insensitive to the operational profile. In addition the theory allows unit and module test coverage measures to be incorporated into an operational reliability bound prediction. The implications of the theory are discussed, and the theory is evaluated by applying it to two example programs with known faults.

Download

Learning from incidents involving electrical/ electronic/ programmable electronic safety-related systems. Project outline.

Authors

Mark Bowell (HSE), George Cleland & Luke Emmet

Details

Workshop paper for the Workshop on the Investigation and Reporting of Incidents and Accidents (IRIA), 17-20 July 2002, The Senate Room, University of Glasgow

Brief summary

The UK Health and Safety Executive (HSE) has initiated a programme of work that will eventually provide guidance for those responsible on how to learn from their own incident data; a means for HSE to ensure that it has the best information attainable on incidents involving electrical/ electronic/ programmable electronic (E/E/PE) safety-related systems; and a stimulus to industry.

HSE has contracted a consortium, led by Adelard and also involving the Glasgow (University) Accident Analysis Group (GAAG) and Blacksafe Consulting, to carry out a 7-month interactive project that will:

  • identify and evaluate existing schemes for classifying causes from incident data and generating lessons to avoid recurrence of similar incidents
  • select and modify an existing scheme or schemes, or derive a new one, in order to create a method for analysing and classifying incident data to match the principles and activities of IEC 61508
  • test the new method using data from a small number of real incidents
  • identify and present the significant strengths and weaknesses of the proposed method and how it fits in with wider issues such as incident reporting, incident investigation and process improvement.

This project is part of HSE's longer-term programme to provide best advice in this field. The paper provides an outline of the project.

Download

Graphical Notations, Narratives and Persuasion: a Pliant Systems Approach to Hypertext Tool Design

Authors

Luke Emmet and George Cleland

Details

In Proceedings of ACM Hypertext 2002 (HT'02), June 11-15, 2002, College Park, Maryland, USA

Brief summary

The Adelard Safety Case Editor (ASCE) is a hypertext tool for constructing and reviewing structured arguments. ASCE is used in the safety industry, and can be used in many other contexts when graphical presentation can make argument structure, inference or other dependencies explicit. ASCE supports a rich hypertext narrative mode for documenting traditional argument fragments.

In this paper we document the motivation for developing the tool and describe its operation and novel features. Since usability and technology adoption issues are critical for software and hypertext tool uptake, our approach has been to develop a system that is highly usable and sufficiently "pliant" to support and integrate with a wide range of working practices and styles. We discuss some industrial application experience to date, which has informed the design and is informing future requirements. We draw from this some of the perhaps not so obvious characteristics of hypertext tools which are important for successful uptake in practical environments.

Download

Process Modelling to Support Dependability Arguments

Authors

Robin Bloomfield and Sofia Guerra

Details

In Proceedings of the International Conference on Dependable Systems and Networks (DSN 2002), Washington, DC, USA, June 2002.

Brief summary

This paper reports work to support dependability arguments about the future reliability of a product before there is direct empirical evidence. We develop a method for estimating the number of residual faults at the time of release from a "barrier model" of the development process, where in each phase faults are created or detected. These estimates can be used in a conservative theory in which a reliability bound can be obtained or can be used to support arguments of fault freeness. We present the work done to demonstrate that the model can be applied in practice. A company that develops safety-critical systems provided access to two projects as well as data over a wide range of past projects. The software development process as enacted was determined and we developed a number of probabilistic process models calibrated with generic data from the literature and from the company projects. The predictive power of the various models was compared.
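
A toy version of such a barrier model (invented rates, purely for illustration) makes the mechanics clear: each phase adds faults and each barrier removes a fraction of those present, so the residual fault estimate at release is whatever survives the last barrier:

    phases = [
        # (name,           faults introduced, detection probability)
        ("specification",  10,                0.0),
        ("design",         15,                0.5),
        ("coding",         25,                0.6),
        ("unit test",       0,                0.7),
        ("system test",     0,                0.8),
    ]

    faults = 0.0
    for name, introduced, p_detect in phases:
        faults = (faults + introduced) * (1.0 - p_detect)
        print(f"after {name:13s}: {faults:6.2f} faults remain")
    # The final figure feeds the conservative reliability bound theory.

In the study itself, the phase rates were calibrated with generic data from the literature and with the company's own project data.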

Download

The Practicalities of Goal-Based Safety Regulation

Authors

J Penny, A Eaton CAA (SRG), PG Bishop, RE Bloomfield (Adelard)

Details

Aspects of Safety Management: Proceedings of the Ninth Safety-Critical Systems Symposium Bristol, UK, 6-8 February 2001, Felix Redmill and Tom Anderson (eds.) London; New York: Springer, 2001 ISBN: 1-85233-411-8, pages 35-48

Brief summary

"Goal-based regulation" does not specify the means of achieving compliance but sets goals that allow alternative ways of achieving compliance, e.g. "People shall be prevented from falling over the edge of the cliff". In "prescriptive regulation" the specific means of achieving compliance is mandated, e.g. "You shall install a 1 meter high rail at the edge of the cliff". There is an increasing tendency to adopt a goal-based approach to safety regulation, and there are good technical and commercial reasons for believing this approach is preferable to more prescriptive regulation. It is however important to address the practical problems associated with goal-based regulation in order for it to be applied effectively.

This paper discusses the motivation for adopting a goal-based regulatory approach, and then illustrates its implementation by describing SW01, which forms part of the CAP 670 regulations for ground-based air traffic services (ATS). The potential barriers to the implementation of such standards are discussed, together with methods for addressing them.

Download

Use of SOUP in safety-related applications

Brief summary

The UK Health and Safety Executive (HSE) recently commissioned research from Adelard into how pre-existing software components may be safely used in safety-related programmable electronic systems in a way that complies with the IEC 61508 standard. Two reports resulted from this work and are now published on the HSE web site:

The first report summarises the evidence that is likely to be available in practice relating to a software component to assist in assessing the safety integrity of a safety function that depends on that component.

The second report considers how the available evidence can best be used within the framework of the IEC 61508 safety lifecycle to support an argument for the safety integrity achieved by a safety function.

The REVERE project: experiments with the application of probabilistic NLP to systems engineering

Authors

Paul Rayson, Luke Emmet, Roger Garside and Pete Sawyer

Details

In Bouzeghoub, M., Kedad, Z., and Metais, E. (eds.), Natural Language Processing and Information Systems: 5th International Conference on Applications of Natural Language to Information Systems (NLDB'2000), Versailles, France, June 2000, revised papers, LNCS 1959, Springer-Verlag, Berlin Heidelberg, pp. 288-300, ISBN 3-540-41943-8.

Brief summary

Despite natural language's well-documented shortcomings as a medium for precise technical description, its use in software-intensive systems engineering remains inescapable. This poses many problems for engineers who must derive problem understanding and synthesise precise solution descriptions from free text. This is true both for the largely unstructured textual descriptions from which system requirements are derived, and for more formal documents, such as standards, which impose requirements on system development processes. This paper describes experiments that we have carried out in the REVERE project to investigate the use of probabilistic natural language processing techniques to provide systems engineering support.

Download

The Development of a Commercial 'Shrink-Wrapped Application' to Safety Integrity Level 2: The DUST-EXPERT™ Story

Authors

Tim Clement, Ian Cottam, Peter Froome and Claire Jones

Details

Safecomp'99, Toulouse, France, Sept 1999. In Lecture Notes in Computer Science 1698, Springer 1999. ISBN 3-540-66488-, © Springer Verlag

Brief summary

We report on some of the development issues of a commercial "shrink-wrapped application" - DUST-EXPERT™ - that is of particular interest to the safety and software engineering community. Amongst other things, the following are reported on and discussed: the use of formal methods; advisory systems as safety related systems; safety integrity levels and the general construction of DUST-EXPERT's safety case; statistical testing checked by an "oracle" derived from the formal specification; and our achieved productivity and error density.

Download

Requirements for a Guide on the Development of Virtual Instruments

Authors

Luke Emmet and Peter Froome

Details

In Proceedings NMC 99: National Measurement Conference 99, Brighton, UK. © Adelard 1999

Brief summary

Adelard is producing a good-practice guide and training course on the development of virtual instruments as part of the DTI's Software Support for Metrology programme. This paper describes our requirements capture process and presents some of the principal issues that are emerging.

Download

The Formal Development of a Windows Interface

Authors

T. Clement

Details

3rd Northern Formal Methods Workshop, September 1998, Ilkley, UK, © Springer Verlag

Brief summary

This paper describes an approach to the use of the formal method VDM in the design and implementation of Microsoft Windows™ interfaces. This approach evolved during the development of Dust-Expert™, a Windows-based system for providing design advice on the prevention and control of dust explosions, developed for the Health and Safety Executive (HSE). The approach we have adopted is deliberately conservative: we have aimed to see how we can take guidance in the design of the system from the standard Vienna Development Method rather than inventing new language constructs or new proof obligations. One advantage of this is that we can continue to use the tools that are available for supporting the standard language.

Download

A Methodology for Safety Case Development

Authors

P G Bishop and R E Bloomfield

Details

Safety-critical Systems Symposium, Birmingham, UK, Feb 1998, © Adelard

Brief summary

A safety case is a requirement in many safety standards for computer systems, and it is important that an adequate safety case is produced. In regulated industries such as the nuclear industry, the need to demonstrate safety to a regulator can be a major commercial risk. This paper outlines a safety case methodology that seeks to minimise both safety and commercial risks by constructing a demonstrable safety case. The safety case ideas presented here were initially developed in European and UK research programmes and have subsequently been applied in industry. To implement the safety case we advocate the integration of safety case development into the design process, so that the costs and risks of the associated safety case can be included in the design trade-offs. We propose a layered structure that allows the safety case to evolve over time and helps to establish the safety requirements at each level. For large projects with sub-contractors, this "top-down" approach helps to identify the subsystem requirements, and the subsystem safety case can be made an explicit contractual deliverable for the sub-contractor.

Download

Using Reversible Computing to Achieve Fail-safety

Authors

P G Bishop

Details

ISSRE 97, Nov 1997, Albuquerque, New Mexico, USA. © IEEE Computer Society Press

Brief summary

This paper describes a fail-safe design approach that can be used to achieve a high level of fail-safety with conventional computing equipment which may contain design flaws.

The method is based on the well-established concept of "reversible computing". Conventional programs destroy information and hence cannot be reversed. However, it is easy to define a virtual machine that preserves sufficient intermediate information to permit reversal. Any program implemented on this virtual machine is inherently reversible. The integrity of a calculation can therefore be checked by reversing back from the output values and checking that the intermediate values and original input values are reproduced. By using different machine instructions on the forward and reverse paths, errors in any single instruction execution can be revealed.

Random corruptions in data values are also detected. An assessment of the performance of the reversible computer design for a simple reactor trip application indicates that it runs about ten times slower than a conventional software implementation and requires about 20 kilobytes of additional storage. The trials also show a fail-safe bias of better than 99.998% for random data corruptions, and it is argued that failures due to systematic flaws could achieve similar levels of fail-safe bias. Potential extensions and applications of the technique are discussed.
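
As a rough illustration of the idea (a minimal sketch, not the virtual machine described in the paper), a single addition can be checked by reversing it with a different instruction:

    def forward_add(a, b):
        # Forward path: compute the sum but preserve an operand, so the
        # step can be reversed (a conventional ADD destroys this information).
        return a + b, b

    def reverse_add(total, saved_b):
        # Reverse path: recover the original input using a *different*
        # instruction (subtraction), so a fault in either instruction
        # shows up as a mismatch.
        return total - saved_b

    def checked_add(a, b):
        total, saved = forward_add(a, b)
        if reverse_add(total, saved) != a:
            raise RuntimeError("fail-safe trip: forward/reverse mismatch")
        return total

    print(checked_add(2, 3))  # 5, with the calculation independently checked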

Download

Viewpoints on Improving the Standards Making Process: Document Factory or Consensus Management?

Authors

L O Emmet

Details

ISESS 97, Walnut Creek, USA

Brief summary

Emerging standards and guidelines need to be timely and to reflect the requirements of the industrial sector they are designed to support. However, the delay between the identification of a need for a standard and its eventual release is often too long. There is a need for a better understanding of the sources of delay and deadlock within the standards process.

In this paper we describe an application of PERE (Process Evaluation in Requirements Engineering) to the standards process. PERE provides an integrated process analysis that identifies improvement opportunities by considering process weaknesses and protections from both mechanistic and human factors viewpoints. The analysis identified both classical resource allocation problems and specific problems concerning the construction and management of consensus within a typical standards-making body, together with a number of process improvements that could be implemented. We conclude that consensus problems are the real barrier to timely standards production. Ironically, the present trend towards more distributed working and electronic support (via email etc.) may make the document factory aspect of standards production more efficient at the expense of consensus building.

Download

PERE: Evaluation and Improvement of Dependable Processes

Authors

Robin Bloomfield, John Bowers, Luke Emmet, Stephen Viller

Details

Safecomp 96, Vienna, Oct 96, Springer Verlag. © Springer Verlag

Brief summary

In the development of systems that have to be dependable, weaknesses in the requirements engineering (RE) process are highly undesirable. Such weaknesses may introduce undetected flaws into the system, or incur significant correction costs later in the development process. Typically, the RE process contains a number of individual and group activities and is thus particularly subject to weaknesses arising from human factors.

Our work has concerned the development of PERE (Process Evaluation in Requirements Engineering), which is a structured method for analysing processes for weaknesses and proposing process improvements against them. PERE combines two complementary viewpoints within its process evaluation approach. Firstly, a classical engineering analysis is used for process modelling and generic process weakness identification. This initial analysis is fed into the second analysis phase, in which those process components that are primarily composed of human activity, their interconnections and organisational context are subject to a systematic human factors analysis. In this paper we briefly describe PERE and provide examples of the application experience to date.

Download

A Conservative Theory for Long-Term Reliability Growth Prediction

Authors

P G Bishop and R E Bloomfield

Details

ISSRE 96, Oct 1996, White Plains, NY, USA (see also IEEE Trans. Reliability, Dec 1996), © IEEE Computer Society Press

Brief summary

While existing reliability growth theories employ a wide range of underlying models, the basic strategy is the same: to extrapolate future reliability from past failures. This approach works reasonably successfully over the short term but lacks predictive power over the long term (i.e. for usage times which are orders of magnitude greater than the current usage time).

This paper describes a different approach to reliability growth modelling which should enable conservative long-term predictions to be made. Using relatively standard assumptions it is shown that the expected value of the failure rate after a usage time T has an upper bound of N/(eT), where N is the initial number of faults and e is the exponential constant. This is conservative since it places a worst-case bound on the reliability rather than making a best estimate. It is shown that less pessimistic results can be obtained if additional assumptions are made about the distribution of failure rates over the N faults.

We also show that the predictions might be relatively insensitive to assumption violations over the longer term. The theory offers the potential for making long term software reliability growth predictions based solely on prior estimates of the number of residual faults (e.g. using the program size and other software development metrics). Some empirical evaluations of the theory have been made using a range of industrial and experimental reliability data and the results appear to agree with the predicted bound.
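
To make the bound concrete, here is a small worked example (the numbers are illustrative, not taken from the paper):

    from math import e

    def failure_rate_bound(N, T):
        # Conservative upper bound N/(e*T) on the expected failure rate
        # after usage time T, given N initial faults.
        return N / (e * T)

    # E.g. 100 residual faults after 10,000 hours of usage:
    print(f"{failure_rate_bound(100, 10_000):.2e} failures/hour")  # ~3.7e-03
    # The bound decays as 1/T, so doubling the usage time halves it:
    print(f"{failure_rate_bound(100, 20_000):.2e} failures/hour")  # ~1.8e-03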

Download

Data Reification Without Explicit Abstraction Functions

Authors

T. Clement

Details

FME'96, March 1996, Oxford, UK, © Springer Verlag

Brief summary

Data reification in VDM normally involves the explicit positing of an abstraction function with certain properties. However, the condition for one definition to reify another only requires that a function with such properties should exist. This suggests that it may be possible to carry through a data reification without giving an explicit definition of the abstraction function at all. This paper explores this possibility and compares it with the more conventional approach.
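
For readers unfamiliar with reification, the conventional approach can be loosely sketched outside VDM (the example is hypothetical): an abstract store modelled as a set is implemented by a duplicate-free list, and an explicit abstraction ("retrieve") function is posited whose commutation with each operation must then be proved. The paper's observation is that the reification condition only requires such a function to exist, so it may not need to be written down at all.

    # Abstract specification: a store of keys, modelled as a set.
    # Concrete design: a duplicate-free list (say, for ordered iteration).

    def retrieve(concrete):
        # Explicit abstraction ("retrieve") function from the concrete
        # representation back to the abstract state.
        return set(concrete)

    def abstract_insert(s, k):
        return s | {k}

    def concrete_insert(lst, k):
        return lst if k in lst else lst + [k]

    # The commutation property a conventional reification proof establishes:
    # retrieving after the concrete operation matches the abstract operation.
    lst, k = [3, 1], 2
    assert retrieve(concrete_insert(lst, k)) == abstract_insert(retrieve(lst), k)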

Download

The SHIP Safety Case

Authors

P G Bishop and R E Bloomfield

Details

SafeComp 95, Proc. 14th IFAC Conf. on Computer Safety, Reliability and Security (ed. G. Rabe), Belgirate, Italy, 11-13 October 1995, Springer, ISBN 3-540-19962-4. © Adelard

Brief summary

This paper presents a safety case approach to the justification of safety-related systems. It combines methods used for handling software design faults with approaches used for hazardous plant. The general structure of the safety argument is presented together with the underlying models for system failure that can be used as the basis for quantified reliability estimates. The approach is illustrated using plant and computer based examples.

Download

Software Fault Tolerance by Design Diversity

Authors

P G Bishop

Details

Software Fault Tolerance (ed. M. Lyu), Wiley, USA, 1995, © Wiley Press

Brief summary

N-version programming is vulnerable to common faults. It was thought that the primary source of common faults was ambiguities and omissions in the specification, but the Knight and Leveson experiment showed that failure independence of design faults cannot be assumed. This result is backed up by later experiments and by qualitative evidence from other experiments. In addition, an "error masking" mechanism is described that will cause failure dependency in almost all programs.

This catalogue of problems may paint too gloomy a picture of the potential for N-version programming: back-to-back testing can certainly help to eliminate design faults, and failure dependency only arises if a majority of versions are faulty. For small applications developed with good quality controls, the probability of having multiple design faults can be quite low, so N-version programming can be a useful safeguard against residual design faults.
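
The reliance on a majority of correct versions is easy to see in a minimal voting sketch (illustrative only, not taken from the chapter):

    from collections import Counter

    def n_version_vote(results):
        # Majority vote over the outputs of N independently developed
        # versions; take a fail-safe action if no majority exists.
        value, count = Counter(results).most_common(1)[0]
        if count <= len(results) // 2:
            raise RuntimeError("no majority: take fail-safe action")
        return value

    print(n_version_vote([42, 42, 41]))  # a single faulty version is out-voted
    print(n_version_vote([41, 41, 42]))  # a common fault in a majority wins the vote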

Download

Stepwise Development and Verification of a Boiler System Specification

Authors

Bishop, P.G., Bruns, G., Anderson, S.O.

Details

International Workshop on the Design and Review of Software Controlled Safety-related Systems, National Research Council, Ottawa, Canada, June 28-29, 1993. © Adelard

Brief summary

In attempting to demonstrate the safety of the Generic Boiler System, two main problems are faced. First, a wide range of possible failures can occur: the physical devices themselves can fail, sensors can fail, and sensed values can be delayed or lost in transmission. Taking careful account of all possible failures is difficult. A second problem, common to all safety-critical systems, is that absolute safety cannot be shown; one can only hope to demonstrate partial or probable safety. However, estimates of the probability of safety are hard to calculate, and it is hard to know whether one can place much confidence in them. The approach demonstrated here addresses both of these issues.

Our report has two parts. In Part I, the technique of stepwise elaboration of the boiler controller is demonstrated. In Part II, verification of safety and failure properties is shown for a boiler system model developed at a late step of elaboration.

Download

The Variation of Software Survival Times for Different Operational Input Profiles

Authors

Bishop, P.G.

Details

FTCS-23, Toulouse, June 22-24, 1993, IEEE Computer Society Press, ISBN 0-8186-3680-7, © IEEE Computer Society Press

Brief summary

This paper provides experimental and theoretical evidence for the existence of contiguous failure regions in the program input space ("blob" defects). For real-time systems where successive input values tend to be similar, blob defects can have a major impact on software survival time because the failure probability is not constant. For example, with a "random walk" input sequence, the probability of failure decreases as the time from the last failure increases. The key factors affecting the survival time are shown to be the input "trajectory", the rate of change of the input values and the "surface area" of the defect (rather than its volume). Large defects can exhibit very long mean times to failure when the rate of change of input values is decreased.
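
The influence of the rate of change of inputs on survival time can be mimicked with a toy one-dimensional simulation (entirely illustrative; the paper's analysis concerns real input spaces):

    import random

    def survival_time(defect_lo, defect_hi, step, seed=0):
        # Step a "random walk" input trajectory on [0, 1] until it enters
        # the contiguous failure region [defect_lo, defect_hi] (a "blob").
        rng = random.Random(seed)
        x, t = 0.9, 0
        while not (defect_lo <= x <= defect_hi):
            x = min(1.0, max(0.0, x + rng.uniform(-step, step)))
            t += 1
        return t

    # Same defect, slower-changing inputs -> much longer survival time:
    for step in (0.1, 0.01):
        print(step, survival_time(0.2, 0.3, step))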

Download