Reidar Conradi (Ed.):

Mini-glossary of software quality terms, with emphasis on safety

Version 3.0 of 4 June 2007, IDI, NTNU

Compiled by Reidar Conradi, Dept. of Computer and Information Science (IDI), Software Engineering (SU) Group – after a draft by PhD student Torgrim Lauritsen, IDI.

Initial comments:

§ Some of the terms have alternative sources for their definitions, marked by “Def.1)… Def.2)…”. If there is only one source, the “Def.1)” prefix is omitted. For some terms, the same source also has provided alternative definitions, marked as “1)… 2)…”.

§ Remarks marked “Comment: …” are used for clarification.

§ The source of each definition is given where possible, usually a standards document such as [IEEE SESC] and [ISO terms], or an acknowledged textbook or overview paper such as [Leveson95] or [Avizienis2004].

§ There may not be consensus about a term’s name and meaning, and some old definitions really look archaic.

Accident Def.1) An undesired and unplanned (but not necessarily unexpected) event that results in (at least) a specified level of loss [Leveson95]. Def.2) An unplanned event or series of events that results in death, injury, illness, environmental damage, or damage to or loss of equipment or property [IEEE 1228].

Availability Def.1) The degree to which a system or component is operational and accessible when required for use [IEEE 610.12]. Def.2) Readiness for correct service [Avizienis2004]. Comment: reliability then means that the requested functionality stays available. Availability is often expressed as the probability of being “on-line” or ready.
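The “probability of being on-line” mentioned in the comment is often estimated as the long-run ratio of uptime to total time. The sketch below assumes that a mean time to failure (MTTF) and a mean time to repair (MTTR) are known; the function name and the figures are illustrative and are not taken from the cited sources.

```python
def steady_state_availability(mttf_hours: float, mttr_hours: float) -> float:
    """Long-run fraction of time the system is ready: MTTF / (MTTF + MTTR)."""
    if mttf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTTF must be positive and MTTR non-negative")
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical figures: a failure every 999 hours on average, 1 hour to repair.
availability = steady_state_availability(999.0, 1.0)
print(f"{availability:.3f}")  # prints 0.999
```

Note that a system can be highly available and still unreliable, for example if it fails often but is repaired almost instantly.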

Business-critical That core computer and other support systems of a business have sufficient QoS to preserve the stability of the business [indirectly after Sommerville04].

Business-safe Def.1) That “relevant issues like reputation, employment practices, intellectual property, competition, supply chains, fraud and data security need to be considered” [Jolly03]. Def.2) That core computer and other support systems of a business are sufficiently safe, i.e. do not threaten the stability of the business [Own definition after Sommerville04]. Comment: subset of business-critical.

Computer system A system containing one or more computers and associated software [IEEE 610.12]. Comment: i.e. an information processing (or ICT) system.

Confidentiality Absence of unauthorized disclosure of information [Avizienis2004].

Cost Sum of all expenses incurred in making a piece of software or an entire computer system [Own definition].

Dependability The trustworthiness of a computing system which allows reliance to be justifiably placed on the service it delivers [Avizienis01]. Dependability is an integrating concept that encompasses the following attributes:

· Availability: readiness for correct service;

· Reliability: continuity of correct service;

· Safety: absence of catastrophic consequences on the user(s) and the environment;

· Security: the concurrent existence of (a) availability for authorized users only, (b) confidentiality, and (c) integrity.

In the later [Avizienis04], security is split off as a separate quality, and dependability is rephrased as:

· Availability: readiness for correct service;

· Reliability: continuity of correct service;

· Safety: absence of catastrophic consequences on the user(s) and the environment;

· Integrity: absence of improper system alterations;

· Maintainability: ability to undergo modifications and repairs.

Comment: How to measure dependability? Not defined in IEEE 610.12!

Efficiency The degree to which a system or component performs its designated functions with minimum consumption of resources [IEEE 610.12].

Error Def.1) That at least one (or more) internal state of the system deviates from the correct service state. The adjudged or hypothesized cause of an error is called a fault. In most cases, a fault first causes an error in the service state of a component that is a part of the internal state of the system and the external state is not immediately affected. … many errors do not reach the system’s external state and cause a failure [Avizienis04]. Def.2) The difference between a computed, observed, or measured value or condition and the true, specified, or theoretically correct value or condition. For example, a difference of 30 meters between a computed result and the correct result [IEEE 610.12]. Def.3) Any detected deviation between specification and implementation/expected-result [Popular usage], see below. Comment-1: consider the three most common usages of “error” by Google-based rankings, taken from http://www.softwaredevelopment.ca/bugs.shtml:

§ “An error has occurred” (~75,000 pages, vs. 68 for “defect”),

§ “Unknown error” (~57,700 pages, vs. 478 for “defect”), and

§ “Unrecoverable error” (~26,900 pages, vs. 3 for “defect”).

Comment-2: see explanation under fault. Another term: “active” error. Conclusion: total chaos in terminology, so try to avoid the term “error”.

Failure Def.1) The non-performance or inability of the system or component to perform its intended function for a specified time under specified environmental conditions. A failure may be caused by design flaws – the intended, designed and constructed behaviour does not satisfy the system goal [Leveson95]. Def.2) The inability of a system or component to perform its required function within specified performance requirements [IEEE 610.12]. Def.3) Since a service is a sequence of the system’s external states, a service failure means that at least one (or more) external state of the system deviates from the correct service state [Avizienis04]. Comment: Pr(failure) = 1 – reliability, where reliability is expressed as a probability. Other terms: malfunction, “externally visible” error.

Fault 1) A defect in a hardware device or component; for example, a short circuit or a broken wire. 2) An incorrect step, process, or data definition in a computer program [IEEE 610.12]. Shared comment for Fault, Error and Failure: … Faults can be internal or external of a system. … [taken from Avizienis04]. Contextual (dynamic) execution of a dormant (static) fault usually leads to an internal error, and possibly later to an external failure. The “incorrectness” of a fault means that it violates stated functional requirements. Other terms: “passive” or “dormant” error, defect, bug (but try to avoid the last one).
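The chain from dormant fault to internal error to external failure described in the shared comment can be illustrated with a small sketch. The tank controller, its thresholds, and all names below are hypothetical; they only show that an activated fault corrupts internal state, and that the corrupted state may or may not become visible at the service interface.

```python
def clamp_percent(value):
    """Intended behaviour: limit value to the range 0..100."""
    if value < 0:
        return 0
    if value > 100:
        return 10  # FAULT: incorrect data definition, should be 100
    return value


class TankController:
    def __init__(self):
        self.level_percent = 0  # internal state

    def update(self, raw_reading):
        self.level_percent = clamp_percent(raw_reading)

    def low_level_alarm(self):
        # External state: alarm while the tank is nearly empty.
        return self.level_percent < 5

    def pump_on(self):
        # External state: keep filling while the tank is under half full.
        return self.level_percent < 50


tank = TankController()
tank.update(80)   # the fault stays dormant: this input never reaches it
tank.update(120)  # the fault is activated: level_percent is 10, not 100 (error)
print(tank.low_level_alarm())  # False, as it should be: the error is masked
print(tank.pump_on())          # True, but should be False: a failure
```

The same internal error is harmless for one output and causes a failure on the other, which is why the three terms are worth keeping apart.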

FMEA Failure Mode and Effects Analysis (FMEA) is a risk assessment technique for systematically identifying potential failures in a system or a process [Wikipedia].

Functional safety Part of the overall safety relating to the EUC (Equipment under Control) and the EUC control system which depends on the correct functioning of the E/E/PE (Electrical/Electronic/Programmable Electronic) safety-related systems, other technology safety-related systems and external risk reduction facilities [point 3.1.9 in IEC 61508].

Functionality A set of attributes that bear on the existence of a set of functions and their specified properties. The functions are those that satisfy stated or implied needs [ISO 9126]. Not defined in IEEE 610.12!

Hardware Physical equipment used to process, store, or transmit computer programs or data. Contrasts with: software [IEEE 610.12]. Comment: Hardware may have design faults (permanently), fabrication faults (initially), and disintegration faults (eventually).

Hazard Def.1) A state or set of conditions that, together with other conditions in the environment, will lead to an accident (loss event). Note that a hazard is not equal to a failure [Leveson95]. Comment: “will” should rather be “may”? Def.2) A software condition that is a prerequisite to an accident [IEEE 1228].

Hazard level A combination of severity (worst potential damage in case of an accident) and likelihood of occurrence of the hazard [Leveson95].
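One way to make such a combination concrete is to rank hazards on two ordinal scales. The sketch below is only an illustration: the category names and the multiplication rule are assumptions, not part of the definition in [Leveson95].

```python
# Hypothetical ordinal scales, ordered from least to most serious.
SEVERITY = ("negligible", "marginal", "critical", "catastrophic")
LIKELIHOOD = ("improbable", "remote", "occasional", "probable", "frequent")


def hazard_level(severity: str, likelihood: str) -> int:
    """Combine the two scales into a single rank (higher is worse)."""
    return (SEVERITY.index(severity) + 1) * (LIKELIHOOD.index(likelihood) + 1)


print(hazard_level("catastrophic", "remote"))  # prints 8
print(hazard_level("marginal", "frequent"))    # prints 10
```

A single number hides information: the two hazards above get similar ranks although one is rare and severe and the other frequent and mild, so the underlying pair is usually kept alongside the rank.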

Hazop HAZard and OPerability analysis is a systematic method for examining complex facilities or processes to find actual or potentially hazardous procedures and operations so that they may be eliminated or mitigated [Wikipedia].

Incident An event that involved no loss (or only minor loss) but with the potential for loss under different circumstances [Leveson95].

Integrity Absence of improper system alterations [Avizienis2004].

Maintainability Ability to undergo modifications and repairs [Avizienis2004].

Performance The degree to which a system or component accomplishes its designated functions within given constraints, such as speed, accuracy, or memory usage [IEEE 610.12].

Portability A set of attributes that bear on the ability of the software to be transferred from one environment to another [ISO 9126]. Not defined in IEEE 610.12!

Project stakeholder Anyone who is a direct user, indirect user, manager of users, senior manager, operations staff member, support (help desk) staff member, tester, developer working on other systems that integrate or interact with the one under development, or maintenance professional potentially affected by the development and / or deployment of a software project [Ambler01].

Quality Def.1) 1) The degree to which a system, component or process meets specified requirements. 2) The degree to which a system, component or process meets customer or user needs or expectations [IEEE 610.12]. Def.2) The totality of features and characteristics of a product or service that bear on its ability to satisfy stated or implied needs [ISO 8402] (now being withdrawn). Comment: [ISO 9126] specifies six main “quality characteristics”: functionality, reliability, usability, efficiency, maintainability, and portability, with a total of 21 subcharacteristics. In short, quality means a satisfied user or customer.

Quality of Service (QoS) Def.1) In telephony, QoS can simply be defined as “user satisfaction with the service” [ITU-T E.800]. Def.2) “A set of quality requirements on the collective behavior of one or more objects” [ITU standard X.902]. Comment: That is, the behavioral properties of a service must be acceptable (of high enough quality) for the user, which can be another system, an end-user, or a social organization. Such properties encompass technical aspects like dependability (i.e. trustworthiness), security, and timely performance (transfer rate, delay, jitter, and loss), as well as human-social aspects (from perceived multimedia reception to sales, billing, and service handling). NB: not defined in IEEE 610.12! See the popular paper on QoS [Helvik03], where the more subjective term QoE (Quality of Experience) is introduced, and also [Zekro99]. But how to measure such a complex property?

Reliability Def.1) The characteristic of an item expressed by the probability that it will perform its required function in the specified manner over a given time period and under specified or assumed conditions. Reliability is not a guarantee of safety [Leveson95]. Def.2) Continuity of correct service [Avizienis2004]. Def.3) The ability of a system or component to perform its required functions under stated conditions for a specified period of time [IEEE 610.12]. Def.4) A set of attributes that bear on the capability of software to maintain its level of performance under stated conditions for a stated period of time [ISO 9126]. Comment: Such attributes may be measured by Mean-Time-Between-Failures (MTBF) or by the probability of non-failure, i.e. 1 – Pr(failure).
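The two measures in the comment are linked if one assumes a constant failure rate, in which case the probability of surviving a mission of length t is exp(-t / MTBF). The sketch below relies on that assumption, which the cited definitions do not make; the function names and figures are illustrative.

```python
import math


def reliability(mission_hours: float, mtbf_hours: float) -> float:
    """Probability of no failure during the mission, assuming a constant failure rate."""
    if mission_hours < 0 or mtbf_hours <= 0:
        raise ValueError("mission time must be non-negative and MTBF positive")
    return math.exp(-mission_hours / mtbf_hours)


def failure_probability(mission_hours: float, mtbf_hours: float) -> float:
    """Complement of reliability: Pr(failure) = 1 - reliability."""
    return 1.0 - reliability(mission_hours, mtbf_hours)


# Hypothetical figures: a 1000-hour mission on a system with an MTBF of 1000 hours.
print(f"{reliability(1000.0, 1000.0):.3f}")          # prints 0.368
print(f"{failure_probability(1000.0, 1000.0):.3f}")  # prints 0.632
```

Under this assumption, running for as long as the MTBF leaves only about a 37 % chance of seeing no failure, which is a common source of misreading MTBF figures.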

Requirement 1) A condition or capability needed by a user to solve a problem or achieve an objective. 2) A condition or capability that must be met or possessed by a system or system component to satisfy a contract, standard, specification, or other formally imposed documents. 3) A documented representation of a condition or capability as in 1) or 2) [IEEE 610.12]. Comment: often used in the plural, and described in a document called a requirements specification. Such a document has two parts: functional requirements and non-functional requirements. The latter are often called quality requirements and specify the desired ambition level for quality attributes like reliability and safety. When discussing whether a suspected fault, error or failure is genuine, i.e., formally incorrect, we must always relate to explicitly stated functional requirements, which may however be ambiguous, incomplete, or inconsistent.

Risk Def.1) A function of 1) the likelihood of a hazard occurring, 2) the likelihood of the hazard leading to an accident (including duration and exposure), and 3) the severity of consequences of the accident [Leveson95]. Def.2) A measure that combines both the likelihood that a system hazard will cause an accident and the severity of that accident [IEEE 1228]. Comment: How to measure a risk – by the most probable capitalized loss? But certain accidents cannot be quantified, just ask the insurance companies! NB: risk is not defined in IEEE 610.12.

- Operational risk Is primarily a technical responsibility and focuses on requirements stability, design performance, code complexity, and test specifications. Operational risk deals with intermediate and final work product characteristics. Because software requirements are often perceived as flexible, the software operational risk is difficult to manage [Hall98].

- Process risk Deals with management and technical work procedures. Management procedures contain activities such as planning, staffing, tracking, quality assurance, and configuration management. Technical procedures contain requirements analysis, design, coding, and testing [Hall98].

- Product risk Is primarily a technical responsibility and focuses on requirements stability, design performance, code complexity, and test specifications. Product risk deals with intermediate and final work product characteristics. Because software requirements are often perceived as flexible, the software product risk is difficult to manage [Hall98].

- Project risk Is primarily a management responsibility and defines operational, organizational, and contractual software development parameters. Project risk includes resource constraints, external interfaces, supplier relationships, and contract restrictions [Hall98].

- Supply risk The potential occurrence of an incident associated with inbound supply from individual supplier failures or the supply market, in which the outcomes result in the inability of the purchasing firm to meet customer demand or cause threats to customer life and safety [Zsidisin03].

- Tolerable risk How willing we are to live with a risk to secure certain benefits in the confidence that the risk is one that is worth taking and that it is being properly controlled. Risk which is accepted in a given context based on the current values of society [IEC 61508].
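Def.1 of Risk above names three factors without saying how to combine them. A common simplification, assumed here and not prescribed by [Leveson95], is to multiply them into an expected loss and compare the result with a tolerable-risk threshold. All names and figures below are hypothetical.

```python
def risk(p_hazard: float, p_accident_given_hazard: float, severity_loss: float) -> float:
    """Expected loss: Pr(hazard) * Pr(accident | hazard) * loss if the accident occurs."""
    for p in (p_hazard, p_accident_given_hazard):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return p_hazard * p_accident_given_hazard * severity_loss


def is_tolerable(risk_value: float, tolerable_risk: float) -> bool:
    """A risk is accepted when it does not exceed the agreed threshold."""
    return risk_value <= tolerable_risk


# Hypothetical yearly figures: hazard probability 0.01, one hazard in ten
# leads to an accident, and an accident costs 1,000,000 currency units.
expected_loss = risk(0.01, 0.1, 1_000_000)
print(is_tolerable(expected_loss, tolerable_risk=5_000))  # prints True
```

As the comment under Risk points out, some losses cannot be capitalized, so a single expected-loss figure should be read as a rough aid rather than a complete measure.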

Robustness The degree to which a system or component can function correctly in the presence of invalid inputs or stressful environmental conditions [IEEE 610.12].
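For software, robustness against invalid inputs usually means that bad data is detected and rejected instead of being allowed to corrupt internal state. The sketch below is a minimal illustration; the function name, the plausible range, and the choice to return None for rejected input are all assumptions.

```python
def parse_temperature(raw, low: float = -50.0, high: float = 150.0):
    """Return the reading in degrees Celsius, or None if the input is invalid."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return None  # not a number at all
    if value != value or not (low <= value <= high):
        return None  # NaN, or outside the plausible range
    return value


for raw in ("21.5", "abc", "nan", "9999", None):
    print(repr(raw), "->", parse_temperature(raw))
```

Only the first input is accepted; the caller decides what to do with a rejected reading, for example keeping the last valid value or raising an alarm.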

Safety Def.1) Freedom from unacceptable risk of physical injury or of damage to the health of people, either directly or indirectly as a result of damage to property or to the environment [IEC 61508]. Def.2) Freedom from software hazards [IEEE 1228]. Comment: The higher the risk, the lower the confidence in safety. Safety is always evaluated against real events, not what some requirements may have specified. But on what scale is safety measured – e.g. by the maximum number of human deaths per 10 million airplane flights or by the expected number of fatalities per billion person-kilometers in a city metro? NB: safety is not defined in IEEE 610.12 or ISO 9126!

Safety function Def.1) A function to be implemented by an E/E/PE (Electrical/Electronic/Programmable Electronic) safety-related system, other technology safety-related system or external risk reduction facilities, which is intended to achieve or maintain a safe state for the EUC (Equipment Under Control), in respect of a specific hazardous event (see 3.4.1) [point 3.5.1 in IEC 61508]. Comment: IEC 61508 does not specify how to meet the design requirements for safety functions, meaning that hazard elimination is not within the scope of IEC 61508 [taken from http://www.cs.york.ac.uk/hise/safety-critical-archive/1999/0115.html]. Def.2) A function that reduces the probability that a hazard arises and leads to an accident or incident. This can be done by hazard elimination (substitution, simplification, decoupling, elimination of specific human errors, reduction of hazardous situations), by hazard reduction (design for control, barriers, failure minimization), or by hazard control and damage minimization (hazard detection, with warnings and actions, that transfers the software into a safe state as soon as possible) [Own definition].