Usage of Weibull and other models for software fault prediction in AXE

Lovre Hribar

Ericsson Nikola Tesla d.d., Split, Croatia

E-mail:

Abstract: There are several families of software quality prediction techniques for development projects, and each of them can be classified into several subfamilies. Each of these techniques has its own distinctive features, and it may not give a correct prediction of quality for a scenario different from the one for which the technique was designed. All these techniques for software quality prediction are dispersed; one family consists of the statistical and probabilistic techniques. The paper deals with software quality prediction techniques in development projects. Four different models based on the statistical and probabilistic approach are presented and evaluated for the prediction of software faults in very large development projects.

1. INTRODUCTION

It is always about quality and cost, or better, about the ratio between quality and cost. Software quality prediction helps minimize software costs. The number of faults in a large software project has a significant impact on project performance and hence is an input to project planning [1], [2]. As the quality level of the final product is set at the beginning of the project, a large number of faults can result in project delays and cost overruns. Prediction identifies risk-prone software components and indicates software problems at earlier stages of development [3], [4]. This risk can be mitigated at the initial stages, and better effort estimates can be made [5]. At the same time, this early handling of risks reduces the costs associated with them and eventually produces good-quality software in terms of the number of errors. In addition, software quality prediction plays a significant role in easing the maintenance of software. In short, prediction is helpful in software development, testing and maintenance activities.

Predictions can be made either on the basis of historical data collected during implementation of the same or similar projects [6], [7], or using design metrics collected during the design phase. Both are measurement-based prediction methods, since they involve software metrics in the prediction process. Based on these metrics, it is possible to build prediction models that are either of a statistical nature [8] or based on Artificial Intelligence (AI) [7], [9], [10], [11]. Such a prediction is of an empirical nature, and a given statistical model will do the job. With its help, an organization can at least roughly predict the quality of the next iteration; based on these predictions, new goals can be set and better quality may be achieved. Initial work on software quality prediction was of a statistical nature, for example [12], [13].

The basic aim of software quality prediction is to facilitate the development of better-quality software [16], [17]. Better quality results in satisfied customers and a healthier return on investment. Various organizations and public departments have been involved in studies related to quality prediction, for example the Commission of the European Communities' Strategic Programme for Research in Information Technology [14], the Swedish National Board for Industrial and Technical Development [4], Northern Telecom Limited, USA [5], Nortel, USA [15], the Natural Sciences and Engineering Research Council of Canada [18], [19], the National Science Foundation, USA [20], [21], NASA [3], and the National Natural Science Foundation of China [9]. The importance of quality prediction cannot be denied in the current era of information technology [21]; on the contrary, organizations these days are becoming more quality conscious and customer oriented.

This paper explores the depth and breadth of software quality prediction techniques based on the statistical and probabilistic approach, and identifies future research towards a method for selecting the most suitable prediction technique for projects developed for AXE, based on the four models discussed in the following sections.

2. WEIBULL DISTRIBUTION MODEL

The Weibull distribution is by far the world's most popular statistical model for life data. It is also used in many other applications, such as weather forecasting and the fitting of data of all kinds. Among all statistical techniques, it may be employed for engineering analysis with smaller sample sizes than any other method. The Weibull distribution was first published in 1939, over 60 years ago, and has proven invaluable for life data analysis in the aerospace, automotive, electric power, nuclear power, medical, dental and electronics industries, among others. It is one of the most widely used lifetime distributions in reliability engineering.

The three-parameter Weibull probability density function (pdf) is given by:

$$ f(T) = \frac{\beta}{\eta}\left(\frac{T-\gamma}{\eta}\right)^{\beta-1} e^{-\left(\frac{T-\gamma}{\eta}\right)^{\beta}} $$

where:

η = scale parameter, β = shape parameter (or slope) and
γ = location parameter.

It is a versatile distribution that can take on the characteristics of other types of distributions, based on the value of the shape parameter, β.
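As a quick illustration, the pdf above can be evaluated numerically. The paper's own simulations were done in SCILAB [22]; the following is a minimal sketch in Python instead, using scipy.stats.weibull_min, whose arguments c, loc and scale correspond to β, γ and η (the parameter values are illustrative, not taken from the paper's data):

```python
import numpy as np
from scipy.stats import weibull_min

# Three-parameter Weibull: c = beta (shape), loc = gamma (location),
# scale = eta (scale). These values are illustrative only.
beta, eta, gamma = 1.5, 10.0, 0.0
dist = weibull_min(c=beta, loc=gamma, scale=eta)

T = np.linspace(0.5, 40.0, 5)
print(dist.pdf(T))  # f(T), the pdf given above
print(dist.sf(T))   # R(T) = 1 - F(T), the reliability function
```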

2.1 The Effect of β on the Weibull Failure Rate Function

The value of β has a marked effect on the failure rate of the Weibull distribution, and inferences can be drawn about a population's failure characteristics just by considering whether the value of β is less than, equal to, or greater than one. Populations with β < 1 exhibit a failure rate that decreases with time, populations with β = 1 have a constant failure rate, and populations with β > 1 have a failure rate that increases with time. All three life stages of the bathtub curve can thus be modeled with the Weibull distribution and varying values of β. For 0 < β < 1, the Weibull failure rate is unbounded at T = 0 (or T = γ); the failure rate, λ(T), decreases thereafter monotonically and is convex, approaching zero as T grows.

This behavior makes it suitable for representing the failure rate of units exhibiting early-type failures, for which the failure rate decreases with age.

When encountering such behavior in a manufactured product, it may be indicative of problems in the production process, inadequate burn-in, substandard parts and components, or problems with packaging and shipping.

For β = 1, λ(T) yields a constant value and this makes it suitable for representing the failure rate of chance-type failures and the useful life period failure rate of units.

For β > 1, λ(T) increases as T increases and becomes suitable for representing the failure rate of units exhibiting wear-out type failures. For 1 < β < 2, the λ(T) curve is concave; consequently, the failure rate increases at a decreasing rate as T increases.

For β = 2 there emerges a straight-line relationship between λ(T) and T, starting at a value of λ(T) = 0 at T = γ and increasing thereafter. Note that at β = 2, the Weibull distribution equations reduce to those of the Rayleigh distribution.

When β > 2, the λ(T) curve is convex, with its slope increasing with T. Consequently, the failure rate increases at an increasing rate as T increases indicating wear-out life.
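The three regimes described above can be checked numerically. Below is a minimal sketch (same hedge as before: Python with scipy rather than the paper's SCILAB, with illustrative parameter values) computing the hazard λ(T) = f(T)/R(T) for a β below, at, and above one:

```python
import numpy as np
from scipy.stats import weibull_min

def weibull_failure_rate(T, beta, eta, gamma=0.0):
    """lambda(T) = f(T) / R(T); sf() is the survival (reliability) function."""
    d = weibull_min(c=beta, loc=gamma, scale=eta)
    return d.pdf(T) / d.sf(T)

T = np.array([1.0, 5.0, 10.0, 20.0])
for beta in (0.5, 1.0, 2.5):   # early failures, useful life, wear-out
    print(beta, weibull_failure_rate(T, beta, eta=10.0))
```

Running it prints a decreasing, a constant, and an increasing failure rate, respectively.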

2.2 The Effect of the Scale Parameter, η, for the Weibull distribution

A change in the scale parameter η has the same effect on the distribution as a change of the abscissa scale. Increasing the value of η while holding β constant has the effect of stretching out the pdf. Since the area under a pdf curve is a constant value of one, the "peak" of the pdf curve will also decrease with the increase of η.

If η is increased while β and γ are kept the same, the distribution gets stretched out to the right and its height decreases, while maintaining its shape and location.

If η is decreased while β and γ are kept the same, the distribution gets pushed in towards the left (i.e. towards its beginning, or towards 0 or γ), and its height increases. (η has the same units as T, such as hours, miles, cycles, actuations, etc.)
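For β > 1 (and γ = 0) the mode of the Weibull pdf sits at T = η(1 − 1/β)^{1/β}, so the stretching effect of η can be verified directly; a small sketch with illustrative values:

```python
from scipy.stats import weibull_min

beta = 2.0
for eta in (5.0, 10.0, 20.0):
    # Mode (peak location) of the Weibull pdf for beta > 1, gamma = 0
    mode = eta * (1.0 - 1.0 / beta) ** (1.0 / beta)
    peak = weibull_min(c=beta, scale=eta).pdf(mode)
    print(eta, round(mode, 2), round(peak, 4))  # peak height falls as eta grows
```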

2.3 The Effect of the Location Parameter, γ, for the Weibull distribution

The location parameter, γ, as the name implies, locates the distribution along the abscissa. Changing the value of γ has the effect of "sliding" the distribution and its associated function either to the right (if γ > 0) or to the left (if γ < 0).

When γ = 0, the distribution starts at T = 0 or at the origin.

If γ > 0, the distribution starts at the location γ to the right of the origin.

If γ < 0, the distribution starts at the location γ to the left of the origin (γ provides an estimate of the earliest time-to-failure of such units.)

The life period from 0 to +γ is a failure-free operating period of such units. The parameter γ may assume all values and provides an estimate of the earliest time at which a failure may be observed. A negative γ may indicate that failures have occurred prior to the beginning of the test, namely during production, in storage, in transit, during checkout prior to the start of a mission, or prior to actual use. (γ has the same units as T, such as hours, miles, cycles, actuations, etc.)
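The failure-free period below γ is easy to confirm numerically; in scipy's parameterization, γ is the loc argument (illustrative values again):

```python
import numpy as np
from scipy.stats import weibull_min

gamma = 4.0                                  # illustrative location parameter
d = weibull_min(c=1.5, loc=gamma, scale=10.0)

print(d.cdf(np.array([0.0, 2.0, 4.0])))      # all 0.0: failure-free up to gamma
print(d.cdf(np.array([5.0, 20.0, 40.0])))    # probability accumulates after gamma
```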

3. LOGNORMAL DISTRIBUTION MODEL

The lognormal distribution is commonly used to model the lives of units whose failure modes are of a fatigue-stress nature.

The lognormal pdf is given by:

$$ f(T') = \frac{1}{\sigma_{T'}\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{T'-\mu'}{\sigma_{T'}}\right)^{2}} $$

where T' = ln T, μ' is the mean of the natural logarithms of the times-to-failure, and σ_{T'} is the standard deviation of the natural logarithms of the times-to-failure.

Since this includes most, if not all, mechanical systems, the lognormal distribution can have widespread application. Consequently, the lognormal distribution is a good companion to the Weibull distribution when attempting to model these types of units.

3.1 Characteristics of the lognormal distribution

As may be surmised from the name, the lognormal distribution has certain similarities to the normal distribution. A random variable is lognormally distributed if the logarithm of the random variable is normally distributed. Because of this, there are many mathematical similarities between the two distributions.
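This defining property can be demonstrated directly; a minimal sketch (Python with scipy, hypothetical parameter values; note that scipy parameterizes the lognormal with s = σ_{T'} and scale = e^{μ'}):

```python
import numpy as np
from scipy.stats import lognorm

mu, sigma = 2.0, 0.5    # hypothetical mean and std of ln(T)

# scipy's lognormal takes s = sigma and scale = exp(mu)
samples = lognorm(s=sigma, scale=np.exp(mu)).rvs(size=100_000, random_state=0)

logs = np.log(samples)
print(logs.mean(), logs.std())  # close to (mu, sigma): ln T is indeed normal
```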

The lognormal distribution is a distribution skewed to the right. The pdf starts at zero, increases to its mode, and decreases thereafter. The degree of skewness increases as σ_{T'} increases, for a given μ'. For the same σ_{T'}, the pdf's skewness increases as μ' increases.

For σ_{T'} values significantly greater than 1, the pdf rises very sharply in the beginning, i.e. for very small values of T near zero, essentially follows the ordinate axis, peaks out early, and then decreases sharply like an exponential pdf or a Weibull pdf with 0 < β < 1.

The parameter μ', the mean in terms of the logarithms of the T's, is also the scale parameter, and not the location parameter as in the case of the normal pdf. The parameter σ_{T'}, the standard deviation of the T's in terms of their logarithms (i.e. of the T''s), is also the shape parameter, and not the scale parameter as in the normal pdf, and it assumes only positive values.

4. NORMAL DISTRIBUTION MODEL

The normal distribution, also known as the Gaussian distribution, is the most widely-used general purpose distribution. It is for this reason that it is included among the lifetime distributions commonly used for reliability and life data analysis. There are some who argue that the normal distribution is inappropriate for modeling lifetime data because the left-hand limit of the distribution extends to negative infinity. This could conceivably result in modeling negative times-to-failure. However, provided that the distribution in question has a relatively high mean and a relatively small standard deviation, the issue of negative failure times should not present itself as a problem.

Nevertheless, the normal distribution has been shown to be useful for modeling the lifetimes of consumable items. The pdf of the normal distribution is given by:

$$ f(T) = \frac{1}{\sigma_T \sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{T-\mu}{\sigma_T}\right)^{2}} $$

where μ is the mean of the times-to-failure and σ_T is their standard deviation. It is thus a two-parameter distribution, with parameters μ and σ_T, i.e. the mean and the standard deviation, respectively.

4.1 Characteristics of the Normal distribution

The normal pdf has no shape parameter. This means that the normal pdf has only one shape, the bell shape, and this shape does not change.

The standard deviation, σT, is the scale parameter of the normal pdf. As σT decreases, the pdf gets pushed toward the mean, or it becomes narrower and taller. As σT increases, the pdf spreads out away from the mean, or it becomes broader and shallower.

The standard deviation can assume values 0 < σ_T < ∞. The standard deviation is also the distance between the mean and the point of inflection of the pdf, on each side of the mean. The point of inflection is that point of the pdf where the slope changes its value from a decreasing to an increasing one, or where the second derivative of the pdf has a value of zero. The normal pdf starts at T = −∞ with f(T) = 0. As T increases, f(T) also increases, goes through its point of inflection, and reaches its maximum value at T = μ.

Thereafter, f(T) decreases, goes through its point of inflection, and assumes a value of f(T) = 0 at T = +∞.
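The inflection points at μ ± σ_T can be verified with a finite-difference check of the second derivative; a minimal sketch with hypothetical values:

```python
from scipy.stats import norm

mu, sigma = 100.0, 15.0          # hypothetical mean and standard deviation
pdf = norm(loc=mu, scale=sigma).pdf

def second_derivative(f, x, h=1e-3):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

print(second_derivative(pdf, mu - sigma))  # ~ 0: inflection point
print(second_derivative(pdf, mu + sigma))  # ~ 0: inflection point
print(second_derivative(pdf, mu))          # < 0: the peak at the mean
```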

5. EXPONENTIAL DISTRIBUTION MODEL

The exponential distribution is a commonly used distribution in reliability engineering. Mathematically, it is a fairly simple distribution, which many times leads to its use in inappropriate situations. It is, in fact, a special case of the Weibull distribution where β = 1, with pdf f(T) = λ e^{−λ(T−γ)}.
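The equivalence with the Weibull at β = 1 can be checked directly (a sketch under the same assumptions as the earlier ones; λ is illustrative):

```python
import numpy as np
from scipy.stats import expon, weibull_min

lam = 0.2                        # illustrative constant failure rate
T = np.linspace(0.0, 20.0, 5)

# A Weibull with beta = 1 and eta = 1/lambda is exactly the exponential
print(weibull_min(c=1.0, scale=1.0 / lam).pdf(T))
print(expon(scale=1.0 / lam).pdf(T))   # identical values
```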

5.1 Characteristics of the Exponential distribution

The exponential distribution is used to model the behavior of units that have a constant failure rate (or units that do not degrade with time or wear out); this is its primary trait. It has a fairly simple mathematical form, which makes it fairly easy to manipulate. Unfortunately, that simplicity also leads to the use of this model in situations where it is not appropriate: some inexperienced practitioners of reliability engineering and life data analysis will overlook the constant-failure-rate assumption, lured by the siren call of the exponential distribution's relatively simple mathematical models.

The exponential pdf has no shape parameter, as it has only one shape. The exponential pdf is always convex and is stretched to the right as λ decreases in value.

The value of the pdf function is always equal to the value of λ at T = 0 (or T = γ).

The location parameter, γ, if positive, shifts the beginning of the distribution by a distance of γ to the right of the origin, signifying that the chance failures start to occur only after γ hours of operation, and cannot occur before this time.

The scale parameter is 1/λ. As T → ∞, f(T) → 0.

The one-parameter exponential reliability function starts at the value of 100% at T = 0, decreases thereafter monotonically and is convex.

The two-parameter exponential reliability function remains at the value of 100% for T = 0 up to T = γ, and decreases thereafter monotonically and is convex. As T → ∞, R(T) → 0. The reliability for a mission duration of T = 1/λ, i.e. of one MTTF duration, is always equal to 0.3679, or 36.79%.

This means that the reliability for a mission which is as long as one MTTF is relatively low and is not recommended because only 36.8% of the missions will be completed successfully. In other words, of the equipment undertaking such a mission, only 36.8% will survive their mission.
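This constant 36.79% is simply e^{−1}, since R(T) = e^{−λT} and T = 1/λ; a one-line check (λ is illustrative):

```python
import numpy as np
from scipy.stats import expon

lam = 0.01                       # illustrative failure rate (per hour)
mttf = 1.0 / lam
R = expon(scale=mttf).sf(mttf)   # reliability for a mission of one MTTF
print(R, np.exp(-1.0))           # both ~0.3679 (36.79%), for any lambda
```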

6. MEASUREMENT RESULTS

A fault occurrence is defined as a customer-reported problem that requires developer intervention to correct. This is the observable event of interest for both maintenance and insurance purposes. The operational definition of a fault occurrence varies across organizations. In this paper, we analyze fault occurrences in the Ericsson Nikola Tesla (ETK) R&D organizations using the term Trouble Report (TR) and the TR inflow over the weeks. TRs are reports from customers, from the internal verification team, or from ongoing development projects. In general, not every TR reports a fault: after analysis, some reports are classified not as faults but as misconfigurations or misunderstandings of a function. The software development process at the ETK R&D organization measures faults and failures constantly, during all development stages of a project. The goal is to find as many faults as possible, as early as possible, and to fix them, saving hours and money both for the organization internally and for Ericsson globally. Our findings support the idea that a common fault occurrence projection method for ETK R&D can be used across many R&D organizations, inside or outside the Ericsson company, and for similar or even different development styles.

Figure 1. Weibull model

Figure 2. Lognormal model

We are interested in the fault occurrence pattern, which is the rate of fault occurrence as a function of time over the lifetime of a product or several product lines.

A software system functioning in IP networks, on which businesses are increasingly dependent, was examined. The system is multi-release, single-platform, widely deployed, and has been in customer operation for over 3 years.

It is generally accepted that such widely deployed production software systems are not fault-free and that there is a need to manage the risks associated with fault occurrences. The models are simulated in SCILAB [22], and the parameters of the different distributions are calculated in Microsoft Excel [23].

Two questions that are important for fault occurrence projection are addressed empirically:

  • Is there a type of fault occurrence model that provides a good fit to fault occurrence patterns across multiple releases and in many organizations?
  • Given such a model, how can model parameters for a new release be extrapolated using historical information? (A minimal fitting sketch follows this list.)
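To make the second question concrete, the sketch below fits a Weibull-shaped inflow curve to synthetic weekly TR counts. This is a hypothetical illustration only: the function weibull_rate, the parameter values, and the data are invented here, and the fitting is done with scipy's curve_fit rather than the paper's SCILAB [22] / Excel [23] workflow:

```python
import numpy as np
from scipy.optimize import curve_fit

def weibull_rate(t, N, beta, eta):
    """Hypothetical TR-inflow model: total fault volume N times the Weibull pdf."""
    return N * (beta / eta) * (t / eta) ** (beta - 1.0) * np.exp(-(t / eta) ** beta)

# Synthetic weekly TR counts (the paper's real data spans 201 weeks in operation)
weeks = np.arange(1.0, 31.0)
rng = np.random.default_rng(1)
tr_inflow = rng.poisson(weibull_rate(weeks, 300.0, 1.8, 12.0)).astype(float)

(N, beta, eta), _ = curve_fit(weibull_rate, weeks, tr_inflow,
                              p0=(200.0, 1.0, 10.0),
                              bounds=([1.0, 0.1, 1.0], [1e4, 5.0, 100.0]))
print(N, beta, eta)  # fitted parameters can then project inflow for later weeks
```

Once fitted on the early weeks of a new release, the same curve extrapolates the expected TR inflow for the remaining weeks.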

Figure 3. Normal model

The measurement results provide a basis for a fault occurrence projection method that is robust across the ETK R&D organization and its development styles. Data are used from a diverse sample of program releases developed with a single (commercial) development style on the AXE platform, gathered over 201 weeks in operation.

The results show that the Weibull model is the preferred one among the four models, because its correlation coefficient was the highest among all models. The exponential model, by contrast, is inadequate for the prediction of software faults.
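The comparison criterion reduces to the Pearson correlation between observed TR inflow and each model's fitted curve; a small sketch with invented numbers (not the paper's measurements) shows the mechanics:

```python
import numpy as np

def model_correlation(observed, fitted):
    """Pearson correlation coefficient between data and a model's fitted curve."""
    return np.corrcoef(observed, fitted)[0, 1]

# Hypothetical weekly TR inflow and two candidate fits (values invented here)
observed        = np.array([5.0, 12.0, 18.0, 16.0, 11.0, 6.0, 3.0])
weibull_fit     = np.array([6.0, 13.0, 17.0, 15.0, 10.0, 6.0, 3.0])
exponential_fit = np.array([20.0, 15.0, 11.0, 8.0, 6.0, 5.0, 4.0])

for name, fit in (("Weibull", weibull_fit), ("exponential", exponential_fit)):
    print(name, round(model_correlation(observed, fit), 3))  # highest r wins
```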

Figure 4. Exponential model

7. FUTURE WORK

The proposed models should be developed further in order to determine, on a larger sample of software units, the models best suited to software fault prediction. Not only software units should be evaluated, but also applications in both mobile and fixed networks. The models should also be evaluated in combination with other quality methods, so that the right preventive/corrective actions are taken within the projects.

8. CONCLUSION

The results presented in this paper can help in future project planning by taking into consideration the faults currently existing in the systems. The presented and evaluated models can help predict the number of faults in a system during its lifetime and, consequently, help improve the quality of the product by taking the right preventive actions.