Chapter 5

The Results: Analytical and Empirical

This chapter discusses the results of the analytical validation as well as the empirical validation.

5.1 The Analytical Validation

As mentioned in Chapter 4, Weyuker’s ‘nine properties to evaluate software complexity metrics’ are used to analytically validate the four metrics that are the focus of this research. The nine properties, which according to Weyuker must be satisfied by every metric, are discussed in detail in Chapter 4. In this section we discuss the analytical validation of the four metrics (Interaction Level, Interface Size, Operation Argument Complexity, and Attribute Complexity) using this set of properties. We first briefly address three of the nine properties, and then present the detailed analytical validation using the remaining six.

Weyuker’s seventh property “Permutation Changes Complexity” requires that permutation of elements within the item being measured change the metric value. The intent is to ensure that the possibility exists for metric values to change due to permutation of program statements. This property is meaningful in traditional program design, where the ordering of if-then-blocks could alter the program logic (and consequent complexity). In OO systems at the design level, a class is an abstraction of the problem space, and the order of statements within the class definition has no impact on eventual execution or use. For example, changing the order in which methods are declared does not affect the order in which they are executed, since methods are triggered by the receipt of different messages from other objects. Cherniavsky & Smith, and Chidamber & Kemerer also argue that this property is not appropriate for object-oriented design metrics [Cherniavsky and Smith, 1991; Chidamber and Kemerer, 1994]. Therefore this property is not considered further in the validation process.

Weyuker’s eighth property requires that when the name of the measured entity changes, the metric remain unchanged. As none of the four metrics considered in this research depends on the names of the methods, the attributes, or the class, they all satisfy this property. Since the property is met by all the metrics, it is not considered further in the validation process.

Weyuker’s ninth property states that when two classes are combined, the interaction between the classes can increase the metric value. We have argued in Chapter 4 that this property need not necessarily hold in the context of object-oriented design. We have also seen this empirically in Treatment 2 of our experiment (refer to Appendix H), where combining two similar classes into a single class resulted in lower metric values for all four metrics. Chidamber and Kemerer also argue that satisfying this property is not an essential feature for OO design complexity metrics [Chidamber and Kemerer, 1994]. Chidamber and Kemerer further say “…experienced OO designers … found that memory management and run-time detection of errors are both more difficult when there are a large number of classes to deal with. In other words, their viewpoint was that complexity can increase when classes are divided into more classes.” Therefore this property is not considered further in the validation process.

The following are some basic assumptions made for the validation (similar to the assumptions made by [Chidamber and Kemerer, 1994]):

Assumption 1: Let

Mc = the number of methods in a given class c.

Pm = the number of parameters in a given method m.

Ac = the number of data attributes in a given class c.

Mc, Pm, and Ac are discrete random variables, each characterized by some general distribution function. Further, all the Mc’s, Pm’s, and Ac’s are independent and identically distributed (IID), which suggests that the number of methods, parameters, and attributes follow a statistical distribution that is not apparent to an observer of the system. Further, the observer cannot predict the Mc, Pm, and Ac of one class / method based on the knowledge of the Mc, Pm, and Ac of another class.

Assumption 2: In general, two classes can have a finite number of “identical” methods in the sense that a combination of the two classes into one class would result in one class’s version of the identical methods becoming redundant. For example, in Treatment 2 of our empirical study, one version has two classes, TRACTOR and TRAILER (refer to Appendix F). Each of these classes has methods gettaxes() and getweight(). In the second version when the two classes were combined into a single class, we have just one gettaxes() and one getweight(), of course modified to reflect the new abstraction (refer to Appendix E).

Metric: Interaction Level

Let

ILP = Interaction Level for class P

ILQ = Interaction Level for class Q

ILR = Interaction Level for class R

ILP and ILQ are functions of the number of methods, the number and type of method parameters, and the data attributes. Since the number of methods, the number of method parameters, and the number of data attributes are IID random variables, and since functions of IID random variables are also IID, it follows that ILP and ILQ are IID. Therefore, there is a non-zero probability that ∃ P and ∃ Q such that ILP ≠ ILQ. Therefore Property 1 (Non-coarseness) is satisfied.

Similarly, there is a non-zero probability that ∃ P and ∃ Q such that ILP = ILQ, and therefore Properties 2 (Granularity) and 3 (Non-uniqueness) are satisfied. Property 3 is satisfied because the value of IL does not depend on what P or Q is doing, but only on what the design of P or Q is. So it is possible that P and Q are doing totally different things, but have the same interaction level.

For a given system, the choice of the number of classes, number of methods in each class, the number and type of parameters in each method, and the number and type of attributes, is a design decision and independent of the functionality of the class. For example in Treatment 1 (implementing a quadrilateral), one version of design results in an Interaction Level of 36, while another version of the design results in an Interaction Level of 70 (refer to Appendix H). Therefore Property 4 (design details are important) is satisfied.

If the two classes P and Q are combined to form one class, the Interaction Level for that new class will depend on whether P and Q have any common methods and/or attributes. If P and Q do not have any common methods and/or attributes, then the Interaction Level of the new class, IL(P+Q), is greater than or equal to ILP + ILQ. IL(P+Q) can be greater than ILP + ILQ because, with an increased number of attributes and/or methods, there are potentially new interactions between the attributes from class P and the parameters of methods from class Q, and vice versa. Therefore Property 5 (Monotonicity) is satisfied in this case. If P and Q have some common methods and/or attributes, then IL(P+Q) = ILP + ILQ - α + β, where α is the IL of the common methods and attributes, and β is the increase in IL as a result of the cross interactions between attributes from class P and methods from class Q and vice versa. The maximum value α can have is the minimum of (ILP, ILQ). The minimum value β can have is 0, if no new interactions are added (if one class is a subset of the other class). Therefore ILP ≤ IL(P+Q) and ILQ ≤ IL(P+Q), and Property 5 is satisfied even in this case. Therefore this metric satisfies Property 5.

Now let us assume that P and Q are two different classes with the same Interaction Level, ILP = ILQ (possible from Property 3). Extending the argument from the above paragraph, if classes P and Q, respectively, are combined with another class R, then we have IL(P+R) = ILP + ILR - α + β, and IL(Q+R) = ILQ + ILR - γ + δ. Since α, β, γ, and δ are independent variables, (β - α) and (δ - γ) need not necessarily be equal. Therefore Property 6 (non-equivalence of interaction) is satisfied by this metric.
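The step from the two expressions to the conclusion can be written out explicitly. In the sketch below, α and β denote the overlap and cross-interaction terms for P combined with R, and γ and δ the corresponding terms for Q combined with R; these symbol names are chosen here for illustration only.

```latex
\begin{aligned}
IL(P+R) &= IL_P + IL_R - \alpha + \beta \\
IL(Q+R) &= IL_Q + IL_R - \gamma + \delta \\
IL(P+R) - IL(Q+R) &= (\beta - \alpha) - (\delta - \gamma)
  \qquad \text{since } IL_P = IL_Q
\end{aligned}
```

The difference is non-zero whenever (β - α) differs from (δ - γ), which is the case that Property 6 requires to be possible.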

Metric: Interface Size

Let

ISP = Interface Size for class P

ISQ = Interface Size for class Q

ISR = Interface Size for class R

ISP and ISQ are functions of the number of methods, and the number and type of method parameters. Since the number of methods and the number of method parameters are IID random variables, and since functions of IID random variables are also IID, it follows that ISP and ISQ are IID. Therefore, there is a non-zero probability that ∃ P and ∃ Q such that ISP ≠ ISQ. Therefore, Property 1 (Non-coarseness) is satisfied.

Similarly, there is a non-zero probability that ∃ P and ∃ Q such that ISP = ISQ, and therefore Property 2 is satisfied. The Interface Size of a class does not depend on what P or Q is doing, but only on what the design of the methods of P or Q is. So it is possible that P and Q are doing totally different things, but have the same interface size, and therefore Property 3 is satisfied.

For a given system, the choice of the number of classes, number of methods in each class, the number and type of parameters in each method is a design decision and independent of the functionality of the class. For example in Treatment 1 (implementing a quadrilateral), one version of design results in an Interface Size of 6, while another version of the design resulted in an Interface Size of 24 (refer to Appendix H). Therefore Property 4 (design details are important) is satisfied for this metric.

If the two classes P and Q are combined to form one class, the Interface Size for that new class will depend on whether P and Q have any common methods. If P and Q do not have any common methods, then the Interface Size of the new class, IS(P+Q), is equal to ISP + ISQ. Therefore Property 5 (Monotonicity) is satisfied in this case. If P and Q have some common methods, then IS(P+Q) = ISP + ISQ - α, where α is the IS of the common methods. The maximum value α can have is the minimum of (ISP, ISQ). Therefore ISP ≤ IS(P+Q) and ISQ ≤ IS(P+Q), and Property 5 is satisfied even in this case. Therefore this metric satisfies Property 5.
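The bound used in the common-methods case follows in one step from the limit on the overlap term. In this sketch α denotes the Interface Size contributed by the methods common to P and Q; the symbol name is chosen here for illustration.

```latex
\begin{aligned}
IS(P+Q) &= IS_P + IS_Q - \alpha, \qquad 0 \le \alpha \le \min(IS_P, IS_Q) \\
        &\ge IS_P + IS_Q - \min(IS_P, IS_Q) \\
        &= \max(IS_P, IS_Q)
\end{aligned}
```

Since max(ISP, ISQ) is at least as large as each of ISP and ISQ, both inequalities required by Property 5 hold.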

Now let us assume that P and Q are two different classes with the same Interface Size, ISP = ISQ (possible from Property 3). Extending the argument from the above paragraph, if classes P and Q, respectively, are combined with another class R, then we have IS(P+R) = ISP + ISR - α, and IS(Q+R) = ISQ + ISR - β. Since α and β are independent variables, they need not necessarily be equal. Therefore Property 6 (non-equivalence of interaction) is satisfied by this metric.

Metric: Operation Argument Complexity

It is easy to argue about which properties are satisfied using examples. In our experimental study, we had a method “float gettaxes()” whose OAC is 2, and another method “int getweight()” whose OAC is 1 (refer to Appendix E). Since OACgettaxes ≠ OACgetweight, Property 1 is satisfied.

We have a method “void setx4(float)” whose OAC is 2 (refer to Appendix C). Thus we see that two methods (gettaxes and setx4) doing totally different things have the same metric value. Thus Properties 2 and 3 are also satisfied.
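The three OAC values quoted above can be reproduced with a small sketch. It assumes that the OAC of a method is the sum of the size values of its return type and its parameter types, with float counted as 2, int as 1, and void as 0. These size values and the function name `oac` are assumptions inferred from the three examples, not the definition given in Chapter 3.

```python
# Assumed size values per type, inferred from the examples in the text
# (float -> 2, int -> 1). Treating void as 0 is an additional assumption.
TYPE_SIZE = {"void": 0, "int": 1, "float": 2}


def oac(return_type, param_types=()):
    """OAC of one method under the assumed definition: the summed size
    of its return type and of each of its parameter types."""
    return TYPE_SIZE[return_type] + sum(TYPE_SIZE[t] for t in param_types)


gettaxes = oac("float")            # float gettaxes()
getweight = oac("int")             # int getweight()
setx4 = oac("void", ["float"])     # void setx4(float)

print(gettaxes, getweight, setx4)  # 2 1 2
```

Under these assumptions gettaxes and getweight differ (Property 1), while gettaxes and setx4 share a value despite doing different things (Properties 2 and 3).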

For a given system, the choice of the number of classes, number of methods in each class, and the number and type of parameters in each method is a design decision and independent of the functionality of the class. For example in Treatment 1 (implementing a quadrilateral), one version of design resulted in an Operation Argument Complexity of 4, while another version of the design resulted in an Operation Argument Complexity of 32 (refer to Appendix H). Therefore Property 4 (design details are important) is satisfied for this metric.

If the two classes P and Q are combined to form one class, the OAC for that new class will depend on whether P and Q have any common methods. If P and Q do not have any common methods, then the OAC of the new class, OAC(P+Q), is equal to OACP + OACQ. Therefore Property 5 (Monotonicity) is satisfied in this case. If P and Q have some common methods, then OAC(P+Q) = OACP + OACQ - α, where α is the OAC of the common methods. The maximum value α can have is the minimum of (OACP, OACQ). Thus OACP ≤ OAC(P+Q) and OACQ ≤ OAC(P+Q), and Property 5 is satisfied even in this case. Therefore this metric satisfies Property 5.
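The common-methods case can be checked numerically with a minimal sketch. Each class is modelled as a mapping from method signature to that method's OAC, and a method present in both classes is kept once in the combined class. The methods gettaxes() and getweight() and their OAC values come from the text; the methods sethp(int) and setaxles(int), and the helper names, are hypothetical and used only for illustration.

```python
def class_oac(methods):
    """Class-level OAC: the sum of the per-method OAC values."""
    return sum(methods.values())


def combine(p, q):
    """Merge two classes; a method common to both is kept only once."""
    merged = dict(p)
    merged.update(q)
    return merged


# gettaxes/getweight values are from the text; the setters are hypothetical.
tractor = {"float gettaxes()": 2, "int getweight()": 1, "void sethp(int)": 1}
trailer = {"float gettaxes()": 2, "int getweight()": 1, "void setaxles(int)": 1}

common = tractor.keys() & trailer.keys()
overlap = sum(tractor[m] for m in common)      # OAC of the common methods
combined = combine(tractor, trailer)

# OAC(P+Q) = OACP + OACQ - overlap, and it is no smaller than either class.
assert class_oac(combined) == class_oac(tractor) + class_oac(trailer) - overlap
assert class_oac(combined) >= max(class_oac(tractor), class_oac(trailer))

print(class_oac(tractor), class_oac(trailer), overlap, class_oac(combined))  # 4 4 3 5
```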

Now let us assume that P and Q are two different classes with the same Operation Argument Complexity, OACP = OACQ (possible from Property 3). Extending the argument from the above paragraph, if classes P and Q, respectively, are combined with another class R, then we have OAC(P+R) = OACP + OACR - α, and OAC(Q+R) = OACQ + OACR - β. Since α and β are independent variables, they need not necessarily be equal. Therefore Property 6 (non-equivalence of interaction) is satisfied by this metric.

Metric: Attribute Complexity

It is easy to argue about which properties are satisfied using examples. In our experimental study, we had a class called “TractorTrailer” whose AC is 8, and another class “Tractor” whose AC is 4 (refer to Appendices E and F). Since ACTractorTrailer ≠ ACTractor, Property 1 is satisfied.

If we consider a class “line” (representation of a straight line), the attributes of that class could be “float x1, y1, x2, y2” (x1, y1 representing one end of the line, and x2, y2 representing the other end of the line). The AC of this class “line” would be 8. Thus we see that two classes (TractorTrailer and line) doing totally different things have the same metric value. Thus Properties 2 and 3 are also satisfied.
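The AC value of the “line” class can be reproduced with a short sketch. It assumes that Attribute Complexity is the sum of the size values of a class's data attributes, with float counted as 2 and int as 1. These size values and the function name `attribute_complexity` are assumptions consistent with the example, not the definition given in Chapter 3.

```python
# Assumed size values per attribute type (float -> 2 is implied by the
# "line" example: four float attributes giving an AC of 8).
TYPE_SIZE = {"int": 1, "float": 2}


def attribute_complexity(attributes):
    """AC of a class under the assumed definition: the summed size of
    its data attributes. `attributes` maps attribute name -> type name."""
    return sum(TYPE_SIZE[t] for t in attributes.values())


# The "line" class from the text: float x1, y1, x2, y2.
line = {"x1": "float", "y1": "float", "x2": "float", "y2": "float"}

print(attribute_complexity(line))  # 8
```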

For a given system, the choice of the number of classes, the number and type of data attributes is a design decision and independent of the functionality of the class. For example in Treatment 1 (implementing a quadrilateral), one version of design resulted in an Attribute Complexity of 16, while another version of the design resulted in an Attribute Complexity of 20 (refer to Appendix H). Therefore Property 4 (design details are important) is satisfied for this metric.

If the two classes P and Q are combined to form one class, the Attribute Complexity for that new class will depend on whether P and Q have any common data attributes. If P and Q do not have any common attributes, then the Attribute Complexity of the new class, AC(P+Q), is equal to ACP + ACQ. Therefore Property 5 (Monotonicity) is satisfied in this case. If P and Q have some common attributes, then AC(P+Q) = ACP + ACQ - α, where α is the AC of the common attributes. The maximum value α can have is the minimum of (ACP, ACQ). Therefore ACP ≤ AC(P+Q) and ACQ ≤ AC(P+Q), and Property 5 is satisfied even in this case. Therefore this metric satisfies Property 5.

Now let us assume that P and Q are two different classes with the same Attribute Complexity, ACP = ACQ (possible from Property 3). Extending the argument from the above paragraph, if classes P and Q, respectively, are combined with another class R, then we have AC(P+R) = ACP + ACR - α, and AC(Q+R) = ACQ + ACR - β. Since α and β are independent variables, they need not necessarily be equal. Therefore Property 6 (non-equivalence of interaction) is satisfied by this metric.
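A small constructed case illustrates the Property 6 argument for Attribute Complexity. Each class is modelled as a mapping from attribute name to an assumed size value (float as 2, int as 1), and a shared attribute is counted once in the combined class. The three classes and their attributes are hypothetical, chosen so that P and Q have equal AC but overlap differently with R.

```python
def ac(attributes):
    """AC of a class: the summed size values of its data attributes."""
    return sum(attributes.values())


def combine(a, b):
    """Merge two classes; an attribute common to both is kept only once."""
    return {**a, **b}


# Hypothetical classes: P and Q have the same AC but different attributes.
p = {"x1": 2, "y1": 2}                    # AC = 4
q = {"weight": 2, "axles": 1, "hp": 1}    # AC = 4
r = {"x1": 2, "y1": 2, "length": 2}       # AC = 6, shares x1 and y1 with P

ac_pr = ac(combine(p, r))   # overlap with R is 4, so 4 + 6 - 4
ac_qr = ac(combine(q, r))   # no overlap with R, so 4 + 6 - 0

print(ac(p), ac(q), ac_pr, ac_qr)  # 4 4 6 10
```

Here P and Q have the same AC, yet combining each with R gives different values, which is the situation Property 6 asks for.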

5.2 Summary of Analytical Validation

We can summarize the analytical validation process by saying that all four OO design complexity metrics (Interaction Level, Interface Size, Operation Argument Complexity, and Attribute Complexity) satisfy the relevant properties as specified by Weyuker. Weyuker’s seventh and ninth properties are not considered for OO design metrics.

5.3 The Experimental Task

This section describes the experimental procedure. As described in the previous chapter, a controlled laboratory experiment is used to validate the metrics. The laboratory experiment is the preferred mode of research because it gives us better control to manipulate the variables of interest and to control other variables that are not of main interest in the study. In a field study, it would be difficult to obtain such control.

This study investigates the impact of system design complexity on the time to perform a given maintenance task. The system design complexity variable is measured by Interaction Level (IL), Interface Size (IS), Operation Argument Complexity (OAC) and Attribute Complexity (AC). These metrics are discussed in detail in Chapter 3. For the purpose of the experiment, each complexity variable is set at two levels: Low Complexity and High Complexity.

The subjects participating in this research are students taking the “Advanced Object-Oriented Programming” undergraduate course at a university. As a prerequisite for this course, students must have successfully completed at least the “Introduction to OO Programming” course. Each subject is given two independent treatments.

The first treatment involves a system called “Quadrilateral” (refer to Appendices A through C). The subjects are required to perform a perfective maintenance task on this system. This task involves adding new functionality to the system, such as computing the area and perimeter of the quadrilateral. Two versions of this system are designed, a low complexity version and a high complexity version. Subjects from some sections are assigned to work on the low complexity version, and subjects from some other sections are assigned to work on the high complexity version.