DbViZ – Group 1
CS329, HW 2, 2/20/03

Brian Sidharta – bsidhart

Ross Paul – rosspaul

Aleksandra Faust – afaust

Saurabh Gandotra – gandotra

1 - Discrepancy from the average number of non-commenting source statements (NCSS) per class and per method

A

This metric counts the NCSS for each method and class in the project, computes the average number of statements per method and per class, and reports each item's difference from that average. The results are more meaningful than raw lines of code because the baseline comes from the project itself and is insensitive to coding style, language, amount of comments, etc. If this metric flags a large method, the method is large relative to the project, not large relative to some external statistical norm.

This metric is easy to gather, and it will be used for quick detection of very large and very small methods and classes. Unusually large methods can point us to complex code that needs attention and refactoring. Similarly, unusually small methods can point us to methods that are either unnecessary or not yet implemented.

This metric is important because of its simplicity and ease of use. It is an efficient way to find potential troublemakers and potential omissions; more sophisticated metrics can then be applied to the suspects it identifies.

Additionally, tracking the average number of statements per method and class over time can show the project's progress.
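The difference-from-average (DA) computation described above can be sketched as follows; the method names and NCSS counts here are illustrative, not taken from the project's actual report:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the DA computation: each method's NCSS minus the project-wide average.
public class NcssDeviation {
    public static Map<String, Double> deviations(Map<String, Integer> ncssByMethod) {
        // Average NCSS across all methods in the project.
        double avg = ncssByMethod.values().stream()
                .mapToInt(Integer::intValue).average().orElse(0.0);
        Map<String, Double> da = new LinkedHashMap<>();
        ncssByMethod.forEach((method, ncss) -> da.put(method, ncss - avg));
        return da;
    }

    public static void main(String[] args) {
        Map<String, Integer> ncss = new LinkedHashMap<>();
        ncss.put("importSchema()", 69); // invented sample values
        ncss.put("getName()", 1);
        ncss.put("paint()", 5);
        // Average is 25, so importSchema() shows a DA of +44.
        System.out.println(deviations(ncss));
    }
}
```

Sorting the resulting map by DA in descending order reproduces the ranking our XSL stylesheet displays.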

B

We used JavaNCSS in conjunction with XSL to gather this metric. The tool can save its results in XML format. We created an XSL stylesheet that displays each item's difference from the average (DA) and sorts the methods and classes in descending order by DA, which made the results easy to analyze.

To calculate the number of non-commenting source statements, the tool counts all package, import, class, interface, field, method, and constructor declarations, constructor invocations, statements, and labels. Empty statements, blocks, and comments are not counted.

C

The analysis of the results showed that the average class size is 50.82 NCSS, with the vast majority of classes falling into the 20-100 NCSS range. One class stands out, though: edu.uiuc.jdbv.diagram.TableView with 220 NCSS, almost double the second-largest class.

On the other side of the range is edu.uiuc.jdbv.imports.ImportDefinition with a mere 5 statements, less than half the size of the runner-up. This indicates that this class might not be complete yet.

Looking at the number of statements per method, the average NCSS is 5.85, close to one tenth of the per-class figure. What is striking, though, is the number of one-statement methods: there are 66 of them, and 204 of the project's 252 methods fall below the average NCSS. This is normal and expected. More interesting is the other end of the spectrum: edu.uiuc.jdbv.imports.sql.SqlImportDefinition.importSchema() with 69 statements, almost 15 times the size of an average method. This led us to believe that it is a complex method that probably needs refactoring. In fact, this method contains a prototype of the SQL parser and is due to be replaced with more structured and cleaner code in the next release.

Since the tool also computes Cyclomatic Complexity at the method level, it was interesting to observe that the NCSS results roughly matched those of CCN. It would be interesting to compare per-class CCN totals with the NCSS results.

2 - Number of methods per class

A

This metric counts the number of methods in each class in the project and compares it with the average number of methods per class. The goal is to discover large and potentially hard-to-understand classes.

This metric is easy to gather, and it will be used for quick detection of very large and very small classes. Classes with an unusually large number of methods can point us to complex code with too large a responsibility. Similarly, unusually small classes can point us to design flaws, such as poorly thought-through classes.

As above, this metric is important because of its simplicity and ease of use. It is an efficient way to find potential troublemakers and potential omissions; more sophisticated metrics can then be applied to the suspects it identifies.
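For a single class already loaded in the JVM, the same count can be obtained quickly with reflection; JavaNCSS works from source instead, but the idea is the same. This is a minimal sketch, not how the tool is implemented:

```java
// Counting declared methods of a class via reflection.
// Both public and private methods are included, as in the metric above.
public class MethodCount {
    public static int declaredMethods(Class<?> cls) {
        // getDeclaredMethods() returns methods of every visibility,
        // but not inherited ones.
        return cls.getDeclaredMethods().length;
    }

    public static void main(String[] args) {
        // MethodCount itself declares exactly two methods.
        System.out.println(declaredMethods(MethodCount.class));
    }
}
```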

B

As with the previous metric, we used JavaNCSS in conjunction with XSL. The tool saves its results in XML format, and our XSL stylesheet displays each class's difference from the average (DA) and sorts the classes in descending order by DA. The results were easy to analyze.

To calculate the number of methods per class, the tool counts all method declarations within a class. Both public and private methods are counted.

C

The analysis of the results showed that the average number of methods per class is 7.2 and that the counts are fairly evenly distributed. The only class that stands out is edu.uiuc.jdbv.diagram.TableRender with 18 methods. There were also 6 classes with only 2 methods; these may need a closer look to determine whether they are true classes.

3 & 4 - Cyclomatic Complexity per Method and Class

A

Cyclomatic Complexity is a simple and telling metric of the psychological complexity of a method or class. Effectively, it measures the number of different branches within a given method or class. For a method, the complexity is computed by totaling the number of conditionals and adding one (the one accounts for the initial path the method would take without any conditionals). Class Cyclomatic Complexity is simply the total of the method complexities within the class.
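As a hypothetical illustration of this counting rule (the method below is invented for the example, not taken from the project):

```java
// CCN = number of decision points + 1.
// clamp() has two if-branches, so its complexity is 2 + 1 = 3.
public class CcnExample {
    static int clamp(int v, int lo, int hi) {
        if (v < lo) return lo; // decision point 1
        if (v > hi) return hi; // decision point 2
        return v;              // the single straight-line path
    }

    public static void main(String[] args) {
        System.out.println(clamp(15, 0, 10)); // prints 10
    }
}
```

A method with no conditionals at all would score the minimum of 1.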

Tests for complexity will be run at the beginning of each iteration. Cyclomatic complexity is most useful for identifying code that could benefit from refactoring. By reworking and simplifying complex methods and classes, maintaining the system will be easier and learning it far less confusing.

B

Class and method Cyclomatic Complexity can be measured using Panorama for Java. Despite its horrendous user interface, Panorama takes very detailed and useful measurements that can easily be rerun once set up. Panorama works by analyzing the source code (it also has features to analyze compiled classes, but we did not explore these options because of the deplorable HCI). It then displays the complexities in various ways: tables, bar graphs, report form, etc. It also calculates averages, standard deviations, and other statistics.

C

Class Cyclomatic Complexity Average: 10.85

While most classes fell within the range of 5-20, the results picked out the classes that were unreasonably high: namely, TableRenderer at 27 and, surprisingly, an inner class of TableView called SizeHandle at 47. TableRenderer is a GUI component and most of its branches are simple choices about color or display state, so it is reasonable for its complexity to be high. While SizeHandle is also a GUI class, it should be refactored and broken down. At the very least it should be promoted from inner-class status to a full-fledged class.

Method Cyclomatic Complexity Average: 1.64

The average was not surprising or in any way revealing; however, Panorama was able to locate the methods that are overly complex. Specifically, three methods had a complexity over 10: TableView$SizeHandle.computeBounds(), TableView$SizeHandle.mouseDragged(), and SqlImportDefinition.importSchema(). The two TableView methods obviously cause the high complexity of that class, a problem already identified by the class-level metric. The interesting finding is importSchema(). This method is responsible for converting a SQL file into an internal representation of a relational database. Its complexity is caused by layers and layers of conditionals mapping all possible paths a SQL statement could take. This can and will be changed to a more object-oriented approach in which SQL statements are broken down into their components and the components are converted individually.

5 & 6 - Efferent and Afferent Couplings

A

Efferent and Afferent Coupling metrics will be calculated on the compiled Java class files. A package's Efferent Couplings are the packages that its classes depend upon; its Afferent Couplings are the packages that depend on its classes. These two metrics describe how independent a package is and how likely changes in it are to affect other packages.

We will record these metrics at the end of each development iteration on all packages of the system. The resulting metrics are compared with the values that team members estimated when considering the design architecture. These metrics are useful because they reveal unexpected package dependencies that appeared during design and implementation, which will assist in optimizing the design and architecture of the system.
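The two couplings can be illustrated on a toy package-dependency map. The package names below mirror the project layout, but the edges are invented for the example and do not reproduce JDepend's actual output:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Toy illustration of efferent (Ce) and afferent (Ca) couplings
// computed from a map of "package -> packages it depends on".
public class Couplings {
    // Ce: how many packages this package depends on.
    static int efferent(Map<String, Set<String>> deps, String pkg) {
        return deps.getOrDefault(pkg, Set.of()).size();
    }

    // Ca: how many packages depend on this package.
    static int afferent(Map<String, Set<String>> deps, String pkg) {
        int ca = 0;
        for (Map.Entry<String, Set<String>> e : deps.entrySet())
            if (e.getValue().contains(pkg)) ca++;
        return ca;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> deps = new HashMap<>();
        deps.put("jdbv", Set.of("diagram", "gui", "schema"));
        deps.put("diagram", Set.of("schema"));
        deps.put("gui", Set.of("jdbv", "diagram", "schema"));
        deps.put("schema", Set.of());
        // schema depends on nothing (Ce=0) but three packages depend on it (Ca=3).
        System.out.println("schema Ce=" + efferent(deps, "schema")
                + " Ca=" + afferent(deps, "schema"));
    }
}
```

A package with Ce of zero and a high Ca, like schema here, is the stable, independent kind of package the design aims for.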

B

The Efferent and Afferent Couplings are measured using JDepend. JDepend analyzes all of the compiled Java class files to compute metrics on the relationships between packages. It does this by examining the class files and recording all method calls between classes. The results can be displayed in a GUI that lists each package in the system along with its Efferent and Afferent Couplings, in addition to several other metrics.

C

The initial run of JDepend on the current software revealed some unexpected dependencies. There are six packages in the system:

  • jdbv – the main application starting point. As the central starting point for the application, jdbv depends on most of the other packages; this is expected and considered acceptable. As for its afferent couplings, it is used only by gui and imports, which is also expected and acceptable.
  • diagram – diagram modeling classes. Diagram depends only on the schema package, excluding external packages. This is expected and quite acceptable, as it allows the general diagram classes to be used in a different application. It is used by jdbv and gui which is expected.
  • gui – general application GUI classes. Like jdbv, as the classes in gui apply to the entire application, it depends on all the other packages. Some thought should be put into combining gui and jdbv, since they share many similar couplings. Additionally, the couplings in gui are frequently cyclic; we will consider how we can break some of these cycles.
  • schema – general classes that model a database schema. It has no efferent couplings, which makes it very independent, which is by design. Its afferent dependencies are quite numerous, which is not unexpected; reducing this number would be desirable, but may not be practical since this is a very important package in the system.
  • imports – pluggable schema-import framework classes; it defines abstract classes for schema-import operations and concrete classes for using them. Imports has dependencies on jdbv, gui, and schema. The dependency on schema is by design, but the dependencies on jdbv and gui are undesirable: as a general schema-import framework, imports should not be tied so closely to this particular application. Its afferent dependencies are few; this is expected and satisfactory.
  • imports.sql – classes for importing from SQL text files, which define concrete implementations of classes in the imports package. It depends on imports and schema, which is as designed and satisfactory. JDepend lists its afferent couplings as zero, which is not correct. This is because JDepend analyzes the class files as compiled, which do not explicitly refer to the classes in this package. This package is loaded at runtime through a configuration file. This “incorrect” result is desirable, since it confirms that this package really is “pluggable”—it can be removed from the system with no ill-effects, aside from the unavailability of the behavior that it implements.

7 - Lines of Code / # of Confirmed System Defects:

A

One of the key measurements for software is its 'size'. There are various ways to measure the size of a product (software), and one of the most often used is Lines of Code (LOC).

This metric will be used in conjunction with the number of confirmed defects to calculate the Defect Density metric, which can be defined as:

Defect Density = (# of defects discovered in code) / LOC

We will ignore the following:

  • blank lines;
  • lines consisting only of comments;
  • lines consisting only of opening and closing braces.

This metric is easy to gather and will be used frequently to gauge the quality of the software. By ignoring the lines mentioned above, some noise is removed, and comparisons at various points in the SDLC will give an idea of the progress of the development effort.
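Once the two counts are in hand, the computation itself is trivial; defect density is often scaled to defects per thousand lines (KLOC). The figures below are illustrative, not measurements from the project:

```java
// Defect density as defined above: confirmed defects divided by LOC,
// scaled here to defects per KLOC for readability.
public class DefectDensity {
    static double perKloc(int defects, int linesOfCode) {
        return defects * 1000.0 / linesOfCode;
    }

    public static void main(String[] args) {
        // 12 confirmed defects in 4000 LOC -> 3.0 defects per KLOC.
        System.out.println(perKloc(12, 4000));
    }
}
```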

B

The tool we will be using for this metric is 'UNDERSTAND FOR JAVA' (UFJ), a metrics tool for Java source code. UFJ works much like a compiler in that it needs the source code and the path to the .jar files.

As explained in Part A, we will count the number of lines of code using UFJ. In addition, we will use Bugzilla to maintain bug reports, which will be incorporated into the Defect Density calculation.

C

The metric calculation will commence after we move from the development phase into the testing phase.

Issues:

Though the Defect Density metric is one of the most common metrics used to gauge the 'quality of software', it has a number of issues:

a. It may reflect the thoroughness of testing rather than the quality of the software.

b. Different organizations count LOC differently, which makes it difficult to compare the quality of different software products.

8 - # of Versions of a File:

A

The number of versions of a file can be used to measure the change of requirements (the volatility) of the software. Volatility is the percentage of requirements changed in a given period of time; it is computed as the number of changed requirements divided by the baseline requirement count, and can be assessed by looking through the various versions of a file.
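The volatility ratio just described works out as follows; the requirement counts are invented for illustration:

```java
// Requirements volatility: changed requirements in a period
// as a percentage of the baseline requirement count.
public class Volatility {
    static double percentChanged(int changed, int baseline) {
        return 100.0 * changed / baseline;
    }

    public static void main(String[] args) {
        // 5 of 40 baseline requirements changed -> 12.5% volatility.
        System.out.println(percentChanged(5, 40));
    }
}
```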

B

We will be using CVS as a configuration management tool for tracking file versions.

C

As we are still in the first iteration, most of the files (documents and code) are in their first version. This metric will be calculated as we move through iterations.