Measuring Levels of Abstraction in Software Development
Frank Tsui, Abdolrashid Gharaat, Sheryl Duggins, Edward Jung
School of Computing and Software Engineering
Southern Polytechnic State University
Marietta, Georgia, USA
Abstract – In software engineering and development, we are expected to utilize the technique of abstraction. Yet, this is one of the most confounding topics, and there is a dearth of guidelines. In this paper we explore and enhance the concepts of abstraction as applied to software engineering, define and discuss a metric called levels-of-abstraction, LOA, and discuss the potential of utilizing this metric in software engineering.
Keywords: software, abstraction, levels of abstraction
I. Introduction
In developing software from requirements we are often faced with a “first step” syndrome: where should one start? Many high-level architectural styles and patterns [1] have helped overcome this initial hurdle. However, we are still faced with further analysis of the requirements to group similar requirements together, along functional lines or along data usage lines, into sub-components. The question that faces many developers during this early stage of requirements analysis, design and coding is what the granularity level of these major components should be. Stated another way, the question is what the appropriate level of abstraction is for specifying, designing and implementing the functional requirements.
In this paper we first explore the general notion of abstraction and enhance the concept to include levels-of-abstraction as it applies to software engineering. Next we propose a general metric for levels-of-abstraction, LOA, which allows us to gauge the amount of abstraction. Finally, we show some characteristics of LOA and how one may decide on the appropriate level of abstraction when developing software. This is a report on the current status of our research in this area. The research is showing some promise in explaining and formulating guidelines for the amount and depth of abstraction required in performing different software engineering activities.
II. General Concepts Related to Abstractions in Software Engineering
One of the fundamental reasons for engaging in the task of abstraction in software analysis, design and development is to reduce the complexity to a certain level so that the “relevant” aspects of the requirements, design and development may be easily articulated and understood. This starts with the requirements definition and continues through the actual code implementation. In the three major activities of requirements definition, design definition, and implementation code definition, there are different degrees of abstraction based on what is considered relevant and necessary. The general relationships of the individual world domains, the abstractions of those domain entities, and the artifacts specifying those abstractions are shown in Figure 1. The bold, vertical arrows represent the intra-transformations occurring within each individual world domain of requirements, design, and implementation. The horizontal arrows represent the inter-transformations occurring across those domains. In software engineering, we are concerned with both the horizontal and the vertical transformations.
In this paper we focus on the vertical intra-transformations, especially the transformations from the individual world domains to the abstractions.
The term abstraction, used as a verb and represented by the bold vertical arrows in Figure 1 from the individual world domains to the abstractions, includes the notion of simplification. Simplification represents the concept of categorizing and grouping domain entities into components and relating those components. We simplify by:
(i) reduction and
(ii) generalization.
By reduction, we mean the elimination of the details. Generalization, on the other hand, is the identification and specification of common and important characteristics. Through these two specific subtasks of reduction and generalization we carry out the task of abstraction. Rugaber [7] states that design abstraction is a “unit of design vocabulary that subsumes more detailed information.” This notion of abstraction via simplification is also similar to the notion explicated by Wagner and Deissenboeck [9] and Kramer [3]. Thus via the process of abstraction, we aim to simplify or decrease the complexity of the domain of software design solution by reducing the details and by generalization. One may view the well-established concept of modularization in software engineering as a specific technique within the broader context of abstraction.
The employment of abstraction in software engineering, where we aim to reduce complexity, is certainly not limited to the domain of software design. As Figure 1 shows, many other domains also employ abstraction. At the early stage of software development, the requirements represent the needs and wants of the users and customers. The user requirement is represented in some form of the user business flow, user functional needs, user information flow, user information representation, user information interface, etc. Each of these categories is an abstraction of the user world. Different degrees of abstraction may be employed [3, 4, 8, 11] depending on the amount of detail that needs to be portrayed in the requirements. This intra-transformation is represented by the vertical arrow from the User Needs/Wants domain to Requirements Models of Abstraction in Figure 1.
As we move from requirements towards the development of the solution for the user requirements, a new form of abstraction takes place. The new form of abstraction is necessary because the solution world includes the computing machines and other logical constructs that may not exist in the original user requirements. One of the first artifacts from the solution side is the software architecture and high-level design of the software system. At this point, the abstraction, once again, should not include all the details. The inter-transformation of the requirements models of abstraction to the design models of abstraction is shown as the horizontal arrow in Figure 1. Note that the box labeled “Design Models of Abstraction” is the result of two types of transformations:
a) inter-transformation from the requirements domain and
b) intra-transformation within the design solution domain.
The “Implementation Models of Abstraction” box in Figure 1 represents the packaging/loading models of the processes, information and control of the mechanical execution of the solution in the form of source code to satisfy the functionalities and the system attributes described in the requirements and design documents. Thus it is a result of the inter-transformations from requirements models of abstraction through the design models of abstraction and the intra-transformations from the actual software execution domain.
The “Implementation Code” box is the specification of this abstraction. The actual execution of the Implementation Code and the interactions with the users formulate the final deployment of the source code abstraction or the “Executing Software System” box in Figure 1. Thus the employment of various abstractions and the transformations of these abstract entities are crucial, integral parts of software engineering.
III. Measuring Abstraction
Abstraction, both as a verb and as a noun, is a crucial element in software engineering. As a verb, we have defined it as the activity of simplification, composed of reduction of details and the generalization of crucial and common attributes. Now, it is relevant to ask how much abstraction would be appropriate so that we can arrive at the “Implementation Code” box and the “Executing Software System” box in Figure 1.
Jackson [2] admonishes us that we need to be careful with abstraction and the degree of abstraction because so many seemingly good designs fall apart at implementation time. His warning is well founded in that many design abstractions, the noun, are often missing some vital information for the detailed coding activities. In the past, we have utilized the technique of decomposition to move from abstraction to details. However, if our abstraction generalizes so much that it omits the vital information, then Jackson’s warning will turn into reality. The levels of abstraction should be different for various software artifacts and be dictated by the purpose of abstraction. Wang [10] has expressed a similar concern and defined a Hierarchical Abstraction Model for software engineering; his hierarchical model of abstraction describes the necessary levels of preciseness in representing abstractions of different objects. He argues for more precise, rigorous and formal ways to describe different software artifacts such as software architecture and design. In terms of our Figure 1, Wang addressed the issue of rigor of specifications of abstraction in the requirement and design documents, not how much should be included in the abstraction.
The how much, or the amount, of abstraction is a reflection of the result of the simplification activity, which is in turn composed of reduction and generalization activities. Measuring the amount of abstraction is gauging the extent of reduction and generalization that took place. For example, this may be possible in the requirements domain. We may consider the set of the original requirements statements of needs and wants as X in the “User Needs/Wants” box in Figure 1. Then |X|, the cardinality of X, is a count of the raw requirement statements collected through some solicitation process. These are the pre-analysis requirements statements. We then designate Y as the set of statements in the “Requirements Models of Abstraction” box. The cardinality of Y, |Y|, is a count of the statements that resulted from requirements analysis, which includes activities such as organizing, grouping, prioritizing, etc. In other words, the post-analysis of the solicited requirements is a form of abstraction of the raw user needs and wants requirement statements. Then the “difference” between |X| and |Y| is:
(Level-of-Abstraction)REQ = |X| - |Y|.
(Level-of-Abstraction)REQ represents the “difference” between pre-analysis and post-analysis of requirements, and it may be considered a metric of abstraction for requirements.
Since simplification is a vital characteristic of abstraction, we expect |Y| to be less than |X|. Thus we will need to further refine this definition with the constraint that if |Y| is not less than |X|, then no abstraction activity really took place. We will also take the subscript, REQ, off the terminology for the general case. In general, let X be the statements in the domain world, and let Y be the set of statements in the abstraction, the noun, then
\[
\text{Level-of-Abstraction (LOA)} =
\begin{cases}
|X| - |Y|, & \text{if } |X| > |Y| \\
0, & \text{otherwise}
\end{cases}
\]
Note that when the abstraction activity is carried to its extreme, |Y| should just be 1. For example, all the raw requirement statements of needs and wants are abstracted into one abstract statement. Thus Level-of-Abstraction is bounded by (|X| - 1) and 0.
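The definition and its bounds can be sketched in a few lines of Python. This is only an illustration: the function name `level_of_abstraction` and the sample statements are assumptions made for the example, and statements are modeled simply as strings.

```python
def level_of_abstraction(domain_statements, abstraction_statements):
    """Return LOA = |X| - |Y| when |X| > |Y|, and 0 otherwise.

    Both arguments are treated as sets, so duplicate statements count once.
    """
    size_x = len(set(domain_statements))
    size_y = len(set(abstraction_statements))
    return size_x - size_y if size_x > size_y else 0


# Hypothetical raw statements (X) and two possible abstractions of them (Y).
raw = ["add item", "remove item", "list items", "print receipt"]
grouped = ["manage items", "print receipt"]
single = ["run the shop"]

loa_grouped = level_of_abstraction(raw, grouped)   # |X| - |Y| = 4 - 2
loa_extreme = level_of_abstraction(raw, single)    # upper bound |X| - 1
loa_none = level_of_abstraction(grouped, raw)      # |Y| >= |X|, so 0
```

With these inputs the extreme case of a single abstract statement reaches the upper bound |X| - 1, and an "abstraction" that is no smaller than its domain is reported as 0.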
This also implies that Level-of-Abstraction is heavily influenced by |X|. Consider two domain world representations, X1 and X2, where |X1| > |X2|, with their respective abstractions of Y1 and Y2. If |Y1| = |Y2|, then |X1| - |Y1| > |X2| - |Y2|. Thus Level-of-Abstraction states that more abstraction, the verb, took place in the X1 to Y1 scenario than in the X2 to Y2 case.
A more interesting consideration is the situation where we have two different abstractions, Y1 and Y2, from the same world domain, X. Assume |Y1| < |Y2|, then (|X| - |Y1|) > (|X| - |Y2|). Thus, as we would expect, Level-of-Abstraction from X to Y1 is more than that of X to Y2.
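As a worked illustration of the two comparisons above, the following uses cardinalities chosen purely for the example; they are assumptions, not measurements from any project.

```latex
\begin{align*}
&\text{Different domains, same abstraction size: }
  |X_1| = 10,\; |X_2| = 6,\; |Y_1| = |Y_2| = 3 \\
&\qquad |X_1| - |Y_1| = 10 - 3 = 7 \;>\; |X_2| - |Y_2| = 6 - 3 = 3 \\[4pt]
&\text{Same domain, different abstraction sizes: }
  |X| = 10,\; |Y_1| = 2,\; |Y_2| = 5 \\
&\qquad |X| - |Y_1| = 10 - 2 = 8 \;>\; |X| - |Y_2| = 10 - 5 = 5
\end{align*}
```

In both cases only the ordering of the resulting values is meaningful, in line with the ordinal nature of the metric.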
While we are not certain of the real interval size between one Level-of-Abstraction and its immediate next level, we can say that Level-of-Abstraction is an ordinal metric that provides us a sense of ordering.
IV. Measuring Requirements Abstraction
In this section we will further explore the Level-of-Abstraction measurement concept, using requirements analysis as an example. Note that it is very likely that X and Y are not expressed with the same language. English sentences and some diagrams may be the main ingredients of the wants and needs expressed by the users and customers. The result of requirements prioritization, categorization and analysis is some form of abstraction, Y, which may be expressed with a Use Case Diagram. The amount of requirements abstraction defined as the “difference” between pre and post analysis of requirements deserves some further explanation and illustration here.
Suppose there is a set X = {x1, x2, ..., xz} of raw requirement statements. The set X may contain a variety of statements, referring to functionality, data and other attributes. A common type of abstraction that may be employed is to categorize and group X by functionality. Thus only a subset of X, X’, is addressed. X’ is the subset of requirement statements that addresses functionality needs. Let X’ = {xx1, xx2, ..., xxn}, where |X’| ≤ |X|. The subset X’ is analyzed and partitioned into some set of categories of functionalities. Let us call this partitioned set Y. Y may look as follows.
Y = {(xx1, xx2); (xx3, xx5, xx10); ...}
Every functionality xxi ∈ X’ is in one, and only one, of the partitions of Y. Renaming the groups of the partitioned set Y yields:
Y = {y1, y2, ..., yk}
where
y1 = (xx1, xx2)
y2 = (xx3, xx5, xx10)
...
yk.
The set Y may be represented by a Use Case diagram where y1, y2, ..., yk are the named interactions represented as “bubbles” in the Use Case diagram. Clearly there is more than one way to partition X’; thus there may be different Y’s. A use case diagram with one “bubble” would be an extreme case, as would a use case diagram with a bubble for each xxi. The extreme points of |Y| = 1 and |Y| = |X’| would be very rare.
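The requirement that every functionality statement falls in exactly one group can be checked mechanically. The sketch below is illustrative only; the helper name `is_partition` and the sample identifiers are assumptions, with each group modeled as a tuple of statement labels.

```python
def is_partition(groups, functional_statements):
    """Check that every statement appears in exactly one non-empty group."""
    seen = []
    for group in groups:
        if not group:            # an empty group is not a valid category
            return False
        seen.extend(group)
    no_overlap = len(seen) == len(set(seen))
    full_cover = set(seen) == set(functional_statements)
    return no_overlap and full_cover


# Hypothetical functional subset X' and three candidate groupings Y.
x_prime = ["xx1", "xx2", "xx3", "xx4", "xx5"]
valid = [("xx1", "xx2"), ("xx3", "xx5"), ("xx4",)]
overlapping = [("xx1", "xx2"), ("xx2", "xx3", "xx5"), ("xx4",)]
incomplete = [("xx1", "xx2"), ("xx3", "xx5")]
```

Here `valid` qualifies as a partitioned set, while `overlapping` places xx2 in two groups and `incomplete` omits xx4, so neither would serve as a Y in the sense used above.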
Now consider a specific case where the domain set X has 4 functional requirements x1, x2, x3, x4. Then there are the following partitioning sets, P’s, for the different functional abstractions, Y’s.
P0 has Y01 = {(x1); (x2); (x3); (x4)} = X
P1 has Y11 = {(x1); (x2); (x3,x4)},
       Y12 = {(x1); (x3); (x2,x4)},
       Y13 = {(x1); (x4); (x2,x3)},
       Y14 = {(x2); (x3); (x1,x4)},
       Y15 = {(x2); (x4); (x1,x3)}, and
       Y16 = {(x3); (x4); (x1,x2)}
P2 has Y21 = {(x1); (x2,x3,x4)},
       Y22 = {(x2); (x1,x3,x4)},
       Y23 = {(x3); (x1,x2,x4)}, and
       Y24 = {(x4); (x1,x2,x3)}
P3 has Y31 = {(x1,x2); (x3,x4)},
       Y32 = {(x1,x3); (x2,x4)}, and
       Y33 = {(x1,x4); (x2,x3)}
P4 has Y41 = {(x1,x2,x3,x4)}
At P0, there is only one abstraction, Y01, which is the same as the original requirement set X. So Level-of-Abstraction is |X| - |Y| = 0. There is no abstraction at P0. At P1, any of the Y1x has a cardinality of 3. So at P1, |X| - |Y| = 4 - 3 = 1. The Level-of-Abstraction is 1. At P2, the Y2x’s are grouped differently and each has a cardinality of 2. Thus, at P2, |X| - |Y| = 4 - 2 = 2. P3 partitioning has the functionalities grouped differently, but each Y3x has a cardinality of 2, just like those in P2. At P3, |X| - |Y| = 4 - 2 = 2. Thus all partitions in P2 and in P3 are at the same Level-of-Abstraction, 2. Finally at P4, there is again only one abstraction, Y41, which combines all four functionalities into one category. Thus at P4, |X| - |Y| = 4 - 1 = 3. The Level-of-Abstraction is the highest here.
From this example, one can easily see that abstracting functionalities into groups offers many choices even for a relatively small set of four functional requirements. In this case there are 15 choices and there are 4 different Levels-of-Abstraction, namely 0 through 3. Coming up with the one “best” abstraction is not an easy task even with this small example.
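The count of 15 choices across Levels-of-Abstraction 0 through 3 can be reproduced by enumerating every partition of the four requirements. The following is a minimal sketch; the function names `set_partitions` and `loa_histogram` are assumptions introduced for this illustration.

```python
from collections import Counter


def set_partitions(items):
    """Yield every partition of `items` as a list of groups (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partial in set_partitions(rest):
        # Place `first` into each existing group in turn ...
        for i in range(len(partial)):
            yield partial[:i] + [[first] + partial[i]] + partial[i + 1:]
        # ... or give it a group of its own.
        yield [[first]] + partial


def loa_histogram(items):
    """Map each Level-of-Abstraction |X| - |Y| to its number of partitions."""
    counts = Counter(len(items) - len(p) for p in set_partitions(items))
    return dict(sorted(counts.items()))


requirements = ["x1", "x2", "x3", "x4"]
histogram = loa_histogram(requirements)
total = sum(histogram.values())
```

For the four requirements, `histogram` is `{0: 1, 1: 6, 2: 7, 3: 1}` and `total` is 15: one abstraction at P0, six at P1, seven at level 2 (the four of P2 plus the three of P3), and one at P4.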
V. General Theorems
We have shown that for a |Y| = k, there may be several different abstractions with that same cardinality. That is, given an abstraction level j, there may be more than one solution.
Theorem 1: For Level-of-Abstraction = |X| - |Y| = j, there exists more than one abstraction, or partitioned set, Y at that Level-of-Abstraction, unless j = 0 or j = |X| - 1.
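Proof: Given |X| - |Y| = j, if j = 0 then |X| = |Y| and there is no abstraction, so Y = X is the only partitioned set. If j = |X| - 1, then Y has only one category with all functionalities included, so |X| - |Y| = |X| - 1, which is the highest level of abstraction. From combinatorics, we know that for any set S with n > 0 elements and an integer r such that 0 ≤ r ≤ n, the number of subsets of S that contain r elements is
\[ \binom{n}{r} = \frac{n!}{r!\,(n-r)!}. \]
Hence for all the values of j in between, there will be more than one partitioned set. For instance, at j = 1 exactly two elements of X are grouped together, and that pair can be chosen in n!/(2!(n-2)!) ways, which exceeds 1 whenever n > 2.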
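Theorem 1 can also be checked numerically for small domains. The sketch below counts partitioned sets with the standard recurrence for Stirling numbers of the second kind, a counting tool that the proof above does not name and that is brought in here only for verification; the function names are assumptions.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of ways to partition an n-element set into k non-empty groups."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    # The n-th element either joins one of k existing groups
    # or forms a new group on its own.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def abstractions_at_level(n, j):
    """Count partitioned sets Y with |X| = n and |X| - |Y| = j."""
    return stirling2(n, n - j)


counts_for_four = [abstractions_at_level(4, j) for j in range(4)]
theorem_holds = all(
    abstractions_at_level(n, j) > 1
    for n in range(3, 9)
    for j in range(1, n - 1)
)
```

For |X| = 4 this gives the counts 1, 6, 7, 1 for j = 0 through 3, and `theorem_holds` confirms that every intermediate level has more than one partitioned set for domains of 3 to 8 statements. This is a spot check over a small range, not a substitute for the proof.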
Note that the different Levels-of-Abstraction shown in our simple example of P0 through P4 form a partially ordered set. Next we introduce an Up and a Down operator on Level-of-Abstraction. Given a Level-of-Abstraction, Px, and a Level-of-Abstraction, Py, and any x ∈ Px and y ∈ Py, the Up and Down operators are defined as follows:
Up(x,y) = x if Py ≤ Px, and
        = y otherwise.
Down(x,y) = y if Py ≤ Px, and
          = x otherwise.
Theorem 2: The set of Levels-of-Abstraction, with the operators Up and Down, forms a lattice structure.
Proof: The set of Levels-of-Abstraction is a partially ordered set, or poset. A lattice is a poset in which any two elements have a least upper bound (lub) and a greatest lower bound (glb). Since Up(x,y) returns the element at the higher Level-of-Abstraction and Down(x,y) the element at the lower one, the Up operator provides the lub and the Down operator provides the glb. Thus the set of Levels-of-Abstraction forms a lattice.
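The two operators can be sketched directly from their definitions. In this illustration an abstraction is modeled as a tuple of groups and its level is computed as |X| - |Y|; the names `level`, `up`, `down` and the explicit `domain_size` parameter are assumptions made for the example.

```python
def level(partition, domain_size):
    """Level-of-Abstraction of a partitioned set: |X| - |Y|, never below 0."""
    return max(domain_size - len(partition), 0)


def up(x, y, domain_size):
    """Return x if level(y) <= level(x), and y otherwise."""
    return x if level(y, domain_size) <= level(x, domain_size) else y


def down(x, y, domain_size):
    """Return y if level(y) <= level(x), and x otherwise."""
    return y if level(y, domain_size) <= level(x, domain_size) else x


# Two abstractions of X = {x1, x2, x3, x4} from the earlier example.
y41 = (("x1", "x2", "x3", "x4"),)              # level 3
y11 = (("x1",), ("x2",), ("x3", "x4"))         # level 1

higher = up(y41, y11, 4)
lower = down(y41, y11, 4)
```

Here `higher` is Y41 and `lower` is Y11, regardless of argument order. When two abstractions share a level, as with members of P2 and P3, Up returns its first argument and Down its second, so on the levels themselves the operators behave like maximum and minimum.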
VI. Summary and Results
In this paper we explored the general notion of abstraction as applied to software engineering. We further proposed a general metric for levels-of-abstraction, LOA, which allows us to gauge the amount of abstraction. Lastly, we showed some characteristics of LOA and indicated how one may decide on the appropriate level of abstraction when developing software.
This paper reports our preliminary work in this area. Our next task is to further explore the properties of our Up and Down operators. For example, we believe that for the Up operator, the glb is the identity element. With it, the set of LOA forms a Group. With the Down operator, the identity element is the lub. Thus with both Up and Down and their respective identity elements, the set of LOA may be a Ring. We will further explore the question of how, when moving Up and Down in LOA, one knows which level gives the best coverage of abstraction. Our initial idea is that the top level of abstraction may not be the best in terms of design.
References
[1] D. Garlan and M. Shaw, “An Introduction to Software Architecture,” CMU-CS-94-166, Carnegie Mellon University, Pittsburgh, PA, 1994.
[2] D. Jackson, Software Abstractions: Logic, Language, and Analysis, MIT Press, 2006.
[3] J. Kramer, “Is Abstraction the Key to Computing?” Communications of the ACM, Vol. 50, No. 4, April 2007, pp. 37-42.
[4] J. Kramer and O. Hazzan, “Introduction to The Role of Abstraction in Software Engineering,” International Workshop on Role of Abstraction in Software Engineering, Shanghai, China, May 2006.
[5] S. Morasca, “On the Use of Weighted Sums in the Definition of Measures,” Workshop on Emerging Trends in Software Metrics, Cape Town, South Africa, May 2010.
[6] D. E. Perry, “Large Abstractions for Software Engineering,” 2nd International Workshop on Role of Abstraction in Software Engineering, Leipzig, Germany, May 2008.
[7] S. Rugaber, “Cataloging Design Abstractions,” International Workshop on Role of Abstraction in Software Engineering, Shanghai, China, May 2006.
[8] F. Tsui and O. Karam, Essentials of Software Engineering, 2nd edition, Jones and Bartlett Publishers, 2010.
[9] S. Wagner and F. Deissenboeck, “Abstractness, Specificity, and Complexity in Software Design,” 2nd International Workshop on Role of Abstraction in Software Engineering, Leipzig, Germany, May 2008.
[10] Y. Wang, “A Hierarchical Abstraction Model for Software Engineering,” 2nd International Workshop on Role of Abstraction in Software Engineering, Leipzig, Germany, May 2008.
[11] F. Tsui, A. Gharaat, S. Duggins and E. Jung, “Measuring Levels of Abstraction in Software Development,” Internal Research Report, Software Engineering, Southern Polytechnic State University, July 2010.