Virtualization in Software Engineering

Dennis Kalinowski

Software Engineering

University of Wisconsin – Platteville

Abstract

In software engineering, abstraction hides technical detail through encapsulation. Applying these techniques to a computer's resources is known as virtualization. The resulting virtualization can represent either multiple physical resources as a single logical resource or a single physical resource as multiple logical resources.

Virtualization can be applied to a wide array of resources such as servers, operating systems, applications, and storage devices. As a result, there are many applications for virtualization technology. The discussion that follows addresses its application in the design of more secure, efficient, and reliable software.

Introduction

Virtualization is the application of abstraction to hide the technical detail of a computer's resources through encapsulation. These resources can be software (servers, operating systems, applications, etc.) or hardware (storage devices, processors, etc.). The resulting virtualization can either expose multiple physical resources as a single logical resource or expose a single physical resource as multiple logical resources. [1]
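The two directions of mapping described above can be illustrated with a minimal sketch. The class names Disk, SpannedVolume, and Partition are hypothetical and chosen only for illustration; they do not correspond to any particular product or to the cited sources.

```python
class Disk:
    """Hypothetical physical resource: a fixed number of storage blocks."""

    def __init__(self, num_blocks):
        self.blocks = [None] * num_blocks


class SpannedVolume:
    """Many physical disks exposed as one logical range of blocks."""

    def __init__(self, disks):
        self.disks = disks

    def _locate(self, block):
        if block < 0:
            raise IndexError("logical block out of range")
        # Walk the disks until the logical block number falls inside one.
        for disk in self.disks:
            if block < len(disk.blocks):
                return disk, block
            block -= len(disk.blocks)
        raise IndexError("logical block out of range")

    def write(self, block, data):
        disk, offset = self._locate(block)
        disk.blocks[offset] = data

    def read(self, block):
        disk, offset = self._locate(block)
        return disk.blocks[offset]


class Partition:
    """One physical disk exposed as one of several logical disks."""

    def __init__(self, disk, start, length):
        self.disk, self.start, self.length = disk, start, length

    def _check(self, block):
        if not 0 <= block < self.length:
            raise IndexError("block outside partition")

    def write(self, block, data):
        self._check(block)
        self.disk.blocks[self.start + block] = data

    def read(self, block):
        self._check(block)
        return self.disk.blocks[self.start + block]
```

In both cases the software using the logical resource never learns how many physical devices sit underneath it.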

The use of virtualization in software allows for improvements in security, performance, and reliability. This paper focuses on virtualizing hardware, operating systems, and processes. Additionally, virtualization can be used as a tool during software development by providing environments that are ideal for development and testing.

Forms of Virtualization

The concept of virtualization has been applied to solve numerous problems in computing, and both hardware and software can take advantage of it. The concept itself has been around for decades. In the early 1970s, the IBM 370 mainframe series used virtualization to run multiple operating systems on a single mainframe. In this case, the hardware was virtualized, allowing each operating system to operate as if it had exclusive control of the hardware. [2]

This form of virtualization eventually faded away as operating systems came to support concurrent tasks. In recent years, however, interest in virtualizing computers has been renewed, as is evident from recent hardware innovations intended to improve the performance of virtualization software. [2]

Not only does software employ the concepts of virtualization, but hardware can be designed with the same concepts. For example, Intel's Hyper-Threading Technology makes a single physical processor appear as multiple logical processors. [3] In yet another hardware application, research is being done to make two physical processors appear as a single logical processor. In such an arrangement, a single thread would have multiple instructions executing simultaneously, resulting in “automatic parallelization”. [4]

Since there are a variety of areas which can be covered, only a few aspects of virtualization will be addressed. The following forms of virtualization will be covered: hardware, operating systems, applications, and processes.

Hardware Virtualization

The description of the IBM 370 mainframe given above is an example of machine virtualization. A Virtual Machine Monitor (VMM) – also known as a hypervisor – allocates system resources for each virtualized operating system or program that runs on top of it. It acts as a layer between the software and the hardware. All attempts to access the hardware are handled by the hypervisor, which can translate or forward actions. [5]
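As a rough illustration of this layering, the following sketch routes every guest hardware access through a single handler that either forwards or translates it. The Hypervisor class, its operation names ("read_clock", "write_console", "halt"), and the guest names are all hypothetical; a real monitor intercepts privileged instructions rather than string-named operations.

```python
class Hypervisor:
    """Toy monitor (illustrative only): all guest hardware access goes through handle()."""

    def __init__(self):
        self.clock = 0        # stands in for a shared hardware clock
        self.consoles = {}    # guest name -> private virtual console buffer
        self.halted = set()   # guests that have asked to halt

    def add_guest(self, name):
        self.consoles[name] = []

    def handle(self, guest, op, *args):
        if guest not in self.consoles:
            raise KeyError(f"unknown guest {guest}")
        if op == "read_clock":
            # Forwarded: the guest sees the real value unchanged.
            return self.clock
        if op == "write_console":
            # Translated: output lands in the guest's own virtual device.
            self.consoles[guest].append(args[0])
            return None
        if op == "halt":
            # Translated: only the guest stops, not the physical machine.
            self.halted.add(guest)
            return None
        raise PermissionError(f"{guest} attempted unsupported operation {op!r}")
```

Because each guest only ever talks to handle(), two guests can both believe they own the console without interfering with one another.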

The hypervisor can take two forms: software that sits directly on the hardware or software that runs within an operating system. In the case of the latter, the “host OS” is the operating system that controls the hardware and the “guest OS” is the operating system running on the virtualized hardware. [10] Three types of hardware virtualization are emulation/simulation, native virtualization, and paravirtualization. [1]

In emulation/simulation, the virtual machine simulates the complete hardware. This allows an unmodified operating system to run, whether it was written for the same CPU or a different one. [1]

In native virtualization, the virtual machine simulates only enough hardware for an unmodified operating system to run in isolation. The operating system must be written for the same CPU as the host. [1]

In paravirtualization, the virtual machine provides an API that the operating system running under it must be modified to use. [1]

Since virtual machines separate the guest OS from the actual hardware, it is possible to alter or observe the system as it executes. Furthermore, the state of the system can be saved as an image, which allows the system to be duplicated.

Virtualized computers are not the only form of hardware virtualization. As discussed earlier, processors can be virtualized into more or fewer logical processors. Yet another form of hardware virtualization is the abstraction of storage devices as file systems. Rather than requiring programs to be written with complete knowledge of a specific storage device, a file system offers an alternative to direct access to storage. Memory has also been virtualized within operating systems in the form of virtual memory. [6]
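Virtual memory is perhaps the easiest of these to sketch. The example below assumes a simple single-level page table with a 4096-byte page size; the function name translate and the dictionary representation are illustrative assumptions, not a description of any particular operating system.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def translate(page_table, virtual_addr):
    """Map a virtual address to a physical one using a page-number dictionary."""
    page, offset = divmod(virtual_addr, PAGE_SIZE)
    if page not in page_table:
        # A real system would trap to the operating system here.
        raise LookupError(f"page fault at virtual page {page}")
    return page_table[page] * PAGE_SIZE + offset


# Virtual page 0 lives in physical frame 7, virtual page 1 in frame 2.
table = {0: 7, 1: 2}
physical = translate(table, 4100)  # page 1, offset 4 -> 2 * 4096 + 4 = 8196
```

The program only ever sees contiguous virtual addresses, while the frames backing them may be scattered anywhere in physical memory.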

Portability

One of the benefits of hardware abstraction is portability. Hiding the details of hardware from software eliminates the need to reprogram the software for new hardware. The only part that requires change is the interface to the hardware, that is, the virtualization layer.
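A small sketch of this idea follows. The Display interface and the two device classes are hypothetical, and their behaviors (a 16-character panel, a terminal that needs a carriage return and line feed) are assumptions made only so the two devices differ.

```python
from abc import ABC, abstractmethod


class Display(ABC):
    """Hypothetical virtualization layer: the only hardware interface the application sees."""

    @abstractmethod
    def show(self, text):
        """Send text to the device and return what was actually sent."""


class LcdPanel(Display):
    def show(self, text):
        return text[:16]        # assumed: the panel holds 16 characters


class SerialTerminal(Display):
    def show(self, text):
        return text + "\r\n"    # assumed: the terminal needs CR LF line endings


def report_status(display, ok):
    # Application code: unchanged no matter which device sits underneath.
    return display.show("status: ok" if ok else "status: failed")
```

Supporting a new device means writing one more Display subclass; report_status itself is never touched.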

Security

Another benefit of virtualization is the ability to create isolated environments. When vulnerable software is contained within its own virtual machine, any intrusion is restricted to that virtual machine, which shields the rest of the software running on the computer. [7]

Also, since the operating system executing within the isolated environment is being run by the hypervisor, it is possible for the hypervisor to closely examine the operating system. As such, execution of specific routines can be observed and may be used as indicators of intrusion. At this point, the hypervisor can take appropriate action. [7]
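The monitoring idea can be sketched as follows, assuming the hypervisor is able to notify a monitor each time a guest enters a routine. The class names, the on_routine_entry hook, and the routine names used in the example are hypothetical; pausing the guest is shown as only one possible response.

```python
class MonitoredGuest:
    """Hypothetical handle to a guest operating system."""

    def __init__(self, name):
        self.name = name
        self.paused = False


class IntrusionMonitor:
    def __init__(self, watched_routines):
        self.watched = set(watched_routines)  # routines treated as intrusion indicators
        self.alerts = []

    def on_routine_entry(self, guest, routine):
        """Assumed to be called by the hypervisor whenever a guest enters a routine."""
        if routine in self.watched:
            self.alerts.append((guest.name, routine))
            guest.paused = True  # one possible action: freeze the guest for inspection


monitor = IntrusionMonitor({"load_kernel_module"})  # hypothetical routine name
guest = MonitoredGuest("web_server_vm")
monitor.on_routine_entry(guest, "read_file")
monitor.on_routine_entry(guest, "load_kernel_module")
```

Because the monitor runs outside the guest, software inside a compromised guest cannot simply switch it off.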

Performance

If a computer's resources are not being fully utilized, virtualization can allow multiple operating systems or processes to share them. [8]

As mentioned earlier, the IBM 370 mainframe used a hypervisor to run multiple operating systems. One of the main motivations for this technology was to maintain backward compatibility with old software that ran on older operating systems. These older operating systems may not have supported threaded environments, so they lacked the efficiency of newer operating systems that did. However, by placing multiple operating systems on top of the hypervisor, the processor could be more fully utilized. Efficiency thus improved, and more tasks could be executed simultaneously. [2]

Reliability

Virtualization can improve reliability by enabling easy recovery and hardware redundancy. After a failure or catastrophic loss, a previously created image of the system's pristine or last good state can be reloaded, allowing the system to recover. [9]

Using these ideas, it is possible to create fault tolerant computers. Every critical hardware component can have a redundant counterpart that takes over if the primary fails. For example, a computer can be designed with a backup CPU. If the primary CPU has a fault, memory can be copied over to the secondary CPU's memory. Then, execution of the program can continue. Meanwhile, the primary CPU can be powered down, repaired, and started again. Once the primary CPU is back up, memory can be copied back to the primary CPU and execution of the program resumed there. [9]
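The failover sequence described above can be sketched as follows. This is a simplified model under stated assumptions: each Cpu object carries its own memory as a dictionary, a fault is signaled by an exception, and the class and method names are hypothetical rather than taken from the cited design.

```python
class Cpu:
    def __init__(self, name):
        self.name = name
        self.memory = {}
        self.healthy = True

    def step(self):
        if not self.healthy:
            raise RuntimeError(f"{self.name} has faulted")
        # The "program" in this sketch just increments a counter.
        self.memory["counter"] = self.memory.get("counter", 0) + 1


class FaultTolerantComputer:
    def __init__(self):
        self.primary = Cpu("primary")
        self.secondary = Cpu("secondary")
        self.active = self.primary

    def _switch_to(self, target):
        # Copy memory across so the program resumes where it left off.
        target.memory = dict(self.active.memory)
        self.active = target

    def step(self):
        try:
            self.active.step()
        except RuntimeError:
            if self.active is not self.primary:
                raise  # no further spare in this sketch
            self._switch_to(self.secondary)
            self.active.step()

    def primary_repaired(self):
        # Called once the primary has been repaired and restarted.
        self.primary.healthy = True
        self._switch_to(self.primary)
```

The program calling step() never needs to know which CPU is currently doing the work.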

Redundancy is not limited to hardware. Multiple instances of software running on identical virtual machines can share the load. If one of the virtual machines fails, the others can still function and be used. As a result, down time is minimized and high availability can be achieved. [8]
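A minimal sketch of this arrangement follows, assuming requests are handed out in round-robin order and that a failed instance signals its failure with an exception. The Instance and LoadSharingPool names are hypothetical.

```python
class Instance:
    """Hypothetical software instance running in its own virtual machine."""

    def __init__(self, name):
        self.name = name
        self.up = True

    def handle(self, request):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class LoadSharingPool:
    def __init__(self, instances):
        self.instances = instances
        self._next = 0

    def dispatch(self, request):
        # Try each instance at most once, starting after the last one used.
        for _ in range(len(self.instances)):
            instance = self.instances[self._next]
            self._next = (self._next + 1) % len(self.instances)
            try:
                return instance.handle(request)
            except ConnectionError:
                continue  # skip the failed instance and try the next
        raise RuntimeError("no instance available")
```

While every instance is up the pool spreads the load; when one fails, requests simply flow to the survivors.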

Software Development

Virtualization also provides many benefits during the development and testing of software. It can be used to improve the portability of software, to recreate testing scenarios, or to provide development environments.

When designing software for platforms or environments different from the developer's machine, virtualization can be used as a tool to build or test the software. Alternatively, if developers do not want to contaminate their own systems, builds and tests can be done within an isolated virtualization environment. These approaches improve the portability of software and ease development. [8]

By setting up software within a virtualization environment, configuration can be kept consistent. Also, to minimize software dependencies, the virtualization environment can be configured with a minimal setup. This approach allows developers to practice the KISS (Keep It Simple, Stupid) principle. [8]

Distributed software that runs multiple instances across several machines can be expensive to set up physically. Rather than having multiple machines for such tests, it is possible to put all instances on a single machine. Using virtualization, multiple virtualization environments can reside on a single machine. This greatly reduces costs and allows tight control of the testing environments. [11]

As mentioned before, the systems can be altered or observed during execution – allowing great debugging potential. Also, since the state of the machine can be saved and restored, the exact environment and states necessary to reproduce issues or tests can be made available.

Consider the example of distributed software: the server and clients in a networked application. Each virtual machine can execute either a server or a client of the software being developed. One or more machines can be used to simulate all clients and servers in order to execute manual or automatic tests. After the desired tests are done, the systems can be reverted to their initial state. [8]

Deployment of software on varying environments or platforms can also be tested through use of virtual machines. For example, if the software has a complex upgrade process, an initial installation can be loaded from an image. Then, the upgrade can be executed on that system to test the process. After the test is executed, the system can be restored to its initial state. [8]

If the software being developed is an operating system or a kernel, the testing and debugging of the software could be done within a virtual machine. This is ideal, since a real machine does not need to be sacrificed and execution of the system can be monitored and traced. [8]

Of course, virtualization as a development tool can be useful for even less complicated setups. The testing environments can be configured for single applications. To drive testing, automated test suites can be executed to verify functionality of software. [8]

Operating System Virtualization

Operating system virtualization provides virtualization of servers through a layer provided by the operating system. These “containers” separate the server from other processes or resources, so access can be restricted to specific hardware devices or file systems. There is no isolation from the kernel, so security vulnerabilities of the host OS can be exploited. However, protecting specific resources and other processes may be all the security required for applications of this form of virtualization. [12]

Additionally, this form of virtualization allows multiple instances of the same server to operate in differently configured environments. So, multiple system environments that share the same kernel can be created on the same machine. [12]

Application Virtualization

Application virtualization is the attempt at “breaking the age old bond between physical hardware, operating system, and the program which runs on top of them”. [14] One example of such virtualization is the family of programming languages known as High-Level Languages (HLL). [6]

Programs written for platforms such as Java and Microsoft's Common Language Infrastructure (CLI) run on top of HLL Virtual Machines. The intermediate code generated by their compilers gets translated into native machine code using “just in time” technology. Thus, the application needs to be compiled only once for all architectures. Only new implementations of the HLL Virtual Machine are needed for new architectures. [6]
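The principle of compiling once and running on any machine that has the virtual machine can be shown with a toy stack machine. Note the simplification: this sketch interprets its intermediate code one instruction at a time rather than translating it to native code "just in time", and the three-opcode instruction set (PUSH, ADD, MUL) is invented for illustration.

```python
def run(bytecode):
    """Execute a list of (opcode, operand...) tuples on a simple stack machine."""
    stack = []
    for op, *args in bytecode:
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack.pop()


# Intermediate code for (2 + 3) * 4, produced once by a hypothetical compiler.
program = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)]
result = run(program)  # 20
```

The program list is independent of the host architecture; only run() would need to be reimplemented for a new one.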

Process Virtualization

Another application of virtualization technology is process virtualization. One implementation, known as ghost processes, provides an easy way to duplicate processes, migrate processes, create process checkpoints, and revert to checkpoints. [13]

Similar to the discussion of virtual machines for software development, this form of virtualization can be used for software testing or debugging. Saving and restoring processes allows easier control of reproducing problems or scenarios.

Additionally, by setting checkpoints, it is possible to improve the reliability of a process. If a failure occurs, the process can be reverted to a good state. Then, attempts can be made to prevent the failure from recurring.
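Checkpointing and reverting can be sketched at a much smaller scale than a real operating system process. The example below is not the ghost process mechanism of [13]; it assumes a hypothetical Worker object whose entire state is one list, serialized with Python's standard json module.

```python
import json


class Worker:
    """Hypothetical stand-in for a process; its whole state is the 'processed' list."""

    def __init__(self, processed=None):
        self.processed = list(processed or [])

    def process(self, item):
        self.processed.append(item)        # state is modified first...
        if item < 0:
            raise ValueError("bad input")  # ...so a failure leaves bad state behind


def checkpoint(worker):
    # Serialize the state so it can be stored, copied, or sent to another machine.
    return json.dumps({"processed": worker.processed})


def restore(image):
    # Build a fresh worker from a saved image.
    return Worker(json.loads(image)["processed"])


worker = Worker()
worker.process(1)
worker.process(2)
image = checkpoint(worker)
try:
    worker.process(-1)
except ValueError:
    worker = restore(image)  # revert to the last good state
```

The same image can be restored more than once, which gives duplication, or restored on a different machine, which gives migration.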

Conclusion

In conclusion, virtualization has many benefits to offer software engineering. It is a form of abstraction used to simplify the use of a computer's resources. Through this abstraction, resources are represented in simpler or different forms. As a result, this new layer can be used to improve software or as a development tool.

The benefits of using virtualization include portability, security, performance, and reliability. Portability is greatly enhanced by separating hardware implementation details from software. Only the virtualization layer between the software and hardware requires modifications.

The layer created also provides opportunities for security. Isolated environments can be created to restrict access or the spread of vulnerabilities. Additionally, since the system is being run by the hypervisor, the system can be closely monitored to identify intrusions.

Using virtualization to manage computer resources can improve resource utilization. To maximize utilization, virtualization allows resources to be shared across multiple systems or applications.

Reliability of systems and software can be improved through use of virtualization. Because virtualization can represent multiple resources as a single resource, redundancy can be provided without the executing software's knowledge. Additionally, previous states of systems can be used in restoration after failures.

Software development also benefits from using virtualization as a tool. Virtual machines provide ideal environments for varying configurations and hardware. Since images can be made from a virtual machine's state, conditions can be kept consistent and problems can be reproduced. Virtual processes offer similar capabilities.

References

[1] Virtualization. Retrieved April 2, 2007, from Wikipedia.

[2] Figueiredo, R.; Dinda, P.A.; Fortes, J. (2005, May). Guest Editors' Introduction: Resource Virtualization Renaissance. Computer. 38(5), 28-31.

[3] Hyper-threading. Retrieved April 2, 2007, from Wikipedia.

[4] Automatic parallelization. Retrieved April 2, 2007, from Wikipedia.

[5] Vaughan-Nichols, S.J. (2006, Nov.). New Approach to Virtualization Is a Lightweight. Computer. 39(11), 12-14.

[6] Smith, J.E.; Nair, R. (2005, May). The architecture of virtual machines. Computer. 38(5), 32-38.

[7] Litty, L. (2005). Hypervisor-based Intrusion Detection. Unpublished master's thesis. University of Toronto.

[8] Ligneris, B. (2005, May). Virtualization of Linux based computers: the Linux-VServer project. High Performance Computing Systems and Applications, 2005. Proceedings. 19th International Symposium. 340-346.

[9] Nakamikawa, T.; Morita, Y.; Yamaguchi, S.; Ishikawa, S.; Miyazaki, Y. (1997, Dec.). High performance fault tolerant computer and its fault recovery. Fault-Tolerant Systems, 1997. Proceedings. Pacific Rim International Symposium. 2-6.

[10] Hypervisor. Retrieved April 2, 2007, from Wikipedia.

[11] Seetharaman, S.; Krishna Murthy, B.V.S. (2006, Sept.-Oct.). Test Optimization Using Software Virtualization. IEEE Software. 23(5), 66-69.

[12] Operating system-level virtualization. Retrieved April 2, 2007, from Wikipedia.

[13] Vallee, G.; Lottiaux, R.; Margery, D.; Morin, C. (2005, July). Ghost Process: a Sound Basis to Implement Process Duplication, Migration and Checkpoint/Restart in Linux Clusters. Parallel and Distributed Computing, 2005. ISPDC 2005. The 4th International Symposium on 04-06 July 2005. 97-104.

[14] Application Virtualization. Retrieved April 2, 2007, from Wikipedia.

Kalinowski / 1 / SE411