A New Taxonomy for Parallel Java Models Applied on Cluster Computing

Mohamed Galal*, Hesham Eldeeb**, Salwa Nassar***

*National Center for Examination and Educational Evaluation (NCEEE)

**, *** Electronic Research Institute (ERI)

EGYPT

Abstract: - Efficient Java parallel programming models are needed to make it easier for parallel programmers to synchronize and exchange data between the processors of a networked cluster of PCs at different levels of parallelism. A cluster of PCs provides a scalable parallel hardware platform for high-performance computing, and parallel and distributed computing on such a cluster is increasingly applied to a variety of large computational problems. Combining the Java programming language and its advantages with cluster computing yields a powerful computational infrastructure characterized by scalability, flexibility, and improved cost and time.

In this paper, we propose a new taxonomy that is useful for describing the different models of parallelism available for writing parallel Java programs. First, we present the taxonomy models, including the usage, advantages, and disadvantages of each model. Second, we provide a set of criteria to be used as a reference for evaluating and comparing the models.

In addition, we describe the design and implementation of three parallel models and extract their features to make it easy for developers of parallel programs to choose the appropriate Java model for their applications.

Key-Words: - Java, Parallel programming, Distributed computation, Taxonomy, Cluster computing, JavaSpaces, RMI, Java sockets.


1 Introduction

Computational power from high-performance computing systems is required to solve many scientific and engineering problems. Complex parallel and distributed applications take advantage of the increased raw computational power provided by the cooperation of a multitude of computers connected by today's networks. The crucial point in building such applications is to establish the necessary communication structures between the participating processes, allowing them to synchronize and exchange data, hiding the hardware differences from the designer, and remaining simple to use.

Cluster computing is an efficient platform for running computation-heavy applications [1]. It offers several important advantages, among them better price/performance ratios, scaling to a very large number of nodes, and flexibility of configuration and upgrading.

The Java platform is based on the power of networks and the idea that the same software should run on many different kinds of computers. Since its initial commercial release in 1995, Java technology has grown in popularity and usage because of its portability. Its designers describe Java as a simple, distributed, interpreted, secure, architecture-neutral, portable, high-performance, multithreaded, and dynamic language. Some of these adjectives have also been used to describe other languages, but Java is unique in several respects: it is the first language that can be used for writing general-purpose programs as well as programs designed for use on internets and intranets, it is a completely object-oriented language, and its code runs without any change on any platform.

The main goal of this paper is to provide a new taxonomy of the parallel Java models that can be used to build parallel, distributed applications in Java, and to make a comparative study between them. The next section presents the new taxonomy levels, followed by an overview of each level. Two further sections cover the implementation of the three models and extract the features that help developers of parallel programs choose the appropriate Java model for parallelism. The last section is the discussion and conclusion.

2 The new taxonomy of parallel Java models

Multithreading in Java exploits a multiprocessor system, and shares the CPU time in the case of a single-processor system. In the case of cluster computing, Java automatically distributes the threads of multithreaded Java code over the CPUs with no programmer control, so we do not study multithreading in this paper. Instead, we study the different Java communication structures that are used as tools for building parallel and distributed applications, given the rising importance of the network. We provide a taxonomy that makes it easier for parallel programmers to select the suitable Java model for synchronizing and exchanging data between processors in a networked, distributed environment.

The standard Java libraries offer network communication via UDP and TCP/IP as well as a higher-level communication method, RMI. In addition, there is a newer technology for the networked world called Jini, which is built on top of the Java language, object serialization, and RMI. Jini enables objects to move around the network from virtual machine to virtual machine and extends the benefits of the Java language to the network.

Building parallel and distributed applications requires special programming models that explicitly handle synchronization and data exchange. The classical programming languages are based on extensions to the sequential programming paradigm, so they are not well suited to handling these problems. To exploit the potential of Java, there are three levels of parallel programming with the Java language, as shown in Fig. 1.

Fig.1 A new taxonomy for parallel Java models.

These levels and their names are chosen to form a new taxonomy for parallel programming in Java:

  1. Low programming level using Java network tools model.
  2. Basic programming level using Java RMI model.
  3. High programming level using Jini & JavaSpaces model.

3 Low level programming using Java network tools

Networks are always designed as a series of protocol layers, with each layer responsible for some aspect of the network operation.

3.1 TCP/IP & UDP protocol

TCP/IP is a well-known protocol suite that includes a rich set of application services. TCP, UDP, and IP provide the infrastructure for these application services.

Comparing TCP and UDP, UDP is a connectionless transport protocol: no connection needs to be established prior to sending data, in contrast to TCP. UDP is less reliable than TCP but faster and simpler [2]. The appropriate protocol is selected depending on the application.

3.2 Java sockets programming

Java provides a programming mechanism for network communication called sockets. A Java application is free to open a socket connection to any host it wants, open any file, and create its own custom class loaders. There are two kinds of sockets: connection-oriented sockets and connectionless sockets.

3.2.1 Connection-oriented programming

In cluster computing applications over TCP, a client program and a server program establish a connection to one another. Each program binds a socket to its end of the connection. The Socket and ServerSocket classes provided in the java.net package implement the client side and server side of the connection, respectively [3].

In fact, the Java classes significantly reduce the effort needed to create a sockets program. Each transmission mode is implemented in a separate set of Java classes. Once the socket is created and listening for connections, incoming connections are accepted and placed on the listen queue.
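The following minimal sketch illustrates this connection-oriented style, with server and client collapsed into one process for brevity; the class name, port number, and echo protocol are illustrative assumptions, not part of the original implementation.

```java
import java.io.*;
import java.net.ServerSocket;
import java.net.Socket;

// Minimal connection-oriented sketch: the server accepts one client,
// reads a line, and echoes it back. Port 5000 is an arbitrary choice.
public class TcpEchoSketch {
    static void server() throws IOException {
        try (ServerSocket listener = new ServerSocket(5000);
             Socket client = listener.accept();                       // blocks until a client connects
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream()));
             PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
            out.println("echo: " + in.readLine());
        }
    }

    static void client() throws IOException {
        try (Socket socket = new Socket("localhost", 5000);           // connect to the listening server
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()))) {
            out.println("hello");
            System.out.println(in.readLine());                        // prints "echo: hello"
        }
    }

    public static void main(String[] args) throws Exception {
        new Thread(() -> { try { server(); } catch (IOException e) { e.printStackTrace(); } }).start();
        Thread.sleep(500);    // crude wait so the server is listening before the client connects
        client();
    }
}
```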

3.2.2 Connectionless programming

Connectionless programming with the UDP protocol provides a mode of network communication whereby applications send packets of data, called datagrams, to one another. An application sends and receives DatagramPackets through a DatagramSocket [4].
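A minimal sketch of this datagram exchange is shown below; the port number and payload are illustrative assumptions, and sender and receiver are collapsed into one process for brevity.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

// Minimal connectionless sketch: send one datagram to a local receiver and read it back.
public class UdpSketch {
    public static void main(String[] args) throws Exception {
        DatagramSocket receiver = new DatagramSocket(6000);           // bound to an arbitrary port

        // Sender side: wrap the payload in a DatagramPacket addressed to the receiver.
        byte[] payload = "task-data".getBytes();
        DatagramSocket sender = new DatagramSocket();
        sender.send(new DatagramPacket(payload, payload.length,
                InetAddress.getLoopbackAddress(), 6000));

        // Receiver side: block until a packet arrives, then unwrap it.
        byte[] buffer = new byte[1024];
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        receiver.receive(packet);
        System.out.println(new String(packet.getData(), 0, packet.getLength()));

        sender.close();
        receiver.close();
    }
}
```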

In addition, connectionless programming over UDP can be used for multicasting. Most high-level network protocols provide only unicast transmission, that is, nodes of the network can only send to one other node at a time.
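As a brief illustration of the multicast case, the sketch below joins a multicast group and waits for one packet; the group address and port are arbitrary assumptions.

```java
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;

// Minimal multicast sketch: join a group and receive one packet sent by any member.
public class MulticastSketch {
    public static void main(String[] args) throws Exception {
        InetAddress group = InetAddress.getByName("230.0.0.1");      // arbitrary multicast group
        try (MulticastSocket socket = new MulticastSocket(7000)) {
            socket.joinGroup(group);                                 // subscribe to the group
            byte[] buffer = new byte[256];
            DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
            socket.receive(packet);                                  // blocks until a member sends
            System.out.println(new String(packet.getData(), 0, packet.getLength()));
            socket.leaveGroup(group);
        }
    }
}
```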

4 Basic level programming using Java RMI model

The problems with sockets essentially stem from the fact that they are very primitive: they are not part of the object model and provide only point-to-point connections. RMI is a software layer on top of sockets.

The purpose of RMI is to provide an abstraction that enables an application designer to invoke methods on remote objects instead of having to communicate at a lower level. To transfer or persist objects, the RMI API uses the serialization API to wrap and unwrap the objects [4].

4.1 Serialization

Serialization is a mechanism for converting Java objects into a platform-independent stream of bytes. When a version or type is unknown, the deserializer can use the bytecode loader to load the correct class file into the running application. Serialization and its performance are of critical importance for parallel programming.
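The sketch below shows the mechanism in isolation; the ResultEntry class and its fields are illustrative assumptions, not types from the actual implementation.

```java
import java.io.*;

// Minimal serialization sketch: a Serializable object is converted to a byte
// stream and then restored, as RMI does under the hood when passing arguments.
public class SerializationSketch {
    static class ResultEntry implements Serializable {
        private static final long serialVersionUID = 1L;
        int row;
        double[] values;
        ResultEntry(int row, double[] values) { this.row = row; this.values = values; }
    }

    public static void main(String[] args) throws Exception {
        ResultEntry original = new ResultEntry(3, new double[] {1.0, 2.0});

        // Serialize: object graph -> platform-independent byte stream.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }

        // Deserialize: byte stream -> reconstructed object.
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            ResultEntry copy = (ResultEntry) in.readObject();
            System.out.println("row " + copy.row + ", first value " + copy.values[0]);
        }
    }
}
```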

4.2 RMI Advantages

At the most basic level, RMI is Java's remote procedure call (RPC) mechanism. The primary advantages of RMI over RPC and sockets are that it is object oriented, supports mobile behavior and distributed garbage collection, suits concurrent computing, and forms a complete distributed computing solution.

4.3 RMI Architecture for cluster of PC's

The RMI architecture is based on one important principle: the definition of behavior and the implementation of that behavior are separate concepts.

RMI allows the code that defines the behavior and the code that implements the behavior to remain separate and to run on separate Java virtual machines.

The RMI implementation is essentially built from three abstraction layers. The first is the stub and skeleton layer, which intercepts method calls made by the client on the interface reference variable and redirects these calls to a remote RMI service. The next layer is the remote reference layer, which understands how to interpret and manage references made from clients to the remote service objects [5].

The third layer, the transport layer, is based on TCP/IP connections between the machines in the network. There is no need to design an application-level protocol; RMI takes care of the low-level communication details.
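A minimal sketch of these layers in use is given below: the interface defines the behavior, the exported object implements it, and the registry hands a stub to clients. The Adder service name, the port, and running server and client in one JVM are illustrative assumptions.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Minimal RMI sketch: the definition of behavior (interface) is separate from its
// implementation, and calls travel through the stub/reference/transport layers.
public class RmiSketch {

    public interface Adder extends Remote {
        double add(double x, double y) throws RemoteException;    // remote calls can fail remotely
    }

    public static class AdderImpl extends UnicastRemoteObject implements Adder {
        protected AdderImpl() throws RemoteException { super(); } // exporting may itself fail
        @Override public double add(double x, double y) { return x + y; }
    }

    public static void main(String[] args) throws Exception {
        // Server side: export the implementation and bind it in the registry.
        Registry registry = LocateRegistry.createRegistry(1099);
        registry.rebind("Adder", new AdderImpl());

        // Client side (normally another JVM or machine): look up the stub and invoke it.
        Adder adder = (Adder) LocateRegistry.getRegistry("localhost", 1099).lookup("Adder");
        System.out.println(adder.add(2.0, 3.0));                  // prints 5.0
    }
}
```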

5 High level programming using Jini & JavaSpaces for cluster of PC's

Jini extends RMI to provide services to a network on the cluster. Jini provides a new registry called the Jini Lookup Service. It has several advantages compared with RMI: entries are leased rather than bound, the Jini Lookup Service is discovered dynamically, and services are searched for by interface rather than by name [6]. The most important Jini service of interest for a cluster computing environment is JavaSpaces.

5.1 JavaSpaces services

JavaSpaces are object repositories in which any platform architecture can share objects. JavaSpaces technology provides a programming model that views applications as collections of processes cooperating via the flow of objects into and out of one or more spaces. A space is a shared, network-accessible repository for objects [7]. It provides associative lookup of persistent objects and maintains security through transactions.

Using a space-based implementation allows transacting executable content across the network. This decouples the semantics of distributed computing from the semantics of the problem domain. Loose coupling in this case means that the two elements can be managed and developed independently. This eliminates issues around multithreaded server implementation, low-level synchronization, and network communication protocols.

5.2 Application Model and Terms

A JavaSpaces service holds entries. An entry is a typed group of objects, expressed in a class for the Java platform. The essential methods of the JavaSpaces API, supplied by any object that implements the JavaSpace interface, are write, read, and take. In addition, the JavaSpace interface supplies two other methods, notify and snapshot, which can be useful in many applications [7].
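The following sketch shows the entry/write/take style against the standard JavaSpace interface; obtaining the space proxy normally goes through Jini lookup and is omitted here, and the TaskEntry class with its fields is an illustrative assumption.

```java
import net.jini.core.entry.Entry;
import net.jini.core.lease.Lease;
import net.jini.space.JavaSpace;

// Minimal JavaSpaces sketch: a master writes a task entry into the space and a
// worker takes any matching entry. Requires the Jini/JavaSpaces libraries.
public class SpaceSketch {

    // Entries expose public object-typed fields so the space can match templates,
    // and need a public no-argument constructor.
    public static class TaskEntry implements Entry {
        public Integer rowIndex;
        public double[] rowOfA;
        public TaskEntry() { }
        public TaskEntry(Integer rowIndex, double[] rowOfA) {
            this.rowIndex = rowIndex;
            this.rowOfA = rowOfA;
        }
    }

    static void writeAndTake(JavaSpace space) throws Exception {
        // Master: write one task entry into the space.
        space.write(new TaskEntry(0, new double[] {1.0, 2.0}), null, Lease.FOREVER);

        // Worker: take any TaskEntry (null fields in the template act as wildcards).
        TaskEntry template = new TaskEntry();
        TaskEntry task = (TaskEntry) space.take(template, null, Long.MAX_VALUE);
        System.out.println("took row " + task.rowIndex);
    }
}
```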

5.3 JavaSpaces Advantages

In JavaSpaces applications, processes do not communicate directly, but instead coordinate their activities by exchanging objects through a space [8].

JavaSpaces technology leverages features that come with Java objects, such as strong typing, mobile code, and secure execution. A space offers many advantages: it is shared, persistent, associative, and transactionally secure, and it allows the exchange of executable content.

6 Implementation of the different taxonomy levels on a cluster of PCs

Matrix multiplication is a building block of many matrix operations and of solutions to many problems, such as graph theory problems and neural network systems. Many papers provide insight into why this problem is complex. Practical algorithms that use matrix multiplication tend to use differently shaped matrices, and a parallel implementation of matrix-matrix multiplication can significantly impact their performance.

In particular, we consider in this paper the problem of developing code to compute C = A·B, where A, B, and C are dense matrices of size N×N.

This matrix-matrix multiplication involves O(N^3) operations, since for each element c_ij of C we must compute

c_{ij} = \sum_{k=1}^{N} a_{ik} b_{kj}    (1)

The data is distributed into tasks so that one row of A, together with B, is allocated to a single task; task n then generates row n of the result matrix C.
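A short sequential sketch of this row-wise task, assuming illustrative names, makes the decomposition concrete: the inner loops are exactly formula (1), and in the cluster versions each call to computeRow runs on a worker.

```java
// Sketch of the per-task computation shared by all three implementations:
// a task holds one row of A plus all of B and produces the matching row of C.
public class RowTaskSketch {
    static double[] computeRow(double[] rowOfA, double[][] b) {
        int n = b[0].length;
        double[] rowOfC = new double[n];
        for (int j = 0; j < n; j++)
            for (int k = 0; k < rowOfA.length; k++)
                rowOfC[j] += rowOfA[k] * b[k][j];    // c_ij = sum_k a_ik * b_kj
        return rowOfC;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        for (int i = 0; i < a.length; i++) {
            double[] row = computeRow(a[i], b);
            System.out.println(row[0] + " " + row[1]);   // prints 19.0 22.0 then 43.0 50.0
        }
    }
}
```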

We designed programs that distribute the parallel multiplication of A and B into C using three models: Java sockets, Java RMI, and Jini & JavaSpaces. We fixed the number of processors at five: one machine acts as the master, creating and distributing the tasks over four worker machines, which perform the task computation and return their results to the master. Only the size N of the matrices is changed.

6.1 Implementation of the low-level program using the Java sockets model

In this paper, we choose TCP for its reliability, multiplexing, and connections. Moreover, an implementation using the TCP protocol is more suitable for comparison with the RMI and JavaSpaces implementations. To communicate over TCP, a worker program and a master program establish a connection to one another. Each program binds a socket to its end of the connection. The Socket and ServerSocket classes provided in the java.net package implement the worker side and master side of the connection, respectively.

The master program gets the number of workers and N from the user, then generates tasks with random integers for both matrices A and B. After that, it creates a ServerSocket and waits for workers to connect to it. Once a worker program connects, the master creates a thread to deal with that worker, sending it a task and waiting for its result, while it keeps listening for the other workers to connect. A shared object local to the master program holds the outputs collected from each thread to form matrix C.

On the other hand, the worker program tries to connect to the master's ServerSocket; it then synchronizes with the master's thread to receive the task, converts the data types, computes the task, and returns the result to that thread in the master program.
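A condensed sketch of this master/worker exchange, reduced to a single row-sized task, is shown below; the port number, the primitive-by-primitive message layout, and running both roles in one process are illustrative assumptions.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

// Condensed master/worker socket sketch for one task: the master sends the task
// data, the worker computes and replies. Real code loops over N rows and workers.
public class SocketMasterWorkerSketch {

    // Master side: accept a worker, hand it the task data, wait for the result.
    static void master(double[] rowOfA) throws IOException {
        try (ServerSocket listener = new ServerSocket(9000);
             Socket worker = listener.accept();
             DataOutputStream out = new DataOutputStream(worker.getOutputStream());
             DataInputStream in = new DataInputStream(worker.getInputStream())) {
            out.writeInt(rowOfA.length);                  // primitive-by-primitive marshalling:
            for (double v : rowOfA) out.writeDouble(v);   // sockets move bytes, not objects
            out.flush();
            System.out.println("master got " + in.readDouble());
        }
    }

    // Worker side: connect, read the task, compute, send the result back.
    static void worker() throws IOException {
        try (Socket socket = new Socket("localhost", 9000);
             DataInputStream in = new DataInputStream(socket.getInputStream());
             DataOutputStream out = new DataOutputStream(socket.getOutputStream())) {
            int n = in.readInt();
            double sum = 0;
            for (int k = 0; k < n; k++) sum += in.readDouble();   // stand-in for the row computation
            out.writeDouble(sum);
            out.flush();
        }
    }

    public static void main(String[] args) throws Exception {
        Thread m = new Thread(() -> {
            try { master(new double[] {1.0, 2.0, 3.0}); } catch (IOException e) { e.printStackTrace(); }
        });
        m.start();
        Thread.sleep(500);   // crude wait so the master is listening first (see design notes below)
        worker();
        m.join();            // the master thread prints "master got 6.0"
    }
}
```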

Design notes

  • Synchronization between master and worker is critical: if the worker starts before the master, or while the master is still generating the matrices and not yet listening on the port, the connection fails.
  • A lot of work goes into converting data from one type to another, because a socket cannot transfer all types of data.
  • After every send or receive between master and worker, the sender waits for the receiver to reply before moving to the next step.
  • The code written to do this job is very low level, manages everything by hand, and the worker is dedicated to one special task only.
  • All details about the workers, the master's location, and the port numbers must be known a priori.

Sockets cannot transfer objects directly, which is at odds with our use of an object-oriented programming language. Distributed objects are a potentially powerful tool that has only become broadly available to developers at large in the past few years. The power of distributed objects lies in the fact that an object is not only a complex data type but also executable content. From this point, we can design a new tool for parallel and distributed programming called a compute server (not a new term, but we redesign it with Java RMI and JavaSpaces).

Compute Server

Historically, compute servers have been used to take advantage of the resources of large farms of processors in order to tackle computationally intensive problems like ray tracing, weather modeling, cryptography, and the so-called "grand challenge" problems. Here we design general-purpose compute servers in both RMI and JavaSpaces that allow us to compute any task and to add new tasks at will, with the server automatically incorporating their corresponding code.

6.2 Implementation of the basic level using the Java RMI model to design a compute server

We designed a general compute server using RMI that accepts tasks from the master program, executes these tasks, and returns the results. RMI users usually place a remote object on a server, called the RMI server, which executes a method invoked by any client through the RMI mechanism. In our design this style is reversed: every client (worker) runs as a compute server that waits for its local method runTask() to be invoked remotely by the master. A Task object defines the interface between the compute server and the work it needs to do, providing the way to start that work. The implementation of the Task object is downloaded by RMI into the compute server's virtual machine when needed.
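A sketch of the two interfaces behind this reversed design is given below. The Task and runTask names follow the text; the generic parameter and the RowTask example are illustrative assumptions rather than the exact classes of the original implementation.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;

// Sketch of the compute-server contracts: a serializable unit of work and the
// remote interface each worker exports so the master can invoke runTask on it.
public class ComputeServerSketch {

    // The unit of work: serializable so its data (and, with a codebase, its code)
    // can be shipped to the worker's virtual machine.
    public interface Task<T> extends Serializable {
        T run();
    }

    // Each worker exports this interface; the master calls runTask remotely.
    public interface ComputeServer extends Remote {
        <T> T runTask(Task<T> task) throws RemoteException;
    }

    // Example task: compute one row of C from one row of A and all of B.
    public static class RowTask implements Task<double[]> {
        private final double[] rowOfA;
        private final double[][] b;

        public RowTask(double[] rowOfA, double[][] b) {
            this.rowOfA = rowOfA;
            this.b = b;
        }

        @Override
        public double[] run() {
            double[] rowOfC = new double[b[0].length];
            for (int j = 0; j < b[0].length; j++)
                for (int k = 0; k < rowOfA.length; k++)
                    rowOfC[j] += rowOfA[k] * b[k][j];
            return rowOfC;
        }
    }
}
```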