A
SEMINAR TOPIC
ON
CLUSTER COMPUTING
BY
K.V.SATYANARAYANA
CLUSTER COMPUTING
Introduction:
A computer cluster is a group of linked computers, working together closely so that in many respects they form a single computer. The components of a cluster are commonly, but not always, connected to each other through fast local area networks. Clusters are usually deployed to improve performance and/or availability over that of a single computer, while typically being much more cost-effective than single computers of comparable speed or availability.
High Performance Computing (HPC) allows scientists and engineers to deal with very complex problems using fast computer hardware and specialized software. Since these problems often require hundreds or even thousands of processor hours to complete, an approach based on the use of supercomputers has traditionally been adopted. The recent tremendous increase in the speed of PC-type computers opens up a relatively cheap and scalable route to HPC through cluster technologies. Conventional MPP (Massively Parallel Processing) supercomputers are oriented toward the very high end of performance. As a result, they are relatively expensive and require special, and also expensive, maintenance support. Better understanding of applications and algorithms, as well as significant improvements in communication network technologies and processor speeds, led to the emergence of a new class of systems, called clusters of SMPs (symmetric multiprocessors) or networks of workstations (NOW), which are able to compete in performance with MPPs and have excellent price/performance ratios for particular application types.
A cluster is a group of independent computers working together as a single system to ensure that mission-critical applications and resources are as highly available as possible. The group is managed as a single system, shares a common namespace, and is specifically designed to tolerate component failures and to support the addition or removal of components in a way that is transparent to users.
What is cluster computing?
Development of new materials and production processes based on high technologies requires the solution of increasingly complex computational problems. However, even as computer power, data storage, and communication speed continue to improve exponentially, available computational resources often fail to keep up with what users demand of them. Therefore, a high-performance computing (HPC) infrastructure becomes a critical resource for research and development as well as for many business applications. Traditionally, HPC applications were oriented toward the use of high-end computer systems, so-called "supercomputers". Before considering the amazing progress in this field, some attention should be paid to the classification of existing computer architectures.
SISD (Single Instruction stream, Single Data stream) type computers. These are the conventional systems that contain one central processing unit (CPU) and hence can accommodate one instruction stream that is executed serially. Nowadays many large mainframes may have more than one CPU, but each of these executes instruction streams that are unrelated. Therefore, such systems should still be regarded as a set of SISD machines acting on different data spaces. Examples of SISD machines are most workstations, like those of DEC, IBM, Hewlett-Packard, and Sun Microsystems, as well as most personal computers.
SIMD (Single Instruction stream, Multiple Data stream) type computers. Such systems often have a large number of processing units that all execute the same instruction on different data in lock-step. Thus, a single instruction manipulates many data items in parallel. Examples of SIMD machines are the CPP DAP Gamma II and the Alenia Quadrics.
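The following C fragment is only an illustrative sketch of the data-parallel pattern described above: the same arithmetic operation is applied to every element of the arrays, so a SIMD or vector unit can process many elements per clock cycle, while a SISD machine steps through the iterations one at a time. The array size and values are arbitrary.

```c
#include <stdio.h>

#define N 8

int main(void)
{
    float a[N], b[N], c[N];

    /* set up some sample data */
    for (int i = 0; i < N; i++) {
        a[i] = (float)i;
        b[i] = 2.0f * (float)i;
    }

    /* one instruction stream, many data items: every iteration performs
     * the same multiply-add, so the loop maps naturally onto SIMD hardware */
    for (int i = 0; i < N; i++)
        c[i] = 2.0f * a[i] + b[i];

    for (int i = 0; i < N; i++)
        printf("c[%d] = %.1f\n", i, c[i]);

    return 0;
}
```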
Vector processors, a subclass of the SIMD systems. Vector processors act on arrays of similar data rather than on single data items, using specially structured CPUs. When data can be manipulated by these vector units, results can be delivered at a rate of one, two, and, in special cases, three results per clock cycle (a clock cycle being defined as the basic internal unit of time for the system). So vector processors execute their data in an almost parallel way, but only when running in vector mode; in this case they are several times faster than when executing in conventional scalar mode. For practical purposes vector processors are therefore mostly regarded as SIMD machines. Examples of such systems are the Cray 1 and the Hitachi S3600.
MIMD (Multiple Instruction stream, Multiple Data stream) type computers. These machines execute several instruction streams in parallel on different data. The difference from the multiprocessor SISD machines mentioned above is that the instructions and data are related, because they represent different parts of the same task to be executed. So MIMD systems may run many sub-tasks in parallel in order to shorten the time-to-solution for the main task. There is a large variety of MIMD systems, ranging from a four-processor NEC SX-5 to a thousand-processor SGI/Cray T3E supercomputer.
Besides the classification above, another important distinction between classes of computing systems can be made according to the type of memory access.
Shared memory (SM) systems have multiple CPUs, all of which share the same address space. This means that the knowledge of where data is stored is of no concern to the user, as there is only one memory accessed by all CPUs on an equal basis. Shared memory systems can be both SIMD and MIMD: single-CPU vector processors can be regarded as an example of the former, while the multi-CPU models of these machines are examples of the latter.
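As a hedged illustration of the shared memory model, the following sketch uses POSIX threads in C: all threads run inside one process and read the same globally visible array, so no explicit data transfer is needed. The thread count, the array size, and the helper name worker are illustrative choices, not part of any particular system described here.

```c
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define N 1000000

/* "data" and "partial" live in the single shared address space:
 * every thread (CPU) can read and write them directly. */
static double data[N];
static double partial[NTHREADS];

static void *worker(void *arg)
{
    long id = (long)arg;
    long chunk = N / NTHREADS;
    long lo = id * chunk;
    long hi = (id == NTHREADS - 1) ? N : lo + chunk;

    double s = 0.0;
    for (long i = lo; i < hi; i++)   /* no data movement needed: shared memory */
        s += data[i];
    partial[id] = s;
    return NULL;
}

int main(void)
{
    pthread_t t[NTHREADS];

    for (long i = 0; i < N; i++)
        data[i] = 1.0;

    for (long i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);

    double total = 0.0;
    for (long i = 0; i < NTHREADS; i++) {
        pthread_join(t[i], NULL);
        total += partial[i];
    }
    printf("sum = %.0f\n", total);
    return 0;
}
```

Compiled with the -pthread option, the program prints sum = 1000000; every thread reads the shared array directly, and only the final accumulation is coordinated through pthread_join.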
Distributed memory (DM) systems. In this case each CPU has its own associated memory. The CPUs are connected by some network and may exchange data between their respective memories when required. In contrast to shared memory machines the user must be aware of the location of the data in the local memories and will have to move or distribute these data explicitly when needed. Again, distributed memory systems may be either SIMD or MIMD.
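By contrast, here is a minimal sketch of the distributed memory model, written against the standard Message Passing Interface (MPI) discussed later in this section: the variable value exists separately in each process's local memory, and it reaches the second node only because it is sent explicitly over the network. The value 42 and the two-process layout are arbitrary; the program would be launched with an MPI runtime, for example mpirun -np 2.

```c
#include <mpi.h>
#include <stdio.h>

/* Each MPI process owns its local memory; data becomes visible to
 * another node only through an explicit send/receive pair. */
int main(int argc, char *argv[])
{
    int rank, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;                      /* exists only in rank 0's memory */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);     /* explicit copy into local memory */
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```

The data transfer that is implicit on a shared memory machine must be spelled out here, which is exactly the extra burden on the programmer mentioned above.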
Figure: Shared (left) and distributed (right) memory computer architectures
Supercomputers are defined as the fastest, most powerful computers in terms of CPU power and I/O capabilities. Since computer technology is continually evolving, this is always a moving target: this year's supercomputer may well be next year's entry-level personal computer. In fact, today's commonly available personal computers deliver performance that easily bests the supercomputers that were available on the market in the 1980s. A strong limitation on the further scalability of vector computers was their shared memory architecture. Therefore, massively parallel processing (MPP) systems using distributed memory were introduced by the end of the 1980s. The main advantage of such systems is the possibility to divide a complex job into several parts, which are executed in parallel by several processors, each having dedicated memory. The communication between the parts of the main job occurs within the framework of the so-called message-passing paradigm, which was standardized as the Message Passing Interface (MPI). The message-passing paradigm is flexible enough to support a variety of applications and is also well adapted to the MPP architecture. In recent years, a tremendous improvement in the performance of standard workstation processors led to their use in MPP supercomputers, resulting in significantly lowered price/performance ratios.
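The sketch below, again only an assumed illustration, shows this "divide the job and combine the partial results by message passing" pattern with standard MPI calls: each process sums its own share of an index range, and MPI_Reduce gathers the partial sums on one node. The problem size N and the interleaved work distribution are arbitrary choices.

```c
#include <mpi.h>
#include <stdio.h>

#define N 1000000

/* Each process computes the sum of its own slice of the index range,
 * then a single collective call combines the partial results. */
int main(int argc, char *argv[])
{
    int rank, size;
    double local = 0.0, global = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    for (long i = rank; i < N; i += size)   /* this process's share of the work */
        local += (double)i;

    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum of 0..%d = %.0f\n", N - 1, global);

    MPI_Finalize();
    return 0;
}
```

Launched over more processes, the same source code simply divides the range into more slices, which is what makes the paradigm scale on MPP systems and clusters alike.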
Design:
Before attempting to build a cluster of any kind, think about the type of problems you are trying to solve. Different kinds of applications will actually run at different levels of performance on different kinds of clusters. Beyond the brute force characteristics of memory speed, I/O bandwidth, disk seek/latency time and bus speed on the individual nodes of your cluster, the way you connect your cluster together can have a great impact on its efficiency.
Homogeneous and Heterogeneous Clusters.
A cluster can be built either from homogeneous machines, which have the same hardware and software configuration, or as a heterogeneous cluster with machines of differing configurations. Heterogeneous clusters face additional problems, such as differing performance profiles and more complex software configuration management.
Diskless Versus “Disk full” Configurations
This decision strongly influences what kind of networking system is used. Diskless systems are by their very nature slower performers than machines that have local disks, because no matter how fast the CPU is, the limiting factor on performance is how fast a program can be loaded over the network.
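A back-of-the-envelope sketch of that limiting factor, using assumed (not measured) figures for program image size, network bandwidth, and disk throughput:

```c
#include <stdio.h>

/* Rough illustration only: load time is approximately size / bandwidth,
 * so the network, not the CPU, dominates start-up on a diskless node. */
int main(void)
{
    double image_mb        = 100.0;        /* assumed program image + libraries */
    double ethernet_mb_s   = 100.0 / 8.0;  /* 100 Mbit/s Ethernet ~ 12.5 MB/s   */
    double local_disk_mb_s = 50.0;         /* assumed local disk throughput     */

    printf("load over 100 Mbit/s Ethernet: %.1f s\n", image_mb / ethernet_mb_s);
    printf("load from local disk:          %.1f s\n", image_mb / local_disk_mb_s);
    return 0;
}
```

With these assumptions the network load takes roughly 8 s against 2 s from a local disk; faster interconnects shrink, but do not remove, the gap.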
Network Selection.
Speed should be the main criterion for selecting the network. Channel bonding, a software technique that allows multiple network connections to be tied together to increase the overall throughput of the system, can be used to improve the performance of Ethernet networks.
Security Considerations
Special considerations are involved when completing the implementation of a cluster. Even with the queue system and parallel environment, extra services are required for a cluster to function as a multi-user computational platform. These services include the well-known network services NFS, NIS and rsh. NFS allows cluster nodes to share user home directories as well as installation files for the queue system and parallel environment. NIS provides correct file and process ownership across all the cluster nodes from a single source on the master machine. Although these services are significant components of a cluster, they create numerous vulnerabilities. Thus, it would be insecure to have cluster nodes function on an open network. For these reasons, computational cluster nodes usually reside on private networks, often accessible to users only through a firewall gateway. In most cases, the firewall is configured on the master node using ipchains or iptables.
Having all cluster machines on the same private network requires them to be connected to the same switch (or linked switches) and, therefore, located in close physical proximity. This creates a severe limitation in terms of cluster scalability: it is impossible to combine private-network machines in different geographic locations into one joint cluster, because private networks are not routable with the standard Internet Protocol (IP).
Combining cluster resources in different locations, so that users from various departments can take advantage of the available computational nodes, is nevertheless possible. Theoretically, merging clusters is not only desirable but also advantageous: the individual clusters are no longer isolated at separate sites but are, rather, managed as one centralized resource. This setup provides higher availability and efficiency, and such a proposition is highly attractive. But in order to merge clusters in the straightforward way, all the machines would have to be on a public network instead of a private one, because every single node on every cluster needs to be directly accessible from the others. Doing so, however, would create insurmountable problems because of the potential, indeed inevitable, security breaches. We can see, then, that serving scalability severely compromises security, while satisfying security concerns significantly limits scalability. Faced with such a problem, how can we make clusters scalable and, at the same time, establish rock-solid security on the cluster networks? Enter the Virtual Private Network (VPN).
VPNs are often heralded as one of the most cutting-edge, cost-saving solutions for a range of applications, and they are widely deployed in the areas of security, infrastructure expansion and inter-networking. A VPN adds a further dimension to networking and infrastructure because it enables private networks, which are generally not accessible from the Internet and are networked only within confined locations, to be connected in secure and robust ways.
The technology behind VPNs, however, changes what we have previously known about private networks. Through effective use of a VPN, we are able to connect previously unrelated private networks or individual hosts both securely and transparently. Being able to connect private networks opens a whole slew of new possibilities. With a VPN, we are not limited to resources in only one location (a single private network). We can finally take advantage of resources and information from all other private networks connected via VPN gateways, without having to largely change what we already have in our networks. In many cases, a VPN is an invaluable solution to integrate and better utilize fragmented resources.
In our environment, the VPN plays a significant role in combining high performance Linux computational clusters located on separate private networks into one large cluster. The VPN, with its power to transparently combine two private networks through an existing open network, enabled us to connect seamlessly two unrelated clusters in different physical locations. The VPN connection creates a tunnel between gateways that allows hosts on two different subnets (e.g., 192.168.1.0/24 and 192.168.5.0/24) to see each other as if they are on the same network. Thus, we were able to operate critical network services such as NFS, NIS, rsh and the queue system over two different private networks, without compromising security over the open network. Furthermore, the VPN encrypts all the data being passed through the established tunnel and makes the network more secure and less prone to malicious exploits.
The VPN solved not only the previously discussed problems with security, but it also opened a new door for scalability. Since all the cluster nodes can reside on private networks and operate through the VPN, the entire infrastructure can be better organized and the IP addresses can be managed efficiently, resulting in a more scalable and much cleaner network. Before VPNs, assigning public IP addresses to every single node was an open problem that limited the maximum number of nodes that could be added to the cluster. Now, with a VPN, our cluster can expand to a much greater scale in an organized manner. As can be seen, we have successfully integrated VPN technology into our networks and have addressed the important issues of scalability, accessibility and security in cluster computing.
Architecture:
A cluster is a type of parallel or distributed processing system, which consists of a collection of interconnected stand-alone computers working together as a single, integrated computing resource. A computer node can be a single or multiprocessor system (a PC, workstation, or SMP) with memory, I/O facilities, and an operating system. A cluster generally refers to two or more computers (nodes) connected together. The nodes can exist in a single cabinet or be physically separated and connected via a LAN. An interconnected (LAN-based) cluster of computers can appear as a single system to users and applications. Such a system can provide a cost-effective way to gain features and benefits (fast and reliable services) that have historically been found only on more expensive proprietary shared memory systems. The typical architecture of a cluster is shown in the figure.
The following are some prominent components of cluster computers:
- Multiple High Performance Computers (PCs, Workstations, or SMPs)
- State-of-the-art Operating Systems (Layered or Micro-kernel based)
- High Performance Networks/Switches (such as Gigabit Ethernet and Myrinet)
- Network Interface Cards (NICs)
- Fast Communication Protocols and Services (such as Active and Fast Messages)
- Cluster Middleware (Single System Image (SSI) and System Availability Infrastructure)
_ Hardware (such as Digital (DEC) Memory Channel, hardware DSM, and SMP techniques)
_ Operating System Kernel or Gluing Layer (such as Solaris MC and GLUnix)
- Applications and Subsystems
_ Applications (such as system management tools and electronic forms)
_ Runtime Systems (such as software DSM and parallel file systems)
_ Resource Management and Scheduling software (such as LSF (Load Sharing Facility) and CODINE (Computing in Distributed Networked Environments))
_ Parallel Programming Environments and Tools (such as compilers, PVM (Parallel Virtual Machine), and MPI (Message Passing Interface))
- Applications
_ Sequential
_ Parallel or Distributed
The network interface hardware acts as a communication processor and is responsible for transmitting and receiving packets of data between cluster nodes via a network/switch. Communication software offers a means of fast and reliable data communication among cluster nodes and to the outside world. Often, clusters with a special network/switch like Myrinet use communication protocols such as active messages for fast communication among their nodes. These protocols potentially bypass the operating system and thus remove the critical communication overheads, providing direct user-level access to the network interface. The cluster nodes can work collectively, as an integrated computing resource, or they can operate as individual computers. The cluster middleware is responsible for offering the illusion of a unified system image (single system image) and availability out of a collection of independent but interconnected computers. Programming environments can offer portable, efficient, and easy-to-use tools for the development of applications; they include message-passing libraries and debuggers. It should not be forgotten that clusters can be used for the execution of sequential as well as parallel applications.
Network clustering connects otherwise independent computers to work together in some coordinated fashion. Because clustering is a term used broadly, the hardware configuration of clusters varies substantially depending on the networking technologies chosen and the purpose (the so-called "computational mission") of the system. Clustering hardware comes in three basic flavors: so-called "shared disk," "mirrored disk," and "shared nothing" configurations.
Types of Clusters:
Workload Consolidation/Common Management Domain Cluster
This chart shows a simple arrangement of heterogeneous server tasks, all running on a single physical system (in different partitions, with different granularities of system resources allocated to them). One of the major benefits offered by this model is that of convenient and simple systems management from a single point of control. Additionally, this consolidation model offers the benefit of delivering a high quality of service (resources) in a cost-effective manner.