Microsoft Server Product Portfolio
Customer Solution Case Study
Computing Center Improves Energy Efficiency with New Diskless Cluster

“The integration of a Windows HPC Server 2008 cluster further extends our portfolio and substantially improves our services for research and industrial users.”
Professor Michael Resch, Director, High Performance Computing Center Stuttgart

The High Performance Computing Center Stuttgart (HLRS) provides computing resources to internal and external research and industry partners. The institute uses diskless boot technology for all of its 700 compute nodes to implement a compute cluster with Windows® HPC Server 2008. HLRS has found that using diskless boot technology in combination with the latest cluster technology leads to significant savings, thanks to the high energy efficiency.

This case study is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.
Document published November 2009

Business Needs

Founded in 1995, the High Performance Computing Center Stuttgart (HLRS) at the University of Stuttgart works with scientific and industry partners to support researchers by providing supercomputing services. HLRS conducts basic and applied research with projects that include aerodynamic and crash simulations for the automotive industry, testing of engine combustion, visual analysis of simulation data, and augmented reality visualization.

HLRS offers a range of computing platforms for researchers to choose from so that people can use the cluster that best meets their needs. However, until recently, the organization didn’t have a way to support those who wanted to take advantage of high-performance computing (HPC) and do so using the Windows® operating system.

Solution

In February 2009, to respond to the demand for Windows-based HPC, the institute decided to deploy a computer cluster based on the Windows HPC Server 2008 operating system. HLRS decided to implement its new cluster using diskless compute nodes. In this scenario, all nodes start up remotely from a centralized network storage location each time they are used. After the operating system image is loaded onto a compute node, the node becomes available to the HPC head node’s resource manager.

Hard drives proved to be a key challenge for HLRS. Although the acquisition costs for hard drives are quite low, it is against HLRS policy to deploy them. “First, we have to install them physically and replace them in the case of a defect, which is costly,” says Dr. Uwe Woessner, Head of the Visualization Department at HLRS. “Second, they need power; if you multiply a few watts by the total projected number of cluster nodes, the outcome is a great deal of energy. Third, hard drives radiate heat, which not only needs to be dissipated, but also needs cooling, leading to additional costs.”
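The power argument above can be made concrete with a rough calculation. The following is only a sketch: the 700-node count comes from this case study, but the per-drive power draw of 8 watts is a hypothetical figure chosen for illustration, not a value reported by HLRS, and the function name is likewise illustrative. The estimate covers the drives alone and leaves out the additional cooling cost that Woessner mentions.

```python
def annual_drive_energy_kwh(nodes: int, watts_per_drive: float,
                            hours_per_year: float = 24 * 365) -> float:
    """Energy in kWh per year for one always-on local drive per node."""
    return nodes * watts_per_drive * hours_per_year / 1000.0


NODES = 700                    # node count stated in the case study
ASSUMED_WATTS_PER_DRIVE = 8.0  # hypothetical value, not from HLRS

# Continuous electrical load added by the drives, in watts.
continuous_load_w = NODES * ASSUMED_WATTS_PER_DRIVE

# Energy used by the drives over one year of continuous operation.
energy_kwh = annual_drive_energy_kwh(NODES, ASSUMED_WATTS_PER_DRIVE)

print(f"Continuous load: {continuous_load_w:.0f} W")
print(f"Energy per year: {energy_kwh:.0f} kWh")
```

Under these assumptions, "a few watts" per node adds up to several kilowatts of continuous load across the cluster, before any cooling overhead is counted.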

While the underlying technology in Windows HPC Server 2008 makes it possible to conduct a wide-scale network deployment of the operating system to remote computers’ local storage devices, it does not offer built-in support for diskless startup over the network from a remote storage location. Together with Microsoft and its hardware partner NEC, HLRS looked for a third-party solution to provide nodes with remote startup capabilities and opted to use Citrix Provisioning Server 5.0. InfiniBand communication links provide sufficient bandwidth to the central storage location, and HLRS also uses Gigabit Ethernet for management purposes.

Reaching the final implementation was a step-by-step process. “Starting up just one PC over the network is easy, but handling several hundred PCs at once is a totally different story,” says Woessner. Woessner’s team therefore took a gradual approach to detect and remove bottlenecks. For instance, the team performed basic tests to ensure that the correct network is chosen for provisioning and for message passing interface (MPI) communication.

The biggest challenge was to ensure that the hard drives in the provisioning servers, which deliver the image to all compute nodes, were fast enough. After four weeks of work, NEC system specialists, the Microsoft® HPC support team, and the HLRS staff surmounted all the difficulties that confronted them. “Starting up all of our 700 diskless compute nodes over the network from a remote storage now works at the touch of a button,” reports Woessner.

Benefits

By introducing Windows HPC Server 2008 to its compute cluster environment, HLRS is expanding the range of options available for its customer base. “We have researchers who ask specifically for Windows computation power,” says Woessner. “Our new cluster from Microsoft helps us not only address these needs, but do so in a highly economic manner.”

Adds Professor Michael Resch, Director of the High Performance Computing Center Stuttgart, “HLRS has a long tradition of providing its users with the best solution for their individual problems. The integration of a Windows HPC Server 2008 cluster further extends our portfolio and substantially improves our services for research and industrial users.” Benefits include:

Outstanding speed. HLRS has found that its Windows HPC Server 2008 cluster delivers the performance that researchers expect. In terms of raw speed, it delivers a peak performance of 62 teraflops, leading to rank 77 in the TOP500 list[1] published in June 2009.

Energy efficiency. HLRS believes that the low power consumption of its new cluster is even more impressive than the performance. In the June 2009 edition of the Green500 list,[2] the Windows HPC Server 2008 cluster at HLRS ranked 20th. HLRS achieved the high degree of efficiency for its cluster by forgoing local hard drives in favor of diskless booting and by using highly efficient power supplies in the NEC compute nodes. “The diskless compute nodes in our Windows HPC Server 2008 cluster make it possible for us to use the required number of nodes at a click of the mouse and work energy efficiently,” says Woessner.

Easy administration. For HLRS, starting up and maintaining compute nodes is easy and can be performed from a central location. After marking all desired nodes in the management console, HLRS can simply press a button to initiate the boot process for the remote computers. The organization benefits from the fact that all nodes use the same image for startup, which makes for straightforward administration.

Update management, which can be labor-intensive in traditional computing environments, is also quite simple for HLRS. “We have to install service packs, updates, and so on just once in order to apply the software to all compute nodes of our Windows HPC Server 2008 cluster,” says Woessner.


[1]

[2]
06&year=2009&list=green500_200906.csv&start=1&line=101