Optical Interconnects for Scalable High-Performance Computing Systems

Ahmed Louri

Professor, Department of Electrical and Computer Engineering

University of Arizona

Abstract

The quest for processing speeds in the teraflops range and beyond, across a wide set of applications, has accelerated the need for scalable high-performance parallel computing systems. For these machines to gain widespread use, they must run a variety of applications efficiently without imposing excessive programming difficulty or prohibitive cost. To achieve this, it is believed that these architectures must scale to hundreds or thousands of processors. Scalability means that the size of the system (e.g., the number of processors) can be increased with little or no change to the existing configuration, and that the increase in system size yields a linear or near-linear increase in performance. One of the major problems facing the design of such scalable computing systems is providing an adequate interconnection network that can deliver the ever-increasing bandwidth required for inter-processor and processor-memory communication. This fabric significantly affects performance, cost, and scalability, and its design poses a major challenge for system designers. The problem is as follows. To reduce cost and provide the highest performance, parallel computers must use state-of-the-art, off-the-shelf high-performance processors. However, rising processor speeds, the growing performance gap between processor technology and conventional metal interconnect technology, and the fundamental physical limitations of metal-based interconnects at higher data rates are resulting in interconnection networks with limited bandwidth, limited scalability, longer communication latencies, higher power requirements, and higher costs.

One viable solution, recognized as having the potential to solve the communication problems of high-performance computing systems, is the use of optical interconnects. Optical interconnects could lead to simpler wiring design, reduced power, large bandwidth, lower signal delays, and much lower communication latencies. This talk highlights some of the fundamental communication problems facing high-performance computing and presents our research efforts in applying optical interconnects to the design of scalable parallel computing systems. Specifically, we present the design and analysis of our proposed network architecture, RAPID: Reconfigurable, All-Photonic Interconnection network for Distributed and parallel computers. RAPID has been designed with an integrated approach that aggressively combines the unique advantages of wavelength-division and space-division multiplexing in optical technology with architectural innovations. The goal of RAPID is to provide a highly parallel architecture that can scale to a large number of processors while delivering scalable bandwidth, very low latency, and much lower network cost. We evaluated RAPID on network characteristics, power budget criteria, and simulation using benchmark suites and dynamic traffic loads, and compared it against several other popular networks. The results indicate that RAPID outperforms all of these networks. After presenting the performance evaluation results, we conclude with future directions of this research and the overall impact of deploying optical interconnects in next-generation computing systems.