UNIVERSITIES GET ACCESS TO BIG CLUSTER COMPUTERS
By Elise Ackerman
San Jose Mercury News
As with many things at Google, its alliance with IBM to supply massive amounts of computing power to college campuses around the country started as one individual's idealistic notion.
The two technology giants announced Monday that they would team up to provide hardware, software and services needed to teach large-scale distributed computing, sometimes called cloud computing.
The Academic Cluster Computing Initiative is the dream of a 26-year-old graduate from the University of Washington named Christophe Bisciglia, who got a job at Google five years ago as a software engineer. Like his fellow Googlers, Bisciglia embraced the Google ethos that computers and software can be used to help make the world a better place.
Bisciglia's role in recruiting young college programmers gave him an idea that quickly took over his "twenty-percent time," the unstructured portion of the work week that Googlers are encouraged to spend on side projects that could benefit the company.
"The academic community has given Google so much - I wanted to think about something Google could do to give back," Bisciglia said.
Bisciglia decided to improve the computer science curricula at colleges around the country. From his perspective, even the best programs had a glaring weakness - they didn't teach students how to solve problems involving massive clusters of computers and terabytes of data (one terabyte equals 1 trillion bytes).
Bisciglia bounced his idea off his University of Washington mentor, Ed Lazowska, who holds the Bill & Melinda Gates Chair of Computer Science & Engineering. Lazowska agreed to help Bisciglia create a course.
"There are an entirely new class of problems that people are tackling that are characterized by enormous amounts of data," Lazowska said. "All of modern science and engineering is going in this direction."
But solving the problems required more computer infrastructure than individual computer science departments could afford.
Bisciglia wanted to figure out a way for Google to provide the computing power not just to the University of Washington but to colleges around the country by using the Internet.
As a first step, Bisciglia taught "Problem Solving on Large Scale Clusters" in the winter and spring of 2007 at the University of Washington with two other Googlers whom he drafted for the project. During the summer, two of his students interned at Google and helped clean up the course material.
By that time, Google Chief Executive Eric Schmidt had gotten on board and recruited support from a fellow tech titan, Samuel Palmisano, chief executive of International Business Machines.
On Monday, the two companies announced they would dedicate a cluster of several hundred computers to train computer science students in large-scale computing practices. The clusters, which will be in data centers at Google, IBM's Almaden Research Center in San Jose and the University of Washington in Seattle, are expected to eventually include more than 1,600 processors. The companies plan to spend tens of millions of dollars on the initiative.
"We are aiming to train tomorrow's programmers to write software that can support a tidal wave of global Web growth and trillions of secure transactions every day," Palmisano said in a statement.
Six universities will initially participate in a pilot program to work out the kinks: the University of Washington, Stanford University, University of California-Berkeley, Carnegie Mellon University, Massachusetts Institute of Technology and University of Maryland. The program will use open-source software, including an open-source implementation of Google's published computing infrastructure from Apache's Hadoop project.
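For readers curious about what programming for such a cluster looks like, the sketch below imitates the map-and-reduce style that Hadoop is known for, using the classic word-count exercise. It runs in a single Python process and does not use Hadoop's actual interfaces; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single total."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in one process (illustrative only)."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["the cloud", "the cluster the cloud"]))
```

On a real cluster, the map step would run on many machines at once, each handling its own slice of the data, and the framework would route every word's counts to a machine running the reduce step.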
The practice of running programs on super-powerful clusters of computers in remote data centers is often called "cloud computing," an expression of the notion that computing power will soon become available as a utility, like electric power or water.
Companies like Amazon, Microsoft, Google and IBM believe cloud computing is a growing business opportunity, and they are worried about a shortage of scientists with the necessary technical skills.
Lazowska said corporate concern is driving Google and IBM's philanthropic effort to the benefit of American universities. "Obviously there is self-interest at work, but it is enlightened self-interest," he said.
Maryland Joins Megacomputer 'Cloud' Project
By Zachary A. Goldfarb
Washington Post Staff Writer
Tuesday, October 9, 2007; D04
It's called a cloud, and the University of Maryland is happy to be part of it.
Google and IBM announced yesterday they are partnering with Maryland, Stanford, the Massachusetts Institute of Technology and three other universities to bring the power of huge clusters of computers -- or clouds -- to students and academic researchers.
"This is going to be the paradigm of the future," said Jimmy Lin, an assistant professor of information studies who is leading the initiative at Maryland. "The amount of information out there is growing at an exponential pace, and this way of doing computing is the only realistic way of keeping up with that."
Computer scientists say it is crucial for students to learn how to write software that can take advantage of clouds. The clusters -- dozens, hundreds or even thousands of computers processing information simultaneously -- have far-reaching applications in search, social networking and e-commerce. They enable users to sort through large quantities of data at light speed.
Google and IBM are making available up to 1,600 computers, in three locations, to the universities.
At Maryland, the cloud will be used to create a system for automatically translating text in difficult foreign languages such as Chinese and Arabic.
To build the system, Lin and colleagues plan to feed enormous amounts of foreign text and its English translation into a computer, which then analyzes the connections among the words to create rules for translations. In the past, he'd feed a batch -- hundreds of millions of words -- into a computer one morning, and then return the next day to see what the computer had come up with.
With the cloud, he said, "things that used to take a day to run now take about 20 minutes."
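The idea of learning translation rules from paired text can be illustrated with a toy example. The sketch below counts how often a foreign word and an English word appear in matching sentences and picks the most frequent partner. It is an assumption-laden simplification, not the Maryland team's method: real systems use far more sophisticated statistical models, the Spanish sample sentences are invented for illustration, and the function names are hypothetical.

```python
from collections import Counter


def cooccurrence_counts(sentence_pairs):
    """Count how often each (foreign word, English word) pair shares a sentence pair."""
    counts = Counter()
    for foreign, english in sentence_pairs:
        for f in set(foreign.lower().split()):
            for e in set(english.lower().split()):
                counts[(f, e)] += 1
    return counts


def best_translation(counts, foreign_word):
    """Return the English word seen most often alongside foreign_word, or None."""
    candidates = {e: n for (f, e), n in counts.items() if f == foreign_word}
    return max(candidates, key=candidates.get) if candidates else None


if __name__ == "__main__":
    # Invented toy data; a real corpus would hold hundreds of millions of words.
    pairs = [("gato negro", "black cat"), ("gato blanco", "white cat")]
    counts = cooccurrence_counts(pairs)
    print(best_translation(counts, "gato"))
```

Because each sentence pair can be counted independently, this kind of work divides naturally across many machines, which is what makes a cluster attractive for it.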
V.S. Subrahmanian, director of the Institute for Advanced Computer Studies at Maryland, said being part of the initiative "is a significant opportunity for us to test out algorithms we develop in the lab in a real-world, massive-scale setting."
He added that being part of the initiative raises Maryland's profile among the top computer science programs in the country. Other universities participating include Carnegie Mellon University, the University of California at Berkeley and the University of Washington.
"We're trying to help students develop new technologies and methods that will help them break the single-server mindset," said Christophe Bisciglia, a senior software engineer at Google who is helping run the initiative.
In early tests of the cloud at the University of Washington, students showed on a map where news around the world was happening -- almost in real time. They also analyzed all of Wikipedia to find synonyms for words, and created a video illustrating how the collision between the Milky Way and Andromeda galaxies would affect the 80,000 stars and galactic objects within them.
Michael R. Nelson, director of Internet technology and strategy at IBM, said he imagined future computers being smaller than a wristwatch. "You'd talk into this, your voice would be carried into the cloud, the cloud would do the voice recognition, and it would determine what you were asking for -- map directions, the nearest Chinese restaurant -- and then the information would come back to and be spoken through the speaker."