The idea that you can build your own supercomputer using off-the-shelf
components and even old PCs may seem far-fetched, but it's happening today with
Linux clusters.
A cluster is a group of computers bound together. The computers in the
cluster work together, usually to solve mathematical problems or to provide
support in case some computers in the cluster fail.
Among the different types of clusters, the main distinction is how tightly you
couple the computers. Many operating systems support clustering, including VMS,
Windows 2000, and Linux, where the interest in clustering has surged in recent
months.
Economics is a key factor leading to the popularity of Linux clustering. When
scientific labs request bids for new systems, most want as many COTS (commercial
off-the-shelf) systems as possible. The rationale is that commercially available
and commercially supported systems will be cheaper, both in the short run with the
initial purchase and in the long run with the ability to add computers to the
cluster. This rationale has proven true.
Clusters of off-the-shelf PCs can often meet or exceed the performance of
terribly expensive commercial supercomputers, at one-third to one-tenth of the
price. Linux has proven popular as an operating system, given that it's free.
This means that the main costs are for the computer and networking hardware, and
using off-the-shelf components and PCs keeps those hardware costs lower
than if you purchased specialized equipment.
Most of these new Linux clusters follow a setup pioneered at the Beowulf
project. The Beowulf project provides
extensive information on how you can set up your own cluster using off-the-shelf
components and free software.
The Beowulf software environment itself is distributed as a set of software
patches and add-ons that work with most Linux distributions, although the Red Hat
distribution seems to be a common base. You can download this software from the
Beowulf download page.
You can also download the source code to all these patches and add-ons, or
download the environment as a set of prebuilt binary packages, in RPM (Red Hat
Package Manager) format, for Red Hat Linux systems.
Under the Hood
Beowulf is a type of cluster that isn't as tightly coupled as parallel
computers (such as the parallel Cray systems) but is more tightly coupled than
plain old networks of workstations. Unlike networks of workstations, each
computer in a Beowulf cluster is dedicated to the tasks performed by the cluster.
That means each computer in the cluster only performs work assigned by the
cluster, instead of serving as a general-use computer. Typically, the computers in
a Beowulf cluster are stored in a server room and you access the entire cluster
through a single machine. In a more loosely coupled network of workstations, each
computer on the network can participate in shared tasks as well as perform the
tasks done by normal computers, such as word processing.
On top of the Beowulf cluster, you can run parallel-processing software such
as Parallel Virtual Machine (PVM) or Message Passing Interface (MPI)
technologies. Both PVM and MPI provide ways for parts of a parallel system to
communicate with the other parts.
MPI is a standard used on most parallel supercomputers. An implementation of
MPI called local-area multicomputer (LAM) is highly popular in the Beowulf
community. LAM is available from the University of Notre Dame.
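The message-passing style that PVM and MPI provide can be sketched in a few lines of Python. This is only an illustration of the pattern, not MPI or PVM code: threads inside one process stand in for the computers in a cluster, standard-library queues stand in for the network, and the names worker and run_master are hypothetical.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Hypothetical cluster node: receive work messages, send results back."""
    while True:
        message = inbox.get()              # blocking receive
        if message is None:                # sentinel: no more work is coming
            break
        index, value = message
        outbox.put((rank, index, value * value))   # send the result to the master


def run_master(values, n_workers=3):
    """Hand each value to a worker as a message and gather the squared results."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(n_workers)
    ]
    for thread in threads:
        thread.start()

    # Send one message per value, dealing them out round-robin.
    for index, value in enumerate(values):
        inboxes[index % n_workers].put((index, value))
    for inbox in inboxes:
        inbox.put(None)                    # tell every worker to stop

    for thread in threads:
        thread.join()

    # All workers have finished, so every result is already in the queue.
    squares = [None] * len(values)
    while not results.empty():
        _rank, index, squared = results.get()
        squares[index] = squared
    return squares


if __name__ == "__main__":
    print(run_master([1, 2, 3, 4, 5]))
```

On a real Beowulf cluster, the workers would be separate processes on separate machines, and the sends and receives would go through an MPI or PVM library rather than in-memory queues.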
The Beowulf type of cluster works best for problems that can be divided into
small pieces for individual computation; in other words, the types of problems
that work well for parallel computing. These problems include predicting the
weather, rendering images from computer geometry, and molecular analysis.
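As a minimal sketch of what "divided into small pieces" means, the Python example below estimates pi by numerical integration and splits the work into independent chunks. The problem and the function names (split_range, partial_pi, estimate_pi) are chosen purely for illustration; here the chunks run one after another on a single machine, whereas on a cluster each chunk could be handed to a different computer because no chunk depends on another.

```python
import math


def split_range(n_steps, n_pieces):
    """Divide step indices 0..n_steps-1 into contiguous (start, stop) chunks."""
    base, extra = divmod(n_steps, n_pieces)
    chunks = []
    start = 0
    for i in range(n_pieces):
        stop = start + base + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def partial_pi(start, stop, n_steps):
    """Midpoint-rule integral of 4 / (1 + x^2) over one chunk of [0, 1]."""
    h = 1.0 / n_steps
    total = 0.0
    for i in range(start, stop):
        x = (i + 0.5) * h
        total += 4.0 / (1.0 + x * x)
    return total * h


def estimate_pi(n_steps=100_000, n_pieces=8):
    """Compute each piece independently, then combine the partial sums."""
    chunks = split_range(n_steps, n_pieces)
    partials = [partial_pi(start, stop, n_steps) for start, stop in chunks]
    return sum(partials)


if __name__ == "__main__":
    print(estimate_pi(), math.pi)
```

The only communication needed is sending each chunk's boundaries out and collecting one number back, which is why problems of this shape suit loosely coupled hardware.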
Traditional software, such as relational databases or Web servers, won't get
any speedup from a Beowulf cluster without a lot of modifications to the
programs' source code. You need an application that has been written for a
parallel computing environment to really make use of Beowulf, and these
applications, while highly specialized, are readily available, especially at
academic centers.
A number of academic centers use Beowulf clusters, including the University of Minnesota at
Duluth, and the Physics department at the University
of Wisconsin-Milwaukee.
The UW-M work, for example, is devoted to the detection and study of
gravitational waves. Its Beowulf cluster analyzes gravitational data, a
computationally intensive task. Interestingly, UW-M decided on using a Beowulf
cluster after a benchmark test showed that these clusters are the most
cost-effective way to analyze the data.
Interest in commercial usage also has grown, with companies such as Doubleclick running Beowulf clusters to
analyze vast quantities of data mined from user behavior.
Part of the surge in interest comes from how well Beowulf clusters seem to
perform with very low hardware costs. But interest in Beowulf clusters really
exploded in early 1999 when IBM demonstrated a cluster of 17 Netfinity servers
(with a total of 36 Pentium II processors) and off-the-shelf versions of Red Hat
Linux.
This cluster tied the performance of a parallel Cray T3E-900-AC64
supercomputer on the POVRay benchmark, a ray-tracing benchmark used to test the
speed of rendering images. The total cost of the IBM system was about $150,000,
compared to $5.5 million for the Cray system. With these results, the economics
of Beowulf clusters became clear.
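The arithmetic behind that comparison is simple enough to check. The short Python sketch below uses only the two prices quoted above; the function name cost_ratio is hypothetical.

```python
def cost_ratio(cluster_cost, supercomputer_cost):
    """How many times more the supercomputer costs than the cluster."""
    return supercomputer_cost / cluster_cost


if __name__ == "__main__":
    ibm_cluster = 150_000      # approximate price of the IBM cluster, in dollars
    cray_system = 5_500_000    # price quoted for the Cray system, in dollars
    ratio = cost_ratio(ibm_cluster, cray_system)
    print(f"The Cray cost about {ratio:.1f} times as much as the cluster.")
    print(f"The cluster cost about {100 / ratio:.1f}% of the Cray's price.")
```

Because the two systems tied on this benchmark, the price ratio is also the price/performance ratio for this one test.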
Looking at the POVRay benchmark
results, nine of the top 10 systems run Linux. (The aforementioned Cray T3E,
tied for second place, runs the UNICOS operating system.)
In addition to the POVRay benchmark, Beowulf clusters have performed well on
the Linpack Benchmark, one of many benchmarks
you can run against computers or clusters of computers. According to the site
that lists the top 500 supercomputers based on their performance on the Linpack
Benchmark, a number of supercomputers on the list are Beowulf clusters. Interest
has grown so much that there's even a Web site devoted to Beowulf news.
With all this interest, a number of companies are trying to get in on the
action. Compaq, which inherited the Alpha processor when it purchased Digital
Equipment, has provided a lot of support for Linux clusters, especially clusters
of systems using Compaq's
Alpha processors.
Furthermore, Compaq produced a special Cluster
Management Utility that allows users to more easily manage Beowulf clusters.
In addition to Compaq and IBM, a number of smaller vendors,
including Paralogic, offer prebuilt Beowulf Linux
systems.
You can even build your own Beowulf cluster. There's
a quick-start guide at the Xtreme Machines site. All you need are a few
Linux-compatible PCs, a fast Ethernet switch, and a good bit of experience with
Linux or UNIX system administration. This isn't a plug-and-play type of setup.
But lots of people have set up Beowulf clusters, leading to a number of
at-home supercomputing Web pages, including Cris.com. These pages describe how you can use off-the-shelf and even
old PCs to create Beowulf clusters for performing complex mathematical
computations.
This technology works best, though, with modern PCs, especially PCs sporting
two or more processors. That does raise the price for curious home users, but a
large Beowulf cluster should cost from one-third to one-tenth the price of a
commercial parallel supercomputer. Even so, this really is a technology for
organizations, not home hobbyists.
If you have computing needs that match what a Beowulf cluster can provide,
including rendering images or mathematical number crunching, then it's well worth
your while to look into this technology. If you're not familiar with Linux
administration, your best bet is likely purchasing a system from IBM, Compaq,
Paralogic, or other vendors of prebuilt Beowulf systems.
For a technical introduction to the Beowulf software and its history, see the
Beowulf introductory page. There are also a number of
FAQ lists:
www.dnaco.net/~kragen/beowulf-faq.txt
smile.cpe.ku.ac.th/tools/bwfaq2.htm.
From the latter list, you can find links to a host of applications written
for Beowulf clusters, links to compiler vendors for tools to parallelize your
programs so they take advantage of the clusters, and a lot more documents on how
to get going. Finally, if you have a Linux system and you installed the how-to
documents, you should be able to find a Beowulf how-to already online on your
system.
Contributing Editor Eric Foster-Johnson (erc@pconline.com) has written 15 books
on Linux, UNIX, programming, and open-source tools.
Sidebar
Reclaim Unused Processing Cycles With SETI@home
Beowulf clusters are not the only way to get work done. The SETI@home project
demonstrated another, more loosely coupled form of clustering for Linux, Windows,
or any other operating system.
SETI (the Search for Extraterrestrial Intelligence) is an ongoing effort
devoted, in this instance, mainly to examining data from radio telescopes for
evidence of intelligent signals. Examining these radio telescope signals is a
computationally intensive task, but one that can easily be broken down into
smaller pieces that can be computed independently.
Furthermore, the SETI projects don't have enough computing power to perform
the necessary tasks, nor a budget large enough to buy the needed computing power.
So, the SETI@home software attempts to enlist users on home (or office) PCs and
have each PC perform small chunks of the overall project.
Cleverly designed as a screen saver, the SETI@home software takes advantage
of unused computing cycles on your PC to download data from a central location,
perform the calculations, and upload the results.
Thousands of PCs work as part of the loose cluster during different times of
the day, that is, whenever the SETI@home screen saver is running. Depending on how
users do their work, each PC may effectively join and disengage from the cluster
many times a day.
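The download, compute, and upload cycle described above can be sketched as follows. This is not the SETI@home protocol or its analysis code: the WorkServer class, the analyze function, and the "strongest sample" calculation are hypothetical stand-ins, and an in-memory object plays the role of the central server.

```python
from collections import deque


class WorkServer:
    """Hypothetical central server that hands out chunks of data as work units."""

    def __init__(self, data, unit_size):
        self.pending = deque(
            (start, data[start:start + unit_size])
            for start in range(0, len(data), unit_size)
        )
        self.results = {}

    def download(self):
        """Return the next (unit_id, samples) pair, or None if no work is left."""
        return self.pending.popleft() if self.pending else None

    def upload(self, unit_id, result):
        """Record the result a client computed for one work unit."""
        self.results[unit_id] = result


def analyze(samples):
    """Placeholder analysis: report the strongest sample in the chunk."""
    return max(samples)


def run_client(server, max_units):
    """Process work units until the idle time (max_units) or the work runs out."""
    done = 0
    while done < max_units:
        unit = server.download()
        if unit is None:
            break
        unit_id, samples = unit
        server.upload(unit_id, analyze(samples))
        done += 1
    return done


if __name__ == "__main__":
    server = WorkServer([3, 1, 4, 1, 5, 9, 2, 6], unit_size=3)
    run_client(server, max_units=2)   # one PC is idle long enough for two units
    run_client(server, max_units=5)   # another PC later picks up what remains
    print(server.results)
```

Each call to run_client models one PC joining the loose cluster for a while and then dropping out; the server needs no record of which machines exist, only of which work units are still outstanding.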
Since it runs only as a screen saver on lots of different PCs all over the
world, the SETI@home cluster can only really be considered a loose network of
workstations, but it certainly helps get the job done (and increases awareness
about the project).
You can download the SETI@home software from its Berkeley-hosted page. The software runs on Linux, most versions of
UNIX, Windows, and Mac OS systems.