  June 2000

On Topic - Past Articles
Beowulf Provides Strength in Numbers
Linux clustering means cheap supercomputing

By Eric Foster-Johnson

The idea that you can build your own supercomputer using off-the-shelf components and even old PCs may seem far-fetched, but it's happening today with Linux clusters.


A cluster is a group of computers bound together. The computers in the cluster work together, usually to solve mathematical problems or to provide support in case some computers in the cluster fail.


The main distinction among the different types of clusters is how tightly the computers are coupled. Many operating systems support clustering, including VMS, Windows 2000, and Linux, where interest in clustering has surged in recent months.


Economics is a key factor in the popularity of Linux clustering. When scientific labs request bids for new systems, most want as many COTS (commercial off-the-shelf) systems as possible. The rationale is that commercially available and commercially supported systems will be cheaper, both in the short run with the initial purchase and in the long run with the ability to add computers to the cluster. And this rationale has proven true.


Clusters of off-the-shelf PCs can often meet or exceed the performance of terribly expensive commercial supercomputers, at one-third to one-tenth of the price. Linux has proven popular as an operating system, given that it's free. This means that the main costs are for the computer and networking hardware. And, using off-the-shelf components and PCs means that the hardware costs are lower than if you purchased specialized equipment.


Most of these new Linux clusters follow a setup pioneered at the Beowulf project. The Beowulf project provides extensive information on how you can set up your own cluster using off-the-shelf components and free software.


The Beowulf software environment itself is distributed as a set of software patches and add-ons that work with most Linux distributions, although the Red Hat distribution seems to be a common base.


You can also download the source code to all these patches and add-ons, or download the environment as a set of prebuilt binary packages, in RPM (Red Hat Package Manager) format, for Red Hat Linux systems.


Beowulf is a type of cluster that isn't as tightly coupled as parallel computers (such as the parallel Cray systems) but is more tightly coupled than plain old networks of workstations. Unlike networks of workstations, each computer in a Beowulf cluster is dedicated to the tasks performed by the cluster.


That means each computer in the cluster performs only the work assigned by the cluster, instead of serving as a general-purpose computer. Typically, the computers in a Beowulf cluster are kept in a server room and you access the entire cluster through a single machine. In a more loosely coupled network of workstations, each computer on the network can participate in shared tasks as well as perform the tasks done by normal computers, such as word processing.


On top of the Beowulf cluster, you can run parallel-processing software such as Parallel Virtual Machine (PVM) or Message Passing Interface (MPI) technologies. Both PVM and MPI provide ways for parts of a parallel system to communicate with the other parts.
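To make the message-passing idea concrete, here is a minimal sketch in Python. It is not PVM or MPI code: it assumes threads and in-memory queues as stand-ins for cluster nodes and network messages, and the names `worker` and `run` are made up for this illustration. A coordinator sends each worker a slice of the data, and each worker sends back a partial result.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Hypothetical worker: receive a list of numbers, send back a partial sum."""
    numbers = inbox.get()               # blocking "receive" from the coordinator
    outbox.put((rank, sum(numbers)))    # "send" the result, tagged with our rank


def run(data, n_workers=4):
    """Coordinator: hand one slice to each worker and collect the replies."""
    outbox = queue.Queue()
    inboxes = [queue.Queue() for _ in range(n_workers)]
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], outbox))
        for rank in range(n_workers)
    ]
    for t in threads:
        t.start()
    for rank in range(n_workers):
        inboxes[rank].put(data[rank::n_workers])   # every n-th item goes to this worker
    partials = dict(outbox.get() for _ in range(n_workers))
    for t in threads:
        t.join()
    return partials


if __name__ == "__main__":
    partials = run(list(range(1, 101)))
    print(sum(partials.values()))   # prints 5050
```

On a real Beowulf cluster the queues would be replaced by the send and receive calls of a message-passing library, and each worker would run on its own machine.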


MPI is a standard used on most parallel supercomputers. An implementation of MPI called local-area multicomputer (LAM) is highly popular in the Beowulf community. LAM is available from the University of Notre Dame.


The Beowulf type of cluster works best for problems that can be divided into small pieces for individual computation; in other words, the types of problems that work well for parallel computing. These problems include predicting the weather, rendering images from computer geometry, and molecular analysis.
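As a rough illustration of dividing a problem into pieces, the following Python sketch splits a range of numbers into near-equal chunks, computes each chunk independently, and combines the partial results. The function names and the `mapper` parameter are assumptions made for this example; by default the pieces are processed one after another with the built-in `map`, standing in for the nodes of a cluster.

```python
def split_range(start, stop, pieces):
    """Split [start, stop) into `pieces` contiguous sub-ranges of near-equal size."""
    size, extra = divmod(stop - start, pieces)
    bounds = []
    lo = start
    for i in range(pieces):
        hi = lo + size + (1 if i < extra else 0)   # first `extra` pieces get one more item
        bounds.append((lo, hi))
        lo = hi
    return bounds


def partial_sum_of_squares(bounds):
    """Work for one piece: needs nothing from any other piece."""
    lo, hi = bounds
    return sum(k * k for k in range(lo, hi))


def sum_of_squares(n, pieces, mapper=map):
    """Combine the independent partial results into the final answer."""
    return sum(mapper(partial_sum_of_squares, split_range(0, n, pieces)))


if __name__ == "__main__":
    print(split_range(0, 10, 3))      # prints [(0, 4), (4, 7), (7, 10)]
    print(sum_of_squares(1000, 8))    # prints 332833500
```

Because each piece is independent, a parallel mapper (for example, the `map` method of a `multiprocessing.Pool`) could be passed in place of the built-in `map` without changing the rest of the code. Problems that cannot be split this way gain little from a cluster.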


Traditional software, such as relational databases or Web servers, won't get any speedup from a Beowulf cluster without extensive modifications to the programs' source code. To really make use of Beowulf, you need an application that has been written for a parallel computing environment. These applications, while highly specialized, are readily available, especially at academic centers.


A number of academic centers use Beowulf clusters, including the University of Minnesota at Duluth, and the Physics department at the University of Wisconsin-Milwaukee.


The UW-M work, for example, is devoted to the detection and study of gravitational waves. Its Beowulf cluster analyzes gravitational data, a computationally intensive task. Interestingly, UW-M decided on using a Beowulf cluster after a benchmark test showed that these clusters are the most cost-effective way to analyze the data.


Interest in commercial usage also has grown, with companies such as Doubleclick running Beowulf clusters to analyze vast quantities of data mined from user behavior.


Part of the surge in interest comes from how well Beowulf clusters seem to perform with very low hardware costs. But interest in Beowulf clusters really exploded in early 1999, when IBM demonstrated a cluster of 17 Netfinity servers (with a total of 36 Pentium II processors) running an off-the-shelf version of Red Hat Linux.


This cluster tied the performance of a parallel Cray T3E-900-AC64 supercomputer on the POVRay benchmark, a ray-tracing benchmark used to test the speed of rendering images. The total cost of the IBM system was about $150,000, compared to $5.5 million for the Cray system. With these results, the economics of Beowulf clusters became clear.
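The arithmetic behind that comparison is simple enough to check. This short Python sketch uses only the two approximate prices quoted above; the variable names are made up for the example.

```python
cluster_cost = 150_000      # approximate cost of the IBM cluster, in dollars
cray_cost = 5_500_000       # approximate cost of the Cray system, in dollars

ratio = cray_cost / cluster_cost       # how many times more the Cray cost
fraction = cluster_cost / cray_cost    # cluster cost as a share of the Cray's

print(f"{ratio:.1f}x")     # prints 36.7x
print(f"{fraction:.1%}")   # prints 2.7%
```

For equal performance on this one benchmark, the cluster cost under 3 percent of the supercomputer's price.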


Looking at the POVRay benchmark results, nine of the top 10 systems run Linux. (The aforementioned Cray T3E, tied for second place, runs the UNICOS operating system.)


In addition to the POVRay benchmark, Beowulf clusters have performed well on the Linpack Benchmark, one of many benchmarks you can run against computers or clusters of computers. According to the site that ranks the top 500 supercomputers by their performance on the Linpack Benchmark, a number of the supercomputers on the list are Beowulf clusters. Interest has grown so much that there's even a Web site devoted to Beowulf news.


With all this interest, a number of companies are trying to get in on the action. Compaq, which inherited the Alpha processor when it purchased Digital Equipment, has provided a lot of support for Linux clusters, especially clusters of systems using Compaq's Alpha processors.


Furthermore, Compaq produced a special Cluster Management Utility that allows users to more easily manage Beowulf clusters.


 
Copyright © 2008 ComputerUser Inc.