Earlham College Cluster Computing Group


LittleFe - The portable cluster for computational science education

    Many institutions and teaching environments do not have access to parallel platforms for parallel and distributed computing education. Key concepts such as speedup, efficiency, and load balancing are taught much more effectively on a parallel platform. LittleFe is a complete, portable, 4- to 8-node Beowulf-style computational cluster.
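
    As a rough illustration of two of the concepts named above, the sketch below computes speedup and efficiency from measured run times using the standard textbook definitions (speedup = serial time / parallel time, efficiency = speedup / node count). The function names and the timings are hypothetical and are not taken from any LittleFe or BCCD curriculum module.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S = T_1 / T_p: how many times faster the parallel run is."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, nodes: int) -> float:
    """Efficiency E = S / p: the fraction of ideal linear speedup achieved."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return speedup(t_serial, t_parallel) / nodes


if __name__ == "__main__":
    # Hypothetical wall-clock timings (seconds) for one job on 1, 2, 4, and 8 nodes.
    timings = {1: 120.0, 2: 64.0, 4: 36.0, 8: 24.0}
    t1 = timings[1]
    for p, tp in sorted(timings.items()):
        print(f"{p} nodes: speedup {speedup(t1, tp):.2f}, "
              f"efficiency {efficiency(t1, tp, p):.2f}")
```

    Efficiency that falls as nodes are added, as in these made-up numbers, is the kind of effect that is easier to teach when students can measure it on real hardware.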

    The main LittleFe site is http://LittleFe.net

    There are a couple of pictures of LittleFe/x86 (Mark I) and a tracker available at http://cs.earlham.edu/little-fe

    A short presentation about LittleFe/PPC (Mark II), from SIAM's Computational Science and Engineering Conference, and pictures of the new design can be found at http://cluster.earlham.edu/detail/project/Little-Fe/ppc-overview/little-fe-pcc.html

    Our wiki documents the hardware design and offers assembly tips for anyone who would like to build their own LittleFe. You can find it at http://wiki.cs.earlham.edu/index.php/LittleFe_Cluster.

    The HPC Wire article about LittleFe can be found here.

The Bootable Cluster CD (BCCD) - Software tools for computational science education

    We work on three different aspects of Paul Gray's BCCD project: the PowerPC port, the Liberation project, and computational science curriculum modules.

    The main BCCD site is http://bccd.cs.uni.edu

    Liberation is a set of scripts that extract the BCCD and install it on a hard drive. This enables the BCCD to be used for fixed clusters as well as the ad hoc, lab-based environments for which it was originally designed. More information about the BCCD Liberation project can be found at http://wiki.cs.earlham.edu/index.php/BCCD:Automated_liberation.

    Using the BCCD's list-packages mechanism, we develop curriculum modules for computational science. These modules are used in undergraduate classes and in workshops taken by science faculty. The BCCD is a project of Paul Gray's lab at the University of Northern Iowa, http://bccd.cs.uni.edu


Folding@Clusters

    Folding@Clusters is a software tool designed to make high performance computational resources easily available to chemists and biologists for running simulations of large bio-molecules using open source molecular dynamics packages.

    Josh Hursey, Josh McCoy, Charles Peck, and John Schaefer gave a presentation on this work at SuperComputing04, at the Purdue University Research area, in November, 2004. We also presented this work as a poster at SIAM's Computational Science and Engineering conference in February, 2005.

    An article based on this work appears in the November, 2005 issue of Dr. Dobb's Journal.

    The abstract for our SC04 submission follows:

      Instead of traditional, tightly coupled massively parallel computing, current distributed computing projects such as SETI@home or Folding@Home use a client-server model to perform embarrassingly parallel computing, allowing for one to tap resources (hundreds of thousands of CPUs in PCs throughout the world) impossible to obtain by other means. However, certain algorithms could greatly benefit from a hybrid approach, combining the massive resources available to distributed computing with the tight coupling traditionally found only in supercomputers.

      Towards this end, Folding@Clusters is an adaptive framework for harnessing tightly coupled cluster resources for protein folding research. It combines capability discovery, load balancing, process monitoring, and checkpoint/re-start services to provide a platform for molecular dynamics simulations on a range of grid-based parallel computing resources.

      The raw computing power available for scientific inquiry continues to grow while the abstraction level of the tools available to scientists does not advance in a similar manner. Folding@Clusters provides chemists, investigating protein folding, with a high-level interface to a variety of parallel compute architectures, e.g. lab clusters, Beowulf clusters, large SMP machines, and clusters of SMP machines.

      Folding@Clusters uses open source building blocks, such as the GROMACS molecular dynamics package and the LAM-MPI communications library, to provide the lowest-level functionality. Building on this foundation we construct a three-tier architecture: cluster, node, and science core, which provides a basis on which to abstract the process of performing a molecular dynamics simulation. This includes work unit preparation, distribution, and result aggregation, on a compute resource with arbitrary capabilities (CPU speed, CPU count, memory, etc.).

Benchmarking and Tuning the GROMACS Molecular Dynamics Package on Beowulf Clusters

    Project Resources

    We are developing a model for evaluating and improving the performance of molecular dynamics software. As part of this project Josh Hursey, Josh McCoy, and Charles Peck won the best poster award at the Society for Industrial and Applied Mathematics Parallel Processing conference in February, 2004. The title of their poster was "Benchmarking and Tuning the GROMACS Molecular Dynamics Package on Beowulf Clusters".

    The abstract for our SIAM submission follows:

      Beowulf clusters are a popular platform for running molecular dynamics software such as GROMACS, a commonly used open source package. First we describe how to configure a rigorous benchmarking environment for GROMACS on Beowulf clusters. Then we develop tuning methodologies and latency reduction techniques for this workload. Finally our results demonstrate increased performance as measured by picoseconds of simulation time per unit of wall time on two Beowulf clusters running Linux.
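
    The performance metric named in the abstract, picoseconds of simulation time per unit of wall time, can be sketched as follows. This is only an illustration of the arithmetic: the function names, step counts, timestep, and wall times are hypothetical and are not results from the poster.

```python
def ps_per_wall_second(n_steps: int, timestep_ps: float, wall_seconds: float) -> float:
    """Picoseconds of simulated time produced per second of wall time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return n_steps * timestep_ps / wall_seconds


def relative_gain(baseline: float, tuned: float) -> float:
    """Fractional improvement of a tuned run over a baseline run."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (tuned - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical runs: the same 500,000-step simulation with a 0.002 ps timestep,
    # timed before and after tuning.
    baseline = ps_per_wall_second(500_000, 0.002, 400.0)
    tuned = ps_per_wall_second(500_000, 0.002, 320.0)
    print(f"baseline: {baseline:.3f} ps/s")
    print(f"tuned:    {tuned:.3f} ps/s")
    print(f"gain:     {relative_gain(baseline, tuned):.1%}")
```

    Normalizing by simulated time rather than by step count lets runs with different timesteps be compared on the same scale.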

Numerical Methods

    Project Resources

    Our premise is that the commonly employed techniques for calculating 1/sqrt(x) were developed before the advent of large and fast CPU caches, the broad availability of SIMD instructions in commodity CPU architectures, and the increased memory-to-CPU ratio found in most modern computing hardware. This calculation is important because in most molecular dynamics simulation runs it accounts for the bulk of the CPU time.

    The abstract for our SIAM submission follows:

      We examine the methods employed to calculate 1/sqrt(x) within two popular open source molecular dynamics packages, GROMACS and NAMD. We develop a benchmarking kernel by profiling these packages on two commodity vector (SIMD) architectures, Intel/SSE and PowerPC/AltiVec, over a variety of molecular systems and range of simulation parameters. Using this environment we report on improvements to the methods currently used to calculate 1/sqrt(x) in this context.
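
    For readers unfamiliar with how 1/sqrt(x) is typically approximated, the sketch below shows one widely known approach: a cheap initial estimate refined by Newton-Raphson iterations, y <- y * (1.5 - 0.5 * x * y * y). It is not the code used in GROMACS or NAMD, and it is not the improved method described in the abstract. The bit-level seed and its constant are the well-known single-precision trick, chosen here only as an assumption for illustration, and the function names are hypothetical.

```python
import math
import struct


def rsqrt_seed(x: float) -> float:
    """Rough single-precision estimate of 1/sqrt(x) from the float's bit pattern."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    # Reinterpret the 32-bit float as an unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Halving the bit pattern and subtracting from a magic constant roughly
    # negates and halves the exponent (assumed seed, not from the source).
    bits = 0x5F3759DF - (bits >> 1)
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def rsqrt(x: float, iterations: int = 2) -> float:
    """Refine the seed with Newton-Raphson steps for f(y) = 1/y**2 - x."""
    y = rsqrt_seed(x)
    for _ in range(iterations):
        y = y * (1.5 - 0.5 * x * y * y)
    return y


if __name__ == "__main__":
    for x in (0.25, 1.0, 2.0, 4.0, 100.0):
        exact = 1.0 / math.sqrt(x)
        approx = rsqrt(x)
        print(f"x={x:7.2f}  approx={approx:.8f}  "
              f"rel. error={abs(approx - exact) / exact:.2e}")
```

    Each Newton-Raphson step roughly squares the relative error, so the trade-off between seed quality and iteration count is the kind of choice that caches and SIMD units can shift. A scalar Python version only illustrates the arithmetic and says nothing about performance.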