LittleFe - The portable cluster for computational science education
Many institutions and teaching environments do not have access to parallel
platforms for parallel and distributed computing education. Key concepts
such as speedup, efficiency, and load balancing are taught much more
effectively on a parallel platform. LittleFe is a complete 4 to 8 node
Beowulf-style portable computational cluster.
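As a small worked example of the kind of measurement LittleFe supports, speedup is the ratio of single-node wall time to parallel wall time, and efficiency is speedup divided by the number of nodes. The C sketch below uses hypothetical timings, not LittleFe benchmark figures.

    /* Illustrative sketch: computing speedup and parallel efficiency
       from measured wall-clock times.  The sample timings below are
       hypothetical placeholders, not LittleFe benchmark results. */
    #include <stdio.h>

    int main(void)
    {
        double t_serial   = 120.0;  /* wall time on 1 node, seconds (hypothetical) */
        double t_parallel =  18.0;  /* wall time on p nodes, seconds (hypothetical) */
        int    p          =  8;     /* number of nodes used */

        double speedup    = t_serial / t_parallel;   /* S = T1 / Tp */
        double efficiency = speedup / p;             /* E = S / p   */

        printf("speedup:    %.2f\n", speedup);
        printf("efficiency: %.2f\n", efficiency);
        return 0;
    }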
The main LittleFe site is http://LittleFe.net
Pictures of LittleFe/x86 (Mark I) and a tracker are available at http://cs.earlham.edu/little-fe
A short presentation about LittleFe/PPC (Mark II), from SIAM's Computational Science
and Engineering Conference, and pictures of the new design can be found at
Our wiki documents the hardware design and offers assembly tips
if you would like to build your own LittleFe. You can find it at
The HPC Wire article about LittleFe can be found here.
The Bootable Cluster CD (BCCD) - Software tools for computational science education
We work on three different aspects of Paul Gray's BCCD project: the PowerPC port,
the Liberation project, and computational science curriculum modules.
The main BCCD site is http://bccd.cs.uni.edu
Liberation is a set of scripts that take the BCCD and extract and install
it on a hard drive. This enables the BCCD to be used on fixed clusters as well as in the
ad-hoc, lab-based environments it was originally designed for. More information about the
BCCD Liberation project can be found at http://wiki.cs.earlham.edu/index.php/BCCD:Automated_liberation.
Utilizing the BCCD's list-packages mechanism, we develop curriculum modules
for computational science. These modules are used in undergraduate classes and in workshops
taken by science faculty.
The BCCD is a project of Paul Gray's lab at the University of Northern Iowa, http://bccd.cs.uni.edu
Folding@Clusters - An adaptive framework for protein folding research on tightly coupled clusters
This software tool is designed to make high performance
computational resources easily available to chemists and biologists
for running simulations of large bio-molecules using open source
molecular dynamics packages.
Josh Hursey, Josh McCoy, Charles Peck, and John Schaefer gave a
presentation on this work at SuperComputing 2004 (SC04), in the Purdue University
research area, in November 2004. We also presented this work as a poster at
SIAM's Computational Science and Engineering conference in February 2005.
An article based on this work appears in the November, 2005 issue of
Dr. Dobb's Journal.
The abstract for our SC04 submission follows:
Instead of traditional, tightly coupled massively parallel computing,
current distributed computing projects such as SETI@home or Folding@Home
use a client-server model to perform embarrassingly parallel computing,
allowing one to tap resources (hundreds of thousands of CPUs in PCs
throughout the world) impossible to obtain by other means. However, certain
algorithms could greatly benefit from a hybrid approach, combining the
massive resources available to distributed computing with the tight
coupling traditionally found only in supercomputers.
Towards this end, Folding@Clusters is an adaptive framework for
harnessing tightly coupled cluster resources for protein folding
research. It combines capability discovery, load balancing, process
monitoring, and checkpoint/re-start services to provide a platform for
molecular dynamics simulations on a range of grid-based parallel compute resources.
The raw computing power available for scientific inquiry continues to
grow while the abstraction level of the tools available to scientists
does not advance in a similar manner. Folding@Clusters provides chemists
investigating protein folding with a high-level interface to a variety
of parallel compute architectures, e.g. lab clusters, Beowulf clusters,
large SMP machines, and clusters of SMP machines.
Folding@Clusters uses open source building blocks, such as the GROMACS
molecular dynamics package and the LAM-MPI communications library, to
provide the lowest-level functionality. Building on this foundation we
construct a three-tier architecture: cluster, node, and science core,
which provides a basis on which to abstract the process of performing a
molecular dynamics simulation. This includes work unit preparation,
distribution, and result aggregation on a compute resource with
arbitrary capabilities (CPU speed, CPU count, memory, etc.).
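As a rough illustration of capability-aware work unit preparation, the C sketch below sizes a work unit from a node's reported CPU speed, CPU count, and memory; the structure fields and the scaling rule are assumptions for illustration only, not the Folding@Clusters implementation.

    /* Illustrative sketch only: sizing a work unit from a node's reported
       capabilities (CPU speed, CPU count, memory).  The fields and the
       scaling rule are assumptions, not the Folding@Clusters code. */
    #include <stdio.h>

    struct capabilities {
        double cpu_mhz;   /* per-CPU clock speed  */
        int    cpu_count; /* number of CPUs       */
        double mem_mb;    /* available memory, MB */
    };

    /* Scale a baseline work unit (in simulation steps) by relative compute
       power, capped by available memory. */
    long work_unit_steps(struct capabilities c, long base_steps,
                         double ref_mhz, double steps_per_mb)
    {
        double compute_scale = (c.cpu_mhz * c.cpu_count) / ref_mhz;
        long   steps         = (long)(base_steps * compute_scale);
        long   mem_cap       = (long)(c.mem_mb * steps_per_mb);
        return steps < mem_cap ? steps : mem_cap;
    }

    int main(void)
    {
        struct capabilities node = { 2000.0, 2, 1024.0 };  /* hypothetical node */
        printf("steps assigned: %ld\n",
               work_unit_steps(node, 10000, 1000.0, 50.0));
        return 0;
    }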
Benchmarking and Tuning the GROMACS Molecular Dynamics Package on Beowulf Clusters
We are developing a model for evaluating and improving the
performance of molecular dynamics software. As part of this project
Josh Hursey, Josh McCoy and Charles Peck won the best poster award
at the Society for Industrial and Applied Mathematics Parallel
Processing conference in February, 2004. The title of their poster
was "Benchmarking and Tuning the GROMACS Molecular Dynamics Package
on Beowulf Clusters".
The abstract for our SIAM submission follows:
Beowulf clusters are a popular platform for running molecular
dynamics software such as GROMACS, a commonly used open source package.
First we describe how to configure a rigorous benchmarking environment
for GROMACS on Beowulf clusters. Then we develop tuning methodologies
and latency reduction techniques for this workload. Finally our results
demonstrate increased performance as measured by picoseconds of
simulation time per unit of wall time on two Beowulf clusters running GROMACS.
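The performance metric used here, picoseconds of simulated time per unit of wall time, can be derived from a run's step count, timestep, and elapsed wall-clock time; the C sketch below uses hypothetical numbers rather than results from these benchmark runs.

    /* Illustrative sketch: picoseconds of simulation per day of wall time.
       The step count, timestep, and wall time below are hypothetical. */
    #include <stdio.h>

    int main(void)
    {
        long   nsteps       = 500000;   /* MD steps completed (hypothetical) */
        double dt_fs        = 2.0;      /* timestep in femtoseconds          */
        double wall_seconds = 36000.0;  /* elapsed wall-clock time, seconds  */

        double simulated_ps = nsteps * dt_fs / 1000.0;            /* fs -> ps */
        double ps_per_day   = simulated_ps * 86400.0 / wall_seconds;

        printf("%.1f ps simulated, %.1f ps/day\n", simulated_ps, ps_per_day);
        return 0;
    }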
Our premise is that the techniques commonly employed to calculate 1/sqrt(x)
were developed before the advent of large and fast CPU caches, the broad
availability of SIMD instructions in commodity CPU architectures, and the
increased memory to CPU ratio found in most modern computing hardware. This
calculation is important since for most molecular dynamics simulation runs it
accounts for the bulk of the CPU time.
The abstract for this SIAM submission follows:
We examine the methods employed to calculate 1/sqrt(x) within two
popular open source molecular dynamics packages, GROMACS and NAMD. We
develop a benchmarking kernel by profiling these packages on two commodity
vector (SIMD) architectures, Intel/SSE and PowerPC/AltiVec, over a variety
of molecular systems and range of simulation parameters. Using this
environment we report on improvements to the methods currently used to
calculate 1/sqrt(x) in this context.
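One widely used technique on Intel/SSE hardware, and a minimal sketch of the kind of method examined in this work, is to take the processor's low-precision reciprocal square root estimate and refine it with a single Newton-Raphson iteration; the code below illustrates that general approach and is not the GROMACS or NAMD implementation.

    /* Illustrative sketch: 1/sqrt(x) on Intel/SSE using the hardware
       rsqrtps estimate refined by one Newton-Raphson iteration.
       This shows the general technique, not the GROMACS or NAMD code. */
    #include <stdio.h>
    #include <xmmintrin.h>

    /* Newton-Raphson step: y' = y * (1.5 - 0.5 * x * y * y) */
    static __m128 rsqrt_nr(__m128 x)
    {
        __m128 y     = _mm_rsqrt_ps(x);            /* ~12-bit estimate */
        __m128 half  = _mm_set1_ps(0.5f);
        __m128 three = _mm_set1_ps(3.0f);
        __m128 xyy   = _mm_mul_ps(x, _mm_mul_ps(y, y));
        return _mm_mul_ps(_mm_mul_ps(half, y), _mm_sub_ps(three, xyy));
    }

    int main(void)
    {
        float in[4] = { 1.0f, 2.0f, 4.0f, 10.0f };
        float out[4];
        _mm_storeu_ps(out, rsqrt_nr(_mm_loadu_ps(in)));
        for (int i = 0; i < 4; i++)
            printf("1/sqrt(%g) ~= %g\n", in[i], out[i]);
        return 0;
    }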