Plumbing - Summer 2004
Last updated: Wednesday, 15-Sep-2004 19:36:01 EDT
The current version of this file should be kept at
http://cluster.earlham.edu/project/sna/plumbing-summer-2004.html
automagically by CVS when updates to it are committed. This file is in the
sna CVS module.
Do not remove items from these lists; instead, cut and paste them to the "Completed
items follow:" portion at the end of each section. Make an entry in log.html and
include any useful or necessary technical details there, not in this file.
Items marked with a * are particularly important (at this moment in time).
General (ordered)
- *Build and test new image for bazaar:
- Remove all firewall modules, configurations, etc.
- SystemImager
- cexecs (and other tools; ckillall, etc.) installed on all nodes.
c3.conf should be in /cluster/[cluster]/etc for each cluster.
- DBI-Proxy
- Image annex.
- *Build and test new image for cairo:
- Remove all firewall modules, configurations, etc.
- Yellow Dog distro upgrade? Check.
- Is SystemImager the right tool for YD/PPC? Yes, for now it is.
- cexecs (and other tools; ckillall, etc.) installed on all nodes.
c3.conf should be in /cluster/[cluster]/etc for each cluster.
- DBI-Proxy
- *Build and test new image for athena:
- Remove all firewall modules, configurations, etc.
- cexecs (and other tools; ckillall, etc.) installed on all nodes.
c3.conf should be in /cluster/[cluster]/etc for each cluster.
- SystemImager
- DBI-Proxy
- Install WordPress (and MySQL) on admin, test-drive.
- Upgrade gnuplot.
- Review test run architecture, code clean-up and review, enhancements?
- Later on (maybe during the Fall):
- Check the status of kernel 2.6, NAPI, and the low
latency module.
- Cricket/Ganglia cleanup.
- Completed items follow:
Athena
- LAM-MPI 7.0.6 w/ Trillium installed in /cluster/$CLUSTER/[src & software]
- Set up for testing F@C.
- Completed items follow:
Bazaar
- LAM-MPI 7.0.6 w/ Trillium installed in /cluster/$CLUSTER/[src & software]
- Why is cexecs so slow (DNS?)
- Completed items follow:
Cairo
- LAM-MPI 7.0.6 w/ Trillium installed in /cluster/$CLUSTER/[src & software]
- RMA bad cairo drive
- What's up with the network? See dmesg on c0.
- Completed items follow:
Hopper.c.e.e/admin.c.e.e
- *bad tape in drive?
- *backup audit
- Can we do 80MB/s on both of hopper's SCSI interfaces? Currently one
of them is running at 40MB/s. Would it be better to move both drives to
one (80MB/s) channel? (Probably.)
- Install a full version of X (client and server) on hopper.
- Is there any firewall software running on hopper? External interface,
internal interface. Make sure it's off on the internal interface.
- Completed items follow:
- joshh - Installed latex tools on hopper
- joshh - Installed aspell and ispell on hopper
Switch Fabric
Documentation (published at http://cluster.earlham.edu/generic/doc)
- Write "Building [a-c]0". Apply the image, set up routing, other stuff?
- gnuplot howto.
- Review and update as necessary all diagrams and existing documentation.
- Changing of the guard - machines, accounts, previous values
- NIS configuration, DNS configuration, DHCP configuration, SystemImager,
- Text that describes where and how to install software: one cluster (for each
cluster), all clusters, hopper.
Whenever we install a mostly standalone package (MPI, PVM, etc.) we would
create a user for it on hopper and use the software group. We would also
create a separate directory in /cluster/[cluster]/ for these mostly
standalone packages. This will necessitate having an /etc/profile on each
machine that exports fairly long PATH and LD_LIBRARY_PATH shell variables, since
each of these packages is likely to have its own bin and/or lib.
- How to add a node to a cluster (for each cluster)
- How to add a cluster
- Completed items follow:
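
The /etc/profile arrangement described in the software-installation item of the Documentation list might be sketched roughly as follows. This is only a sketch under assumptions: it supposes each mostly standalone package sits in its own directory under a per-cluster root such as /cluster/$CLUSTER/software (the layout hinted at by the LAM-MPI entries), that CLUSTER is already set on the machine, and that each package keeps its binaries in bin and its libraries in lib. The helper name add_cluster_packages is hypothetical, not an existing tool.

```sh
# Sketch of an /etc/profile fragment (POSIX sh).
# Assumption: packages live in <root>/<package>/ with optional bin/ and lib/.

# add_cluster_packages ROOT
# Hypothetical helper: appends every <ROOT>/<package>/bin to PATH and every
# <ROOT>/<package>/lib to LD_LIBRARY_PATH, then exports both variables.
add_cluster_packages() {
    root=$1
    for pkg in "$root"/*/; do
        pkg=${pkg%/}    # drop the trailing slash left by the glob
        if [ -d "$pkg/bin" ]; then
            PATH="${PATH:+$PATH:}$pkg/bin"
        fi
        if [ -d "$pkg/lib" ]; then
            LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$pkg/lib"
        fi
    done
    export PATH LD_LIBRARY_PATH
    unset root pkg      # avoid leaking loop variables into login shells
}

# Assumed layout: /cluster/$CLUSTER/software holds one directory per package.
if [ -n "${CLUSTER:-}" ] && [ -d "/cluster/$CLUSTER/software" ]; then
    add_cluster_packages "/cluster/$CLUSTER/software"
fi
```

Scanning the directory avoids editing /etc/profile each time a package is added. Entries are appended rather than prepended so system binaries keep precedence; whether that is the right order for packages such as MPI is a judgment call.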