Cluster: New BobSCEd Install Log

From Earlham Cluster Department

Revision as of 23:34, 21 April 2010

The source code for *anything* installed locally is in /usr/local/src. The source for *anything* installed on NFS is in /mounts/bobsced/usr/local/src.

Modules Software

Intel Compilers

Openmpi

MPICH

./configure --prefix=/mounts/bobsced/usr/local/modules-sw/mpich2/2.1.2 --enable-cxx --enable-f90 --enable-f77 --enable-threads=multiple --with-thread-package=posix

Using Modules

Important commands -

Log

Green color indicates something that still needs to be done.

Cloning

Head Node

Yum installed:

Install C3 tools from http://www.csm.ornl.gov/torc/C3/C3softwarepage.shtml

Ganglia

Networking

static_routes="bs0"
route_bs0="192.168.0.1 159.28.234.200"

Modules

Torque

Maui

Intel Firmware Updates

Mail

NFS

WebMO

Path to perl:         /usr/bin/perl
Webserver name:       bs0-new.cluster.earlham.edu
HTML directory:       /var/www/webmo
HTML URL:             /webmo
CGI script directory: /var/www/cgi-bin
CGI script URL:       /cgi-bin
User files directory: /mounts/bobsced/WebMO
SuexecUserGroup bob users
Erroneous write during file extend. write 160 instead of 4096
Probably out of disk space.
Write error in NtrExt1: No such file or directory

or

Write error in NtrExt1: Bad address

Infiniband

The 2.6.18-164.el5 kernel is installed, but do not have drivers available. 
Cannot continue.
ln -s /usr/mst/lib/2.6.18-128.el5/ /usr/mst/lib/2.6.18-164.2.1.el5.plus
[root@bs0-new ~]# mst start
    Starting MST (Mellanox Software Tools) driver set: 
Loading MST PCI module                                     [  OK  ]
Loading MST PCI configuration module                       [  OK  ]
Saving configuration for PCI device 01:00.0                [  OK  ]
Create devices

Cluster Image Review

{|
! !! Command Line -np 8 !! Command Line -np 9 !! Torque -np 8 !! Torque -np 9 !! Machinefile !! Notes
|-
| mpich1 || OK, except allocates 1 process to bs0 || OK, except allocates 1 process to bs0 || OK || OK || /mounts/bobsced/usr/local/modules-sw/mpich1/1.2.7p1/share/machines.LINUX || Creates temporary file PIxxxxx while running under qsub; MPI does not respect the number of nodes given to qsub
|-
| mpich2-osc || N/A || N/A || OK, MUST specify # nodes, ppn || OK, MUST specify # nodes, ppn || Generated by PBS || Currently uses OSC's PBS-specific mpiexec (cannot be run outside of qsub)
|-
| mpich2 || OK, must set up mpd ring || OK, must set up mpd ring || OK, must set up mpd ring || OK, must set up mpd ring || N/A (mpd ring) || Requires chmod 600 .mpd.conf in home directory with MPD_SECRETWORD=somevalue; set up ring with <code><nowiki>sort $PBS_NODEFILE | uniq -c | awk '{print $2":"$1}' > /cluster/home/kwanous/tmp/mpd.nodes; mpdboot -f /cluster/home/kwanous/tmp/mpd.nodes -n 2</nowiki></code>
|-
| openmpi 1.* || OK (can send machinefile) || OK (can send machinefile) || All runs on one node (can send machinefile) || || ||
|}
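
The mpich2 note above asks for a private .mpd.conf before the mpd ring can be started. A minimal sketch of that step follows. It assumes GNU coreutils; the function name write_mpd_conf is hypothetical, and "somevalue" is the placeholder from the table, not a real secret.

```shell
#!/bin/bash
# Sketch only: create the per-user MPD configuration file described in the table.
# write_mpd_conf is a hypothetical helper name, not part of MPICH2.
write_mpd_conf() {
    conf="$1"
    # Do not overwrite an existing file; create a new one readable by the owner only.
    if [ ! -e "$conf" ]; then
        ( umask 077; echo "MPD_SECRETWORD=somevalue" > "$conf" )
    fi
    # mpd expects the file to be inaccessible to other users.
    chmod 600 "$conf"
}
```

On a cluster account this would presumably be called once as write_mpd_conf "$HOME/.mpd.conf", after replacing somevalue with a private string.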

Submission scripts:
Mpich1

Mpich2

#!/bin/bash
#PBS -N testmpich2
#PBS -l cput=00:60:00
#PBS -l nodes=2:ppn=4

. /etc/profile.d/modules.sh
module load modules modules-init modules-bobsced
module load mpich2

sort $PBS_NODEFILE | uniq -c | awk '{print $2":"$1}' > /cluster/home/kwanous/tmp/mpd.nodes

mpdboot -f /cluster/home/kwanous/tmp/mpd.nodes -n 2
mpiexec -np 9 /cluster/home/kwanous/a.out
mpdallexit
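
To see what the sort/uniq/awk pipeline in the script above writes to mpd.nodes, the sketch below runs it against a stand-in node file. The hostnames b1 and b2 are invented for illustration. The sample assumes Torque lists each host once per allocated processor, as it would for nodes=2:ppn=4.

```shell
#!/bin/bash
# Hypothetical stand-in for $PBS_NODEFILE: one line per allocated processor.
nodefile=$(mktemp)
printf '%s\n' b1 b1 b1 b1 b2 b2 b2 b2 > "$nodefile"

# Same pipeline as in the submission script: count the lines per host,
# then print them in the host:count form that the script passes to mpdboot -f.
mpd_nodes=$(sort "$nodefile" | uniq -c | awk '{print $2":"$1}')
echo "$mpd_nodes"

# The number of distinct hosts could replace the hard-coded "-n 2" for mpdboot.
num_hosts=$(echo "$mpd_nodes" | wc -l | tr -d ' ')
echo "hosts: $num_hosts"

rm -f "$nodefile"
```

For this sample the pipeline yields one line per host (b1:4 and b2:4) and a host count of 2, which matches the -n 2 used in the script.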

Mpich2 - OSC

#!/bin/bash
#PBS -N testmpich1
#PBS -l cput=00:60:00
#PBS -l nodes=2:ppn=4

hostname
. /etc/profile.d/modules.sh
module load modules modules-init modules-bobsced
module load mpich2
mpirun -np 8 /cluster/home/kwanous/a.out
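
The review table notes that the OSC launcher cannot be run outside of qsub. One possible guard is sketched below. It relies only on the PBS_NODEFILE variable that the other script on this page already uses; the function name in_pbs_job is hypothetical.

```shell
#!/bin/bash
# Sketch only: succeed when Torque/PBS has provided a readable node file.
# in_pbs_job is a hypothetical helper name.
in_pbs_job() {
    [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]
}

if in_pbs_job; then
    echo "Running under PBS with $(wc -l < "$PBS_NODEFILE") slot(s)"
else
    echo "Not inside a PBS job; submit this script with qsub" >&2
fi
```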

Openmpi
