Cluster Information

From Earlham Cluster Department

Revision as of 03:23, 9 November 2012

SC12 Prep

Friday in Richmond

Shopping:

  1. #6-32 x 3/4" pan head (how many?)
  2. HST for wiring harness
  3. MicroFe bolts and fasteners
  4. 1" velcro
  5. Bolts, nuts, nylon washers for WiFi/Bluetooth mount (? x 1/2" - 14)

To Do:

  1. Cut MicroFe plywood and plexiglass
  2. Organize bars (4 x 14 units)
  3. Cut main board plate pads for 14 units
  4. Finish 14? + 1 wiring harnesses
  5. Test BCCD release candidate
  6. Organize 15 tool kits
  7. Organize extra fasteners kit and extra parts kit
  8. Organize MicroFe parts - disk, 2 main boards, USB-Ethernet, switch plate,
  9. Build MicroFe

Take to SLC

  1. Projectors, power strips, network jumpers, assembly tools (ordered 10 + 5 from lab + Charlie)
  2. Files (ease vent holes, enlarge switch hole, clean plate slots), soldering iron and solder, drill, bits, electric driver,
  3. Shirts from Charlie's office

To Do Saturday in SLC

  1. Plan Buildout - include shutdown, startup, run loops
  2. Update assembly instructions - harvest pre-assembled frames section, change network switch mounting screws
  3. Make 18 BCCD USB
  4. Scout Buildout Room
  5. Call Freeman for pick up
  6. Changes to Poster (FedEx Office Print & Ship Center, Salt Lake City, UT; open until 9pm)

To Do Sunday in SLC

  1. Put stickers on
  2. Pre-assemble frame rails to plates, ease ventilation holes in center plate, enlarge switch hole on 2 hole plates
  3. Check inventory
  4. Put serial numbers on every unit (this batch and previous) that we see, e.g. v4-# and v4a-#
  5. Check one board in each kit for BIOS upgrade

Fall, 2011

26 October

21 September

LittleFe v4

Before Shipping Date for SC11

Before SC11

Before OK/PUPR:

Saturday July 30th:

Take to OK/PUPR:


MicroFe:

LF v4:

LF v3:

BW Internship

BobSCEd:

Mobeen:

Events

BW Undergraduate Petascale Institute @ NCSA - 29 May through 11 June

Introduction to Parallel Programming and Cluster Computing workshop @ UW-ISU - 25 June through 1 July

Intermediate Parallel Programming and Distributed Computing workshop @ OU-PUPR - 29 July through 6 August

Projects

Logs

Personal Schedules

Al-Salam/CCG Downtime tasks

How to use the PetaKit

  1. If you do not already have it, obtain the source for the PetaKit from the CVS repository on hopper (curriculum-modules/PetaKit).
  2. cd to the Subkits directory of PetaKit and run area-subkit.sh to make an area subkit tarball, or GalaxSee-subkit.sh to make a GalaxSee subkit tarball.
  3. scp the tarball to the target resource and unpack it.
  4. cd into the directory and run ./configure --with-mpi --with-openmp
  5. Use stat.pl and args_man to make an appropriate statistics run. See args_man for a description of predicates. Example:
     perl -w stat.pl --program area --style serial,mpi,openmp,hybrid --scheduler lsf --user leemasa --problem_size 200000000000 \
       --processes 1,2,3,4,5,6,7,8-16-64 --repetitions 10 -m -tag Sooner-strongest-newest --mpirun mpirun.lsf --ppn 8
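
When several statistics runs are needed, the example invocation above could be assembled programmatically. The following is a minimal sketch in Python. The helper name build_stat_command and its parameter names are hypothetical and not part of PetaKit, and it emits only the options that appear in the example (see args_man for what they mean). It builds the command line and does not execute it.

```python
import shlex


def build_stat_command(program, styles, scheduler, user, problem_size,
                       processes, repetitions, tag, mpirun, ppn):
    """Return the argv list for a stat.pl run (hypothetical helper).

    Only options shown in the wiki example are emitted, in the same order.
    """
    return [
        "perl", "-w", "stat.pl",
        "--program", program,
        "--style", ",".join(styles),
        "--scheduler", scheduler,
        "--user", user,
        "--problem_size", str(problem_size),
        "--processes", processes,
        "--repetitions", str(repetitions),
        "-m",
        "-tag", tag,
        "--mpirun", mpirun,
        "--ppn", str(ppn),
    ]


# Values copied from the example run on this page.
argv = build_stat_command(
    program="area",
    styles=["serial", "mpi", "openmp", "hybrid"],
    scheduler="lsf",
    user="leemasa",
    problem_size=200000000000,
    processes="1,2,3,4,5,6,7,8-16-64",
    repetitions=10,
    tag="Sooner-strongest-newest",
    mpirun="mpirun.lsf",
    ppn=8,
)

# Quote each argument so the line can be pasted into a shell.
command_line = " ".join(shlex.quote(arg) for arg in argv)
print(command_line)
```

Keeping the arguments as a list until the last step means the same list could later be handed to subprocess.run without any shell quoting concerns.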

Modifying Programs for Use with PetaKit

Old To Do

Date represents last meeting where we discussed the item

 * Nice new functionality, see 
 * Waiting on clean data to finish multiple resource displays
 * Error bars for left and right y-axes with checkboxes for each

In the first box, put the initials of whoever is doing the run.

In the second box, record the status:

 * B = builds
 * R = runs
 * D = reports back to database
 * S = the database holds a good set of runs (10 per data point) for strong scaling, and they appear on a graph
 * W = the database holds a good set of runs (10 per data point) for weak scaling, and they appear on a graph

area under curve GalaxSee
Serial MPI OpenMP Hybrid Serial MPI OpenMP Hybrid
ACLs Sam Sam Sam Sam AW AW AW AW
BobSCEd
BigRed Sam Sam Sam Sam
Sooner Sam Sam Sam Sam
pople AW/CP AW/CP AW/CP AW/CP
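
Since a good set means 10 runs per data point, each point on a graph needs a summary of its repetitions. The sketch below shows one way to do that in Python, assuming wall-clock times in seconds keyed by process count. The function names and the timing values are made up for illustration and do not come from the database. The mean and sample standard deviation could feed the error bars, and speedup is taken as the usual ratio of the baseline mean time to the mean time at p processes for a fixed problem size.

```python
from statistics import mean, stdev


def summarize(runs):
    """Map process count -> (mean time, sample standard deviation)."""
    return {p: (mean(times), stdev(times)) for p, times in runs.items()}


def strong_scaling_speedup(summary, baseline=1):
    """Speedup T(baseline) / T(p), computed from the mean times."""
    t_base = summary[baseline][0]
    return {p: t_base / t_mean for p, (t_mean, _) in summary.items()}


# Made-up timings (seconds); a real set would have 10 repetitions per point.
runs = {
    1: [100.0, 102.0, 98.0],
    2: [50.0, 52.0, 48.0],
    4: [25.0, 27.0, 23.0],
}

summary = summarize(runs)
speedup = strong_scaling_speedup(summary)

for p in sorted(summary):
    t_mean, t_sd = summary[p]
    print(f"p={p}: mean={t_mean:.1f}s sd={t_sd:.1f}s speedup={speedup[p]:.2f}")
```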

Problem-sizes


 * wiki page
 * Decommission Cairo
 * Figure out how to mount on Telco Rack
 * Get pdfs of all materials -- post them on wiki
 * Get Fitz's liberation instructions into wiki
 * Get Kevin's VirtualBox instructions into wiki
 * PXE booting -- see if they booted, if you can ssh to them, if the run matrix works
 * Send /etc/bccd-revision with each email
 * Send output of netstat -rn and /sbin/ifconfig -a with each email
 * Run Matrix
 * For the future: scripts to boot & change BIOS, watchdog timer, 'test' mode in BCCD, send emails about errors
 * USB scripts -- we don't need the "copy" script
 * Leaving 8:00 Wednesday
 * Brad, Sam, or Gus pick up the van around 7, bring it by loading dock outside Noyes
 * Posters -- new area runs for graphs, start implementing stats collection and OpenMP, print at small size (what is that?)
 * Take 2 LittleFes, small switch, monitor/keyboard/mouse (wireless), printed matter
 * Next meeting: Saturday 6/Feb @ 3 pm
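
The list above asks that each email include /etc/bccd-revision and the output of netstat -rn and /sbin/ifconfig -a. A minimal sketch of gathering those three items into one text report follows. The function names and the report layout are assumptions made for illustration, not an existing BCCD script. Anything that cannot be read or run is noted in the report instead of raising an error.

```python
import subprocess

REVISION_FILE = "/etc/bccd-revision"
COMMANDS = [["netstat", "-rn"], ["/sbin/ifconfig", "-a"]]


def run_command(argv):
    """Run argv and return its combined output, or a note if it cannot run."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True, check=False)
    except OSError as exc:
        return f"(could not run: {exc})"
    return result.stdout + result.stderr


def read_file(path):
    """Return the contents of path, or a note if it cannot be read."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError as exc:
        return f"(could not read: {exc})"


def build_report(run=run_command, read=read_file):
    """Assemble one text report: the revision file first, then each command."""
    sections = [f"== {REVISION_FILE} ==\n{read(REVISION_FILE)}"]
    for argv in COMMANDS:
        sections.append(f"== {' '.join(argv)} ==\n{run(argv)}")
    return "\n".join(sections)


if __name__ == "__main__":
    print(build_report())
```

The run and read parameters exist so the report layout can be checked on a machine that has neither BCCD nor these tools installed.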

Generalized, Modular Parallel Framework

10,000 foot view of problems

This conceptual view may not reflect the current code.
Area
 * Parent process sends out: function, bounds, segment size or count
 * Children send back: sum of area for specified bounds
 * Results compiled by: sum

GalaxSee
 * Parent process sends out: complete array of stars, bounds (which stars to compute)
 * Children send back: an array containing the computed stars
 * Results compiled by: construct a new array of stars and repeat for next time step

Matrix x Matrix
 * Parent process sends out: n rows from Matrix A and n columns from Matrix B, location of rows and cols
 * Children send back: n resulting matrix position values, their location in results matrix
 * Results compiled by: construct new result array
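
The Area decomposition above can be illustrated with a small Python sketch: the parent splits the bounds, each child returns the area for its piece, and the parent sums the results. This is only an illustration under stated assumptions. Threads stand in for MPI processes, the midpoint rule is an arbitrary choice of integration method, and the function names are hypothetical rather than taken from the PetaKit code.

```python
from concurrent.futures import ThreadPoolExecutor


def child_area(func, lower, upper, segments):
    """Child: midpoint-rule area of func on [lower, upper] using `segments` rectangles."""
    width = (upper - lower) / segments
    heights = (func(lower + (i + 0.5) * width) for i in range(segments))
    return sum(heights) * width


def parent_area(func, lower, upper, segments_per_child, children):
    """Parent: split the bounds, hand one piece to each child, sum what comes back."""
    step = (upper - lower) / children
    bounds = [(lower + k * step, lower + (k + 1) * step) for k in range(children)]
    with ThreadPoolExecutor(max_workers=children) as pool:
        partial_areas = pool.map(
            lambda b: child_area(func, b[0], b[1], segments_per_child), bounds
        )
        return sum(partial_areas)


# Area under x^2 from 0 to 3; the exact value is 9.
area = parent_area(lambda x: x * x, 0.0, 3.0, segments_per_child=1000, children=4)
print(f"{area:.6f}")
```

The GalaxSee and Matrix x Matrix rows follow the same shape: only what is sent out, what comes back, and how the parent combines it change.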

Visualizing Parallel Framework

http://cs.earlham.edu/~carrick/parallel/parallelism-approaches.png

Parallel Problem Space

Summer of Fun (2009)

An external doc for GalaxSee
Documentation for OpenSim GalaxSee

What's in the database?

np X-XX                               acl0-5   bs0-5 GigE   bs0-5 IB
GalaxSee (MPI)                        2-20     2-48         2-48
area-under-curve (MPI, openmpi)       2-12     2-48         2-48
area-under-curve (Hybrid, openmpi)    2-20     2-48         2-48

What works so far? B = builds, R = runs, W = works

area under curve GalaxSee (standalone)
Serial MPI OpenMP Hybrid Serial MPI OpenMP Hybrid
acls BRW BRW BRW BRW BR
bobsced0 BRW BRW BRW BRW BR
c13 BR
BigRed BRW BRW BRW BRW
Sooner BRW BRW BRW BRW
pople
Charlie's laptop BR

To Do

Implementations of area under the curve

GalaxSee Goals

GalaxSee - scale to petascale with MPI and OpenMP hybrid.

LittleFe

Notes from May 21, 2009 Review

BobSCEd Upgrade

Build a new image for BobSCEd:

  1. One of the SuSE versions supported for Gaussian09 on EM64T [v11.1]. Supported platforms: Red Hat Enterprise Linux 5.3; SuSE Linux 9.3, 10.3, 11.1; or SuSE Linux Enterprise 10 (see G09 platform list). Note: CentOS 5.3 runs the Gaussian binaries for RHEL OK.
  2. Firmware update?
  3. C3 tools and configuration [v4.0.1]
  4. Ganglia and configuration [v3.1.2]
  5. PBS and configuration [v2.3.16]
  6. /cluster/bobsced local to bs0
  7. /cluster/... passed-through to compute nodes
  8. Large local scratch space on each node
  9. Gaussian09
  10. WebMO and configuration [v9.1] - Gamess, Gaussian, Mopac, Tinker
  11. InfiniBand and configuration
  12. GNU toolchain with OpenMPI and MPICH [GCC v4.4.0], [OpenMPI v1.3.2] [MPICH v1.2.7p1]
  13. Intel toolchain with OpenMPI and native libraries
  14. Sage with do-dads (see Charlie)
  15. SystemImager for the client nodes?

Installed:

Fix the broken nodes.

(Old) To Do

BCCD Liberation

Curriculum Modules

LittleFe

Infrastructure

SC Education

Current Projects

Past Projects

General Stuff

Items Particular to a Specific Cluster

Curriculum Modules

Possible Future Projects

Archive
