Al-salam

From Earlham Cluster Department

(Difference between revisions)
(compute nodes)
(Have done)
 
(13 intermediate revisions not shown)
Line 3: Line 3:
  == Installation Notes ==
  === headnode ===
+ I'll be maintaining a script, <tt>/root/install/al-salam.sh</tt>, that will also serve as a log. Also following along with [[Cluster: New BobSCEd Install Log#Head Node|BobSCEd-new logs]] for consistency between clusters.
+
+ ==== TODO ====
+ * MPI
+ * Software installations into /cluster/al-salam
+ * User auth via bs0-new's ldap
+ * torque/maui
+ * Ganglia
+ * shorewall
+ * modules
+
+ ==== Have done ====
+
+ * yum install:
+ gcc.x86_64 gcc-c++.x86_64 gcc-gfortran.x86_64 \
+ gcc44.x86_64 gcc44-c++.x86_64 gcc44-gfortran.x86_64 \
+ apr-x86_64 apr-devel.x86_64 expat-devel.x86_64 \
+ blas.x86_64 dhcp.x86_64
+ * rpm install:
+ ** c3
+ ** libconfuse
+ ** libconfuse-devel
+ * /etc/c3.conf:
+ cluster al-salam {
+     as0.cluster.earlham.edu:as0.al-salam.loc
+     as[1-12]
+ }
+ * in hopper:/etc/rc.conf:
+ static_routes="bs0 as0"
+ route_as0="192.168.1.1 159.28.234.150"
+ * hopper:/etc/namedb/master/cluster.zone:
+ as0.cluster.earlham.edu.      IN  A 159.28.234.150
+ as.cluster.earlham.edu.      IN  CNAME as0
+ al-salam.cluster.earlham.edu. IN  CNAME as0
+ * hopper: /etc/namedb/named.conf
+ <pre>
+ acl al-salam {
+         192.168.1.0/24; // Al-Salam internal network
+         159.28.234.150; // Al-Salam headnode
+ };
+
+ view al-salam {
+         match-clients { al-salam; };
+
+         zone "al-salam.loc" {
+                 type master;
+                 allow-transfer { none; };
+                 file "master/al-salam.loc";
+         };
+
+         zone "1.168.192.in-addr.arpa" {
+                 type master;
+                 allow-transfer { none; };
+                 file "master/1.168.192.in-addr.arpa";
+         };
+         zone "cluster.earlham.edu" {
+                 type master;
+                 allow-transfer { servers; };
+                 file "master/cluster.zone";
+         };
+         zone "234.28.159.IN-ADDR.ARPA" {
+                 type master;
+                 allow-transfer { servers; };
+                 file "master/159.28.234.zone";
+         };
+
+         zone "." {
+                 type hint;
+                 file "master/named.root";
+         };
+ };
+ </pre>
+ * hopper:/etc/namedb/master/al-salam.loc
+ ** copy from bobsced.loc, amend as necessary
+ * hopper:/etc/namedb/master/1.168.192.in-addr.arpa
+ ** copy from 0.168.192.in-addr.arpa, amend as necessary
+ * hopper:/etc/namedb/master/159.28.234.zone
+ 150 IN  PTR as0.cluster.earlham.edu.
+ * hopper:/usr/local/etc/dhcpd.conf
+ <pre>
+         subnet 192.168.1.0 netmask 255.255.255.0 {
+
+                 option routers                  192.168.1.1;
+                 option subnet-mask              255.255.255.0;
+                 option domain-name              "al-salam.loc";
+                 option domain-name-servers      159.28.234.1;
+
+                 next-server                    159.28.234.17;
+                 filename "pxelinux.0";
+
+                 host as1.al-salam.loc { hardware ethernet 00:30:48:F2:99:DC; fixed-address 192.168.1.101; }
+                 host as2.al-salam.loc { hardware ethernet 00:30:48:F3:0D:32; fixed-address 192.168.1.102; }
+                 host as3.al-salam.loc { hardware ethernet 00:30:48:F2:99:DA; fixed-address 192.168.1.103; }
+                 host as4.al-salam.loc { hardware ethernet 00:30:48:F2:99:CC; fixed-address 192.168.1.104; }
+                 host as5.al-salam.loc { hardware ethernet 00:30:48:F2:99:C4; fixed-address 192.168.1.105; }
+                 host as6.al-salam.loc { hardware ethernet 00:30:48:F2:9A:06; fixed-address 192.168.1.106; }
+                 host as7.al-salam.loc { hardware ethernet 00:30:48:F3:0D:30; fixed-address 192.168.1.107; }
+                 host as8.al-salam.loc { hardware ethernet 00:30:48:F2:99:D6; fixed-address 192.168.1.108; }
+                 host as9.al-salam.loc { hardware ethernet 00:30:48:F2:99:C6; fixed-address 192.168.1.109; }
+                 host as10.al-salam.loc { hardware ethernet 00:30:48:F2:9A:0A; fixed-address 192.168.1.110; }
+                 host as11.al-salam.loc { hardware ethernet 00:30:48:F2:99:E0; fixed-address 192.168.1.111; }
+                 host as12.al-salam.loc { hardware ethernet 00:30:48:F2:99:A2; fixed-address 192.168.1.112; }
+ </pre>
+ * as0:/etc/dhcrelay
+ # Command line options here
+ INTERFACES="eth0 eth1"  # on layout both interfaces are required, originally only one was listed here
+ DHCPSERVERS="cluster.earlham.edu"
+
+ * The extra bits on layout (all as root)
+ $ yum install -y dhcp
+ $ /etc/sysconfig/dhcrelay
+ INTERFACES="eth0 eth1"
+ DHCPSERVERS="cluster.earlham.edu"
+ $ iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
+ $ service iptables save
+
+ * hopper:
+   vi exports # add entries (check to make sure they aren't already covered by existing rules)
+   vi hosts.allow # add entries
  === compute nodes ===
Line 301: Line 420:
  * Price Tag: $32,910.56
- ==Newegg Quote #2==
+ ==Newegg Quote #2 (the one we purchased?) ==
+
  * 2x [http://secure.newegg.com/WishList/PublicWishDetail.aspx?WishListNumber=8560589 Newegg list]
  ** 1U

Latest revision as of 19:38, 15 December 2013

Al-Salam is the working name for the Earlham Computer Science Department's upcoming cluster computer.

Installation Notes

headnode

I'll be maintaining a script, /root/install/al-salam.sh, that will also serve as a log. I'm also following along with the BobSCEd-new logs for consistency between clusters.

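TODO

  * MPI
  * Software installations into /cluster/al-salam
  * User auth via bs0-new's ldap
  * torque/maui
  * Ganglia
  * shorewall
  * modules

Have done

  * yum install: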
gcc.x86_64 gcc-c++.x86_64 gcc-gfortran.x86_64 \
gcc44.x86_64 gcc44-c++.x86_64 gcc44-gfortran.x86_64 \
apr.x86_64 apr-devel.x86_64 expat-devel.x86_64 \
blas.x86_64 dhcp.x86_64
  * rpm install:
    * c3
    * libconfuse
    * libconfuse-devel
  * /etc/c3.conf:
cluster al-salam {
    as0.cluster.earlham.edu:as0.al-salam.loc
    as[1-12]
}
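
The `as[1-12]` line in the c3.conf stanza above is a range shorthand for the twelve compute nodes. As a rough illustration only (this is not C3's own parser, and `expand_c3_range` is a hypothetical helper name), the expansion can be sketched in Python like this:

```python
import re

def expand_c3_range(spec):
    """Expand a c3.conf-style range such as 'as[1-12]' into host names.

    Hypothetical helper for illustration; C3 does its own parsing.
    """
    match = re.fullmatch(r"([A-Za-z0-9_.-]*)\[(\d+)-(\d+)\]", spec)
    if match is None:
        # No bracketed range: treat the spec as a single host name.
        return [spec]
    prefix = match.group(1)
    first, last = int(match.group(2)), int(match.group(3))
    # Both ends of the range are inclusive.
    return [f"{prefix}{n}" for n in range(first, last + 1)]

nodes = expand_c3_range("as[1-12]")
print(nodes[0], nodes[-1], len(nodes))  # as1 as12 12
```

The resulting names line up with the as1 through as12 host entries used elsewhere in these notes.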
  * in hopper:/etc/rc.conf:
static_routes="bs0 as0"
route_as0="192.168.1.1 159.28.234.150"
  * hopper:/etc/namedb/master/cluster.zone:
as0.cluster.earlham.edu.      IN  A 159.28.234.150
as.cluster.earlham.edu.       IN  CNAME as0
al-salam.cluster.earlham.edu. IN  CNAME as0
  * hopper:/etc/namedb/named.conf
acl al-salam {
        192.168.1.0/24; // Al-Salam internal network
        159.28.234.150; // Al-Salam headnode
};

view al-salam {
        match-clients { al-salam; };

        zone "al-salam.loc" {
                type master;
                allow-transfer { none; };
                file "master/al-salam.loc";
        };

        zone "1.168.192.in-addr.arpa" {
                type master;
                allow-transfer { none; };
                file "master/1.168.192.in-addr.arpa";
        };
        zone "cluster.earlham.edu" {
                type master;
                allow-transfer { servers; };
                file "master/cluster.zone";
        };
        zone "234.28.159.IN-ADDR.ARPA" {
                type master;
                allow-transfer { servers; };
                file "master/159.28.234.zone";
        };

        zone "." {
                type hint;
                file "master/named.root";
        };
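};
  * hopper:/etc/namedb/master/al-salam.loc
    * copy from bobsced.loc, amend as necessary
  * hopper:/etc/namedb/master/1.168.192.in-addr.arpa
    * copy from 0.168.192.in-addr.arpa, amend as necessary
  * hopper:/etc/namedb/master/159.28.234.zone
150 IN  PTR as0.cluster.earlham.edu.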
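
Two details of the DNS setup above are easy to sanity-check by hand: which client addresses fall inside the al-salam acl (and so get the al-salam view), and which reverse name the `150 IN PTR` record resolves under. The sketch below uses only Python's standard `ipaddress` module. It mimics the two acl entries and is not how BIND evaluates address match lists internally; the function names are made up for this example.

```python
import ipaddress

# Networks copied from the named.conf acl above.
AL_SALAM_ACL = [
    ipaddress.ip_network("192.168.1.0/24"),     # Al-Salam internal network
    ipaddress.ip_network("159.28.234.150/32"),  # Al-Salam headnode
]

def in_al_salam_acl(address):
    """Return True if the client address falls inside either acl entry."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in AL_SALAM_ACL)

def ptr_name(address):
    """Return the reverse-lookup name under which a PTR record for address lives."""
    return ipaddress.ip_address(address).reverse_pointer

print(in_al_salam_acl("192.168.1.101"))  # True
print(in_al_salam_acl("159.28.234.1"))   # False
print(ptr_name("159.28.234.150"))        # 150.234.28.159.in-addr.arpa
```

The last line shows why the headnode's PTR record is written as `150` inside the 234.28.159.IN-ADDR.ARPA zone.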
  * hopper:/usr/local/etc/dhcpd.conf
        subnet 192.168.1.0 netmask 255.255.255.0 {

                option routers                  192.168.1.1;
                option subnet-mask              255.255.255.0;
                option domain-name              "al-salam.loc";
                option domain-name-servers      159.28.234.1;

                next-server                     159.28.234.17;
                filename "pxelinux.0";

                host as1.al-salam.loc { hardware ethernet 00:30:48:F2:99:DC; fixed-address 192.168.1.101; }
                host as2.al-salam.loc { hardware ethernet 00:30:48:F3:0D:32; fixed-address 192.168.1.102; }
                host as3.al-salam.loc { hardware ethernet 00:30:48:F2:99:DA; fixed-address 192.168.1.103; }
                host as4.al-salam.loc { hardware ethernet 00:30:48:F2:99:CC; fixed-address 192.168.1.104; }
                host as5.al-salam.loc { hardware ethernet 00:30:48:F2:99:C4; fixed-address 192.168.1.105; }
                host as6.al-salam.loc { hardware ethernet 00:30:48:F2:9A:06; fixed-address 192.168.1.106; }
                host as7.al-salam.loc { hardware ethernet 00:30:48:F3:0D:30; fixed-address 192.168.1.107; }
                host as8.al-salam.loc { hardware ethernet 00:30:48:F2:99:D6; fixed-address 192.168.1.108; }
                host as9.al-salam.loc { hardware ethernet 00:30:48:F2:99:C6; fixed-address 192.168.1.109; }
                host as10.al-salam.loc { hardware ethernet 00:30:48:F2:9A:0A; fixed-address 192.168.1.110; }
                host as11.al-salam.loc { hardware ethernet 00:30:48:F2:99:E0; fixed-address 192.168.1.111; }
                host as12.al-salam.loc { hardware ethernet 00:30:48:F2:99:A2; fixed-address 192.168.1.112; }
        }
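
The host entries above follow a regular pattern: node asN gets the fixed address 192.168.1.(100+N). A minimal sketch of generating those lines from the MAC list, rather than typing them by hand, might look like the following. `host_entry` is a hypothetical helper and its `domain` and `base` defaults are taken from the excerpt; this is not a tool the original notes used.

```python
# MAC addresses for as1..as12, copied from the dhcpd.conf excerpt above.
MACS = [
    "00:30:48:F2:99:DC", "00:30:48:F3:0D:32", "00:30:48:F2:99:DA",
    "00:30:48:F2:99:CC", "00:30:48:F2:99:C4", "00:30:48:F2:9A:06",
    "00:30:48:F3:0D:30", "00:30:48:F2:99:D6", "00:30:48:F2:99:C6",
    "00:30:48:F2:9A:0A", "00:30:48:F2:99:E0", "00:30:48:F2:99:A2",
]

def host_entry(index, mac, domain="al-salam.loc", base="192.168.1."):
    """Format one dhcpd.conf host line; node N is assigned address base + (100 + N)."""
    return (f"host as{index}.{domain} {{ hardware ethernet {mac}; "
            f"fixed-address {base}{100 + index}; }}")

# Print one host line per node, numbering from as1.
for i, mac in enumerate(MACS, start=1):
    print(host_entry(i, mac))
```

Keeping the MAC list in one place also makes it easy to spot a duplicated address before dhcpd is restarted.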
  * as0:/etc/sysconfig/dhcrelay
# Command line options here
INTERFACES="eth0 eth1"   # on layout both interfaces are required, originally only one was listed here
DHCPSERVERS="cluster.earlham.edu"
  * The extra bits on layout (all as root)
$ yum install -y dhcp
$ vi /etc/sysconfig/dhcrelay   # set the two variables below
INTERFACES="eth0 eth1"
DHCPSERVERS="cluster.earlham.edu"
$ iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
$ service iptables save
  * hopper:
 vi exports # add entries (check to make sure they aren't already covered by existing rules)
 vi hosts.allow # add entries

compute nodes

Latest Overarching Questions

Parts List

  1. Nodes - case, motherboard(s), power supply, CPU, RAM, GPGPU cards
  2. Switch - managed, cut-through
    1. Fitz: Having a hard time finding anyone who sells cut-through switches
      1. How about this store-and-forward switch from HP?
  3. Power distribution - rack-mount PDUs

Tentative Specifications

Budget

Nodes

Specialty Nodes

For teaching, we could expect to get significant use out of GPGPUs, but their production use would be limited. A more varied mix of architectures would itself be a bonus for education.

Network

Disk

OS

Quick breakdown

Nodes

{| class="wikitable"
! !! ION #61116 !! ION #61164 !! SM #174536 !! Newegg #1 !! Newegg #2 !! Intel List #1 !! AMD List #1 !! AMD List #2
|-
! CPU
| 72 2.4GHz Intel E5530 || 80 2.4GHz Intel E5530 || 80 2.4GHz Intel E5530 || 128 2.4GHz Intel E5530 || 112 2.4GHz Intel E5530 || 100 2.4GHz Intel E5530 || 156 2.0GHz AMD Opteron 2350 || 126 2.6GHz AMD Opteron 2435
|-
! RAM
| 108GB PC3-10600 || 120GB PC3-10600 || 120GB DDR3-1333 || 192GB DDR3-1333 || 168GB DDR3-1333 || 144GB DDR3-1333 || 160GB DDR2-800 || 120GB DDR2-800
|-
! GPU
| 2 Tesla C1060 || 2 Tesla C1060 || 2 Tesla C1060 || None || 4 Tesla C1060 || 2 Tesla C1060 || 2 Tesla C1060 || 2 Tesla C1060
|-
! Local disk
| Yes || Yes || Yes || Yes || Yes || Yes || Yes || Yes
|-
! Shared chassis
| No || Yes || Yes || No || No || No || No || No
|-
! Remote mgmt
| No || No || IPMI || No || IPMI on GPU nodes || IPMI || IPMI || IPMI
|-
! Size (just nodes)
| 9U || 6U || 6U || 16U || 12U || 12U || 20U || 10U
|-
! Price
| $33,173.20 || $33,054.30 || $30,078.00 || $32,910.56 || $34,696.78 || $35,846.00 || $35,275.00 || $33,755.00
|}
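
One rough way to compare the quotes above is price per core. The sketch below assumes the leading number in each CPU cell is the total core count (the table does not say so explicitly) and ignores GPUs, RAM, clock speed and remote management, so treat it as a first-pass comparison only.

```python
# (cores, price in USD) per quote, taken from the Nodes table above.
# Assumption: the leading number in each CPU cell is the total core count.
QUOTES = {
    "ION #61116": (72, 33173.20),
    "ION #61164": (80, 33054.30),
    "SM #174536": (80, 30078.00),
    "Newegg #1": (128, 32910.56),
    "Newegg #2": (112, 34696.78),
    "Intel List #1": (100, 35846.00),
    "AMD List #1": (156, 35275.00),
    "AMD List #2": (126, 33755.00),
}

def price_per_core(cores, price):
    """Return the quote price divided by its core count."""
    return price / cores

# Rank quotes from cheapest to most expensive per core.
ranked = sorted(QUOTES, key=lambda name: price_per_core(*QUOTES[name]))
for name in ranked:
    cores, price = QUOTES[name]
    print(f"{name:14s} {cores:4d} cores  ${price_per_core(cores, price):7.2f}/core")
```

On this metric alone AMD List #1 comes out lowest and ION #61116 highest, but the AMD lists use older DDR2 memory and different clock speeds, so the ranking is not a recommendation.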

Power distribution

{| class="wikitable"
! !! PDU1220 !! PDUMH20 !! AP9563 !! AP7801
|-
! Vendor
| TrippLite || TrippLite || APC || APC
|-
! Size
| 1U || 1U || 1U || 1U
|-
! Capabilities
| Dumb || Metered || Dumb || Metered
|-
! Input power
| 20A, 1x NEMA 5-20P || 20A, 1x NEMA L5-20P w/ NEMA 5-20P adapter || 20A, 1x NEMA 5-20P || 20A, 1x NEMA 5-20P
|-
! Output power
| 13x NEMA 5-20R || 12x NEMA 5-20R || 10x NEMA 5-20R || 8x NEMA 5-20R
|-
! Price
| $195 || $230 || $120 || $380
|}

ION Computer Systems Quotation #61116

ION Computer Systems Quotation #61164

Silicon Mechanics Quote #174536

Newegg Quote #1

Newegg Quote #2 (the one we purchased?)

Intel List #1

AMD List #1

AMD List #2
