Al-salam

Al-Salam is the working name for the Earlham Computer Science Department's upcoming cluster computer.

== Installation Notes ==

=== headnode ===

I'll be maintaining a script, <tt>/root/install/al-salam.sh</tt>, that will also serve as a log. Also following along with the [[Cluster: New BobSCEd Install Log#Head Node|BobSCEd-new logs]] for consistency between clusters.

==== TODO ====
* MPI
* Software installations into /cluster/al-salam
* User auth via bs0-new's ldap
* torque/maui
* Ganglia
* shorewall
* modules

==== Have done ====

* yum install:
 gcc.x86_64 gcc-c++.x86_64 gcc-gfortran.x86_64 \
 gcc44.x86_64 gcc44-c++.x86_64 gcc44-gfortran.x86_64 \
 apr.x86_64 apr-devel.x86_64 expat-devel.x86_64 \
 blas.x86_64 dhcp.x86_64
* rpm install:
** c3
** libconfuse
** libconfuse-devel
* /etc/c3.conf:
 cluster al-salam {
     as0.cluster.earlham.edu:as0.al-salam.loc
     as[1-12]
 }
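With c3.conf in place, the C3 tools give one-line fan-out to every compute node. A quick smoke test (a hedged sketch, assuming the C3 RPMs above put cexec/cpush on the path and root ssh to as1-as12 works):
<pre>
# Run a command on every node listed in /etc/c3.conf
cexec uptime
# Push a file to the same path on all nodes
cpush /etc/resolv.conf /etc/resolv.conf
</pre>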
* in hopper:/etc/rc.conf:
 static_routes="bs0 as0"
 route_as0="192.168.1.1 159.28.234.150"
* hopper:/etc/namedb/master/cluster.zone:
 as0.cluster.earlham.edu.      IN  A     159.28.234.150
 as.cluster.earlham.edu.       IN  CNAME as0
 al-salam.cluster.earlham.edu. IN  CNAME as0
* hopper:/etc/namedb/named.conf:
<pre>
acl al-salam {
        192.168.1.0/24; // Al-Salam internal network
        159.28.234.150; // Al-Salam headnode
};

view al-salam {
        match-clients { al-salam; };

        zone "al-salam.loc" {
                type master;
                allow-transfer { none; };
                file "master/al-salam.loc";
        };

        zone "1.168.192.in-addr.arpa" {
                type master;
                allow-transfer { none; };
                file "master/1.168.192.in-addr.arpa";
        };

        zone "cluster.earlham.edu" {
                type master;
                allow-transfer { servers; };
                file "master/cluster.zone";
        };

        zone "234.28.159.in-addr.arpa" {
                type master;
                allow-transfer { servers; };
                file "master/159.28.234.zone";
        };

        zone "." {
                type hint;
                file "master/named.root";
        };
};
</pre>
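Two hedged sanity checks for the new view (not from the original log; assumes <tt>dig</tt> is available and that the query reaches hopper, at the 159.28.234.1 resolver address handed out by DHCP below, with a source address in the <tt>al-salam</tt> ACL):
<pre>
# Forward lookup: should be answered from the al-salam view
dig @159.28.234.1 as1.al-salam.loc A +short
# Reverse lookup against the 1.168.192.in-addr.arpa zone
dig @159.28.234.1 -x 192.168.1.101 +short
</pre>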
* hopper:/etc/namedb/master/al-salam.loc
** copy from bobsced.loc, amend as necessary
* hopper:/etc/namedb/master/1.168.192.in-addr.arpa
** copy from 0.168.192.in-addr.arpa, amend as necessary
* hopper:/etc/namedb/master/159.28.234.zone:
 150 IN  PTR as0.cluster.earlham.edu.
* hopper:/usr/local/etc/dhcpd.conf:
<pre>
        subnet 192.168.1.0 netmask 255.255.255.0 {

                option routers                  192.168.1.1;
                option subnet-mask              255.255.255.0;
                option domain-name              "al-salam.loc";
                option domain-name-servers      159.28.234.1;

                next-server                     159.28.234.17;
                filename "pxelinux.0";

                host as1.al-salam.loc { hardware ethernet 00:30:48:F2:99:DC; fixed-address 192.168.1.101; }
                host as2.al-salam.loc { hardware ethernet 00:30:48:F3:0D:32; fixed-address 192.168.1.102; }
                host as3.al-salam.loc { hardware ethernet 00:30:48:F2:99:DA; fixed-address 192.168.1.103; }
                host as4.al-salam.loc { hardware ethernet 00:30:48:F2:99:CC; fixed-address 192.168.1.104; }
                host as5.al-salam.loc { hardware ethernet 00:30:48:F2:99:C4; fixed-address 192.168.1.105; }
                host as6.al-salam.loc { hardware ethernet 00:30:48:F2:9A:06; fixed-address 192.168.1.106; }
                host as7.al-salam.loc { hardware ethernet 00:30:48:F3:0D:30; fixed-address 192.168.1.107; }
                host as8.al-salam.loc { hardware ethernet 00:30:48:F2:99:D6; fixed-address 192.168.1.108; }
                host as9.al-salam.loc { hardware ethernet 00:30:48:F2:99:C6; fixed-address 192.168.1.109; }
                host as10.al-salam.loc { hardware ethernet 00:30:48:F2:9A:0A; fixed-address 192.168.1.110; }
                host as11.al-salam.loc { hardware ethernet 00:30:48:F2:99:E0; fixed-address 192.168.1.111; }
                host as12.al-salam.loc { hardware ethernet 00:30:48:F2:99:A2; fixed-address 192.168.1.112; }
        }
</pre>
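Worth a syntax check before bouncing dhcpd on hopper (standard ISC dhcpd flags; this step isn't in the original log):
<pre>
# Parse the config and report errors without touching the running daemon
dhcpd -t -cf /usr/local/etc/dhcpd.conf
</pre>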
* as0:/etc/sysconfig/dhcrelay:
 # Command line options here
 INTERFACES="eth0 eth1"  # on layout both interfaces are required; originally only one was listed here
 DHCPSERVERS="cluster.earlham.edu"
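To apply the relay settings (hedged: standard CentOS/RHEL service handling, not recorded in the log):
<pre>
chkconfig dhcrelay on      # enable at boot
service dhcrelay restart   # reread INTERFACES/DHCPSERVERS
</pre>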

* The extra bits on layout (all as root):
 $ yum install -y dhcp
 $ vi /etc/sysconfig/dhcrelay
 INTERFACES="eth0 eth1"
 DHCPSERVERS="cluster.earlham.edu"
 $ iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
 $ service iptables save
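The MASQUERADE rule only does its job if the kernel forwards packets between the interfaces; a hedged addition the log doesn't record (standard Linux sysctl):
<pre>
sysctl -w net.ipv4.ip_forward=1   # enable forwarding now
# and set net.ipv4.ip_forward = 1 in /etc/sysctl.conf so it survives reboot
</pre>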

* hopper:
 vi exports      # add entries (check to make sure they aren't already covered by existing rules)
 vi hosts.allow  # add entries
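Purely illustrative entries (hypothetical paths; the real lines depend on what hopper already exports and which services are wrapped), in FreeBSD syntax since hopper uses /etc/rc.conf and /usr/local/etc:
<pre>
# /etc/exports -- export a filesystem to the Al-Salam headnode (hypothetical path)
/cluster -maproot=root as0.cluster.earlham.edu
# /etc/hosts.allow -- admit the headnode's public address
ALL : 159.28.234.150 : allow
</pre>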

=== compute nodes ===
* clone from bs1-new using udpcast (sketch below)
* modify network, etc. settings from ''bobsced'' to ''al-salam''
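A minimal udpcast sketch (the device name and boot environment are assumptions, not from the log). udp-sender multicasts one image to every listening node at once, which is why it beats a dozen serial copies:
<pre>
# On bs1-new: serve the system disk as a multicast stream
udp-sender --file /dev/sda

# On each as* node (booted into a rescue/PXE environment): receive and write it
udp-receiver --file /dev/sda
</pre>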

== Latest Overarching Questions ==
* Should we build this machine ourselves?
*# Are we wasting our money and a learning opportunity by letting them do the building for us?
*# If it is cheaper, would it be a useful experience for the students this coming semester to take a large collection of hardware and make it into a cluster?
*# Yes, this was pretty clear from the email thread in early December.
* How much, if any, GPGPU hardware do we want? 0, 1, or 2 nodes' worth?
* Do we want a high-bandwidth/low-latency network?
** We do not. It is more expensive than it is worth.
* What software stack do we want to run? Vendor-supplied or the BCCD?
** Both: a vendor-supplied base with a BCCD virtual machine.
*** Will the virtual machine support CUDA?
* Do compute nodes have spinning disk?
** Compute nodes have a spinning disk. Solid state is still expensive.
* What's on the local persistent store? /tmp? An entire OS?
* Support
** Consider getting the cheapest hardware support; losing a node isn't critical as long as they send a replacement quickly.

== Parts List ==
# Nodes - case, motherboard(s), power supply, CPU, RAM, GPGPU cards
# Switch - managed, cut-through
## Fitz: Having a hard time finding anyone who sells cut-through switches
### How about [http://www.tigerdirect.com/applications/searchtools/item-Details.asp?sku=H24-J9021A&SRCCODE=CHANNELINC&cisrccode=cii_7240393&cpncode=20-3634289 this] store-and-forward switch from HP?
# Power distribution - rack-mount PDUs

== Tentative Specifications ==

=== Specialty Nodes ===
* Two nodes should support CUDA GPGPU

Educationally, we could expect to get significant use out of GPGPUs, but the production use is limited. Increasing the variety of architectures on hand would be a bonus for education.
=== Network ===
* Gigabit Ethernet fabric with switch

=== Disk ===
* Spinning Disk

=== OS ===
* Virtual BCCD on top of built-in OS.

== Quick breakdown ==

=== Nodes ===

{| class="wikitable" border="1"
!
! [[Al-salam#ION_Computer_Systems_Quotation_.2361116|ION #61116]]
! [[Al-salam#ION_Computer_Systems_Quotation_.2361164|ION #61164]]
! [[Al-salam#Silicon_Mechanics_Quote_.23174536|SM #174536]]
! [[Al-salam#Newegg_Quote_.231|Newegg #1]]
! [[Al-salam#Newegg_Quote_.232|Newegg #2]]
! [[Al-salam#Intel_List_.231|Intel List #1]]
! [[Al-salam#AMD_List_.231|AMD List #1]]
! [[Al-salam#AMD_List_.232|AMD List #2]]
|-
| '''CPU'''
| 72 2.4GHz Intel E5530
| 80 2.4GHz Intel E5530
| 80 2.4GHz Intel E5530
| 128 2.4GHz Intel E5530
| 112 2.4GHz Intel E5530
| 100 2.4GHz Intel E5530
| 156 2.0GHz [http://www.newegg.com/Product/Product.aspx?Item=N82E16819105189 AMD Opteron 2350]
| 126 2.6GHz [http://www.newegg.com/Product/Product.aspx?Item=N82E16819105189 AMD Opteron 2435]
|-
| '''RAM'''
| 108GB PC3-10600
| 120GB PC3-10600
| 120GB DDR3-1333
| 192GB DDR3-1333
| 168GB DDR3-1333
| 144GB DDR3-1333
| 160GB DDR2-800
| 120GB DDR2-800
|-
| '''GPU'''
| 2 Tesla C1060
| 2 Tesla C1060
| 2 Tesla C1060
| None
| 4 Tesla C1060
| 2 Tesla C1060
| 2 Tesla C1060
| 2 Tesla C1060
|-
| '''Local disk'''
| Yes
| Yes
| Yes
| Yes
| Yes
| Yes
| Yes
| Yes
|-
| '''Shared chassis'''
| No
| Yes
| Yes
| No
| No
| No
| No
| No
|-
| '''Remote mgmt'''
| No
| No
| IPMI
| No
| IPMI on GPU nodes
| IPMI
| IPMI
| IPMI
|-
| '''Size (just nodes)'''
| 9U
| 6U
| 6U
| 16U
| 12U
| 12U
| 20U
| 10U
|-
| '''Price'''
| $33,173.20
| $33,054.30
| $30,078.00
| $32,910.56
| $34,696.78
| $35,846.00
| $35,275.00
| $33,755.00
|}

=== Power distribution ===

{| class="wikitable" border="1"
!
! [http://www.tripplite.com/en/products/model.cfm?txtSeriesID=446&EID=14295&txtModelID=2005 PDU1220]
! [http://www.tripplite.com/en/products/model.cfm?txtSeriesID=513&EID=77300&txtModelID=3867 PDUMH20]
! [http://accessories.us.dell.com/sna/products/Server_Network/productdetail.aspx?c=us&l=en&cs=04&sku=A0151958 AP9563]
! [http://accessories.us.dell.com/sna/products/Server_Network/productdetail.aspx?c=us&l=en&s=bsd&cs=04&sku=A0748689 AP7801]
|-
| '''Vendor'''
| TrippLite
| TrippLite
| APC
| APC
|-
| '''Size'''
| 1U
| 1U
| 1U
| 1U
|-
| '''Capabilities'''
| Dumb
| Metered
| Dumb
| Metered
|-
| '''Input power'''
| 20A, 1x NEMA 5-20P
| 20A, 1x NEMA L5-20P w/ NEMA 5-20P adapter
| 20A, 1x NEMA 5-20P
| 20A, 1x NEMA 5-20P
|-
| '''Output power'''
| 13x NEMA 5-20R
| 12x NEMA 5-20R
| 10x NEMA 5-20R
| 8x NEMA 5-20R
|-
| '''Price'''
| $195
| $230
| $120
| $380
|}

== ION Computer Systems Quotation #61116 ==
** Seagate SV35.3 250GB, 7200RPM, SATA 3Gb for SDVR 3.5" Disk
** Dual Intel Gigabit Server NICs with IOAT2 Integrated

* Networking Fabric
** Network not included
* Other stuff

== ION Computer Systems Quotation #61164 ==
* 2 ION G10 Server with GPU: $4,972.00 each
** (2) Intel® Quad-Core Xeon® processor E5530 (2.40GHz, 8MB Cache, 5.86GT/s, 80W)
** 12GB RAM [Bank 1 of 2: (6) 2GB ECC PC3-10600 1333MHz 2rank DDR3 RDIMM Modules][Smart]
** Dual Intel Gigabit Server NICs with IOAT2 Integrated
* 4 ION T11 DualNode: $6,477.00 each
** (2x2) Intel® Quad-Core Xeon® processor E5530 (2.40GHz, 8MB Cache, 5.86GT/s, 80W)
** Total memory: 12GB DDR3_1333 per node
** Dual Intel Gigabit Server NICs with IOAT2 Integrated
** These nodes are modular. One can be unplugged and worked on while the others remain running.

* Network
** Network Not Included
* Other stuff

== Silicon Mechanics Quote #174536 ==
* 2x Rackform iServ R4410: $11,043.00 each ($10,601.00 each with education) [http://www.siliconmechanics.com/quotes/174536?confirmation=879366549 link]
** Shared Chassis: The following chassis resources are shared by all 4 compute nodes
** External Optical Drive: No Item Selected
*** Hot-Swap Drive - 1: 250GB Western Digital RE3 (3.0Gb/s, 7.2Krpm, 16MB Cache) SATA
* 2x Rackform iServ R350-GPU: $5,196.00 each ($4,433.00 each with education) [http://www.siliconmechanics.com/quotes/174542?confirmation=712641984 link]
** CPU: 2 x Intel Xeon E5530 Quad-Core 2.40GHz, 8MB Cache, 5.86GT/s QPI
** RAM: 12GB (6 x 2GB) Operating at 1333MHz Max (DDR3-1333 ECC Unbuffered DIMMs)
** Power Supply: 1400W Power Supply with PMBus - 80 PLUS Gold Certified
** Rail Kit: 1U Rail Kit

* Price Tag: $32,478 ($30,078 with education)

* Questions
** Can we lose the hot-swappability to save money?
** Do we need to get a Gig-Switch?
*** Would Cairo do?
==Newegg Quote #1==
 +
* 16x [http://secure.newegg.com/WishList/PublicWishDetail.aspx?WishListNumber=16958187 Newegg list]
 +
** 1U [http://www.newegg.com/Product/Product.aspx?item=N82E16811116011 link]
 +
** 2x Intel Xeon (Nehalem) E5530, Quad-Core, 2.4GHz, 80 Watt [http://www.newegg.com/Product/Product.aspx?item=N82E16819117184 link]
 +
** Slim CD/DVD Drive
 +
** 4x Gigabit ethernet [http://www.newegg.com/Product/Product.aspx?item=N82E16813151195R motherboard]
 +
** 500W non-redundant power supply
 +
** 160GB 7200RPM Seagate [http://www.newegg.com/Product/Product.aspx?item=N82E16822148511 link]
 +
** 12GB RAM (240-pin DDR3 1333 ECC, unbuffered)
 +
 +
* Price Tag: $32,910.56
 +
 +
==Newegg Quote #2 (the one we purchased?) ==
 +
 +
* 2x [http://secure.newegg.com/WishList/PublicWishDetail.aspx?WishListNumber=8560589 Newegg list]
 +
** 1U
 +
** 2x Intel Xeon (Nehalem) E5530, Quad-Core, 2.4GHz, 80 Watt
 +
** 2x Gigabit ethernet
 +
** IPMI
 +
** 1400W non-redundant power supply
 +
** 2x C1060 Tesla
 +
** 160GB 7200RPM Seagate
 +
** 12GB RAM (240-pin DDR3 1333 ECC, unbuffered)
 +
 +
* 12x [http://secure.newegg.com/WishList/PublicWishDetail.aspx?WishListNumber=16958187 Newegg list]
 +
** 1U [http://www.newegg.com/Product/Product.aspx?item=N82E16811116011 link]
 +
** 2x Intel Xeon (Nehalem) E5530, Quad-Core, 2.4GHz, 80 Watt
 +
** Slim CD/DVD Drive
 +
** 4x Gigabit ethernet
 +
** 500W non-redundant power supply
 +
** 160GB 7200RPM Seagate
 +
** 12GB RAM (240-pin DDR3 1333 ECC, unbuffered)
 +
 +
* Price tag: $34,696.78
 +
 +
==Intel List #1==
 +
* 13x Chassis + mainboard: http://www.provantage.com/supermicro-sys-6016t-gtf~7SUP91FA.htm
 +
* 25x CPU: http://www.newegg.com/Product/Product.aspx?Item=N82E16819117184
 +
* 39x RAM: http://www.newegg.com/Product/Product.aspx?Item=N82E16820139041
 +
* 14x HDD: http://www.newegg.com/Product/Product.aspx?Item=N82E16822136280
 +
* 2x Tesla card: http://www.tigerdirect.com/applications/searchtools/item-details.asp?EdpNo=4259469&SRCCODE=GOOGLEBASE&cm_mmc_o=VRqCjC7BBTkwCjCECjCE
 +
* Notes:
 +
** ~$2600/node without Tesla
 +
** ~$3850/node with Tesla
 +
** 1.5GB RAM/core
 +
** 2 dies/node (8 cores/node)
 +
** Yes IPMI
 +
** 1 headnode (compute node - one die + one HDD) + 11 compute nodes + 2 Tesla nodes ~= $35,846
 +
 +
==AMD List #1==
 +
* 20x Chassis: http://www.newegg.com/Product/Product.aspx?Item=N82E16811152128
 +
* 20x Mainboard: http://www.newegg.com/Product/Product.aspx?Item=N82E16813182108
 +
* 39x CPU: http://www.newegg.com/Product/Product.aspx?Item=N82E16819105189
 +
* 40x RAM: http://www.newegg.com/Product/Product.aspx?Item=N82E16820134936
 +
* 21x HDD: http://www.newegg.com/Product/Product.aspx?Item=N82E16822136280
 +
* 2x Tesla: http://www.tigerdirect.com/applications/searchtools/item-details.asp?EdpNo=4259469&SRCCODE=GOOGLEBASE&cm_mmc_o=VRqCjC7BBTkwCjCECjCE
 +
* Notes:
 +
** $1650/node without Tesla
 +
** $2900/node with Tesla
 +
** 1G RAM/core
 +
** 2 dies/node (8 cores/node)
 +
** Yes IMPI
 +
** 1 headnode (compute node - one die + one HDD) + 17 compute nodes + 2 Tesla nodes ~= $35,275
 +
 +
==AMD List #2==
 +
* 10x Chassis: http://www.newegg.com/Product/Product.aspx?Item=N82E16811152128
 +
* 10x Mainboard: http://www.newegg.com/Product/Product.aspx?Item=N82E16813182108
 +
* 19x CPU: http://www.newegg.com/Product/Product.aspx?Item=N82E16819105189
 +
* 30x RAM: http://www.newegg.com/Product/Product.aspx?Item=N82E16820134936
 +
* 11x HDD: http://www.newegg.com/Product/Product.aspx?Item=N82E16822136280
 +
* 2x Tesla: http://www.tigerdirect.com/applications/searchtools/item-details.asp?EdpNo=4259469&SRCCODE=GOOGLEBASE&cm_mmc_o=VRqCjC7BBTkwCjCECjCE
 +
* Notes:
 +
** $3130/node without Tesla
 +
** $4350/node with Tesla
 +
** 1G RAM/core
 +
** 2 dies/node (12 cores/node)
 +
** Yes IMPI
 +
** 1 headnode (compute node - one die + one HDD) + 7 compute nodes + 2 Tesla nodes ~= $33,755
