Ccg-admin

From Earlham Cluster Department

== Building software packages ==

* Download the source tarball into /root/
* Unpack, build with ./configure, etc. Make sure you set the --prefix option to --prefix=/mounts/[machine]/software/[package-name]/[package-version] (see the sketch below).

===Needs to be Updated===
* Make a <tt><package>-<version>.config.sh</tt> script that runs ./configure with all your options (so that it is kept around in case we need to reinstall).
* Run your config.sh and continue building/installing as normal.
* Create a soft link from <tt>/mounts/al-salam/software/<package></tt> to <tt>/mounts/al-salam/software/<package>-<version></tt>.
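A minimal sketch of the flow, assuming a hypothetical package "example" version 1.2.3 being built on a machine named layout:

<pre class="text"># example-1.2.3.config.sh  (keep this around for future reinstalls)
./configure --prefix=/mounts/layout/software/example/1.2.3

# build and install
sh example-1.2.3.config.sh
make
make install
</pre>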
 
= Rebuilding and Creating RAID 1 Arrays with mdadm =

== Creating Arrays ==
To create a mirrored array with two drives, <code>sda</code> and <code>sdb</code>, on partitions <code>sda1</code> and <code>sdb1</code>:

<pre class="text">mdadm --create --verbose /dev/md0 --level=raid1 --raid-devices=2 /dev/sda1 /dev/sdb1
</pre>

You can monitor the status of the array build with:

<pre class="text">cat /proc/mdstat
</pre>

Once finished, save your <code>mdadm</code> configuration with:

<pre class="text">mdadm --verbose --detail --scan &gt; /etc/mdadm.conf
</pre>

You may need to edit this file to remove unwanted lines, or to add an email address to <code>MAILADDR</code> to be notified if a drive failure occurs:

<pre class="text">MAILADDR user1@dom1.com, user2@dom2.com
</pre>

On some systems, <code>mdadm</code>'s configuration file is <code>/etc/mdadm/mdadm.conf</code>; it is very important to put the configuration in the correct location.
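As a hedged illustration only (the UUID below is made up and the e-mail address is a placeholder), the resulting <code>/etc/mdadm.conf</code> might end up looking something like:

<pre class="text">ARRAY /dev/md0 level=raid1 num-devices=2 metadata=1.0 UUID=6c914401:48bc2de4:9e0b1a52:3f7d0c11
   devices=/dev/sda1,/dev/sdb1
MAILADDR user1@dom1.com
</pre>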
== Rebuilding Arrays ==

If a drive ever fails, or the system is booted with a drive removed, you will need to add it back into the array.

=== Failed Drive ===

In this example, <code>/dev/sda1</code> and <code>/dev/sdb1</code> make up the RAID 1 array <code>/dev/md0</code>. Let us say that <code>/dev/sdb</code> fails.

==== Determining the failed drive ====

Run

<pre class="text">cat /proc/mdstat
</pre>
<pre class="text">[root@lo4 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[1](F) sda1[0]
      204736 blocks super 1.0 [2/1] [U_]

unused devices: &lt;none&gt;
</pre>

When a drive fails or is missing, you will see an underscore in the array output (<code>[U_]</code> instead of <code>[UU]</code>), and <code>(F)</code> will be displayed next to the failed drive (<code>sdb1[1](F)</code>).<br /> If not, running <code>lsblk</code> or <code>fdisk -l</code> may help you determine which drive failed.

<pre class="text">hdparm -I /dev/sda | grep &quot;Serial Number&quot;
</pre>

will give you the serial number of <code>/dev/sda</code>, which may help you identify physical disks as well.
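A hedged extra check (not in the original notes): <code>mdadm --detail</code> prints the state of the array and of each member device, which can make it easier to see which disk is marked faulty.

<pre class="text">mdadm --detail /dev/md0
</pre>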
==== Remove Failed Drive ====

If a drive has failed, it should be removed from the <code>mdadm</code> array before being replaced.

<pre class="text">mdadm --manage /dev/md0 --fail /dev/sdb1
</pre>
<pre class="text">[root@lo4 ~]# mdadm --manage /dev/md0 --fail /dev/sdb1
mdadm: set /dev/sdb1 faulty in /dev/md0
</pre>

Now we can remove it from the array.

<pre class="text">mdadm --manage /dev/md0 --remove /dev/sdb1
</pre>
<pre class="text">[root@lo4 ~]# mdadm --manage /dev/md0 --remove /dev/sdb1
mdadm: hot removed /dev/sdb1 from /dev/md0
</pre>

Check <code>/proc/mdstat</code>. There should no longer be any <code>(F)</code>, or any listed drive besides <code>sda1[0]</code>.

Power down the system.

<pre class="text">shutdown -h now
</pre>
==== Replace Drive ====

Now that everything is powered down, remove the failed HDD and replace it with the new one.

Once the drive is replaced, boot the system back up.

==== Add New Drive to Array ====

Recreate the partitioning scheme of <code>/dev/sda</code> on the new drive.

<pre class="text">sfdisk -d /dev/sda | sfdisk /dev/sdb
</pre>

Then verify with <code>lsblk</code> or <code>fdisk -l</code>.

<pre class="text">mdadm --manage /dev/md0 --add /dev/sdb1
</pre>
<pre class="text">[root@lo4 ~]# mdadm --manage /dev/md0 --add /dev/sdb1
mdadm: added /dev/sdb1
</pre>

Finally, check the status of the rebuild with

<pre class="text">cat /proc/mdstat
</pre>
<pre class="text">[root@lo4 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      204736 blocks super 1.0 [2/1] [U_]
      [===========&gt;.........]  recovery = 57.7% (118400/204736) finish=0.0min speed=118400K/sec

unused devices: &lt;none&gt;
</pre>
=== Missing Drive ===

In this example, <code>/dev/sda1</code> and <code>/dev/sdb1</code> make up the RAID 1 array <code>/dev/md0</code>. Let us say that <code>/dev/sdb1</code> is missing.

Use <code>lsblk</code> to examine HDD partitions with block sizes. Alternatively, you can use <code>fdisk -l</code> or any other utility you prefer.

Now check the status of <code>mdadm</code> with:

<pre class="text">cat /proc/mdstat
</pre>
<pre class="text">Personalities : [raid1]
md0 : active raid1 sdb1[1](F) sda1[0]
      204736 blocks super 1.0 [2/1] [U_]

unused devices: &lt;none&gt;
</pre>

When a drive fails or is missing, you will see an underscore in the array output (<code>[U_]</code> instead of <code>[UU]</code>).

Use the output from <code>lsblk</code> and <code>/proc/mdstat</code> to match the present drive in an active <code>mdadm</code> array (<code>/dev/md0</code>) with the corresponding partition on the missing drive. For example, &quot;match&quot; <code>/dev/sda1</code> with <code>/dev/sdb1</code> (after verifying their block sizes are the same).

Now add <code>/dev/sdb1</code> back into the array:

<pre class="text">mdadm --manage /dev/md0 --add /dev/sdb1
</pre>
<pre class="text">[root@lo4 ~]# mdadm --manage /dev/md0 --add /dev/sdb1
mdadm: added /dev/sdb1
</pre>

You can view the status of the rebuilding array with:

<pre class="text">cat /proc/mdstat
</pre>
<pre class="text">[root@lo4 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      204736 blocks super 1.0 [2/1] [U_]
      [===========&gt;.........]  recovery = 57.7% (118400/204736) finish=0.0min speed=118400K/sec

unused devices: &lt;none&gt;
</pre>

== References and Resources ==

* http://www.howtoforge.com/replacing_hard_disks_in_a_raid1_array
* http://ubuntuforums.org/showthread.php?t=1760217
* https://raid.wiki.kernel.org/index.php/RAID_setup#RAID-1
* http://unix.stackexchange.com/questions/80501/no-etc-mdadm-conf-in-centos-6
= Torque PBS =

== Modifying <code>pbs_server</code> Configuration ==

* Backup the old <code>qmgr</code> / <code>pbs_server</code> configuration.

<pre class="text">  qmgr -c 'print server' &gt; qmgr_pbs_server.backup
</pre>

Note: you can simply list the <code>qmgr</code> <code>pbs_server</code> configuration with <code>qmgr -c 'p s'</code>.

=== Modify <code>qmgr</code> Server Variable ===

An example of modifying a server variable for <code>pbs_server</code> with <code>qmgr</code>:
* Unset <code>acl_hosts</code> and reset it to only <code>headnode.hostname</code>.

<pre class="text">  $ qmgr
  $ unset server acl_hosts
  $ set server acl_hosts = headnode.hostname
</pre>
== Restarting <code>pbs_server</code> ==

Sometimes you need to make a change to the <code>pbs_server</code> configuration (or add a new node).

The following shuts down <code>pbs_server</code> without killing jobs:

<pre class="text">  $ qterm -t quick
  $ pbs_server
</pre>
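For reference, a hedged sketch of adding a compute node: list it in <code>/var/spool/torque/server_priv/nodes</code> (the node name and core count below are placeholders), then restart <code>pbs_server</code> as shown above.

<pre class="text"># /var/spool/torque/server_priv/nodes
as1 np=8
</pre>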
== Running the Head Node as a Compute Node ==

Make sure the correct hostname (local to the compute nodes) is specified in <code>/var/spool/torque/server_priv/nodes</code> and <code>/var/spool/torque/mom_priv/config</code>.

When running <code>pbs_mom</code> from the head node, it may be necessary to specify the local hostname with:

<pre class="text">  pbs_mom -H headnode.hostname
</pre>

If you are running <code>pbs</code> as a service, you may also need to modify the init script for <code>pbs_mom</code>.

== Starting <code>pbs_mom</code> at boot (<code>pbs</code> as a Service) ==

Copy the <code>pbs_mom</code> init script into <code>/etc/init.d/</code>.

To find where the <code>pbs_mom</code> init script is located, use the <code>locate</code> command.

<pre class="text">$ locate /init.d/pbs_mom

/mounts/layout/software/torque/torque-4.1.0/contrib/init.d/pbs_mom
/mounts/layout/software/torque/torque-4.1.0/contrib/init.d/pbs_mom.in
</pre>
<pre class="text">  cp /path/to/contrib/init.d/pbs_mom /etc/init.d/pbs_mom
</pre>
Add to <code>chkconfig</code>:

<pre class="text">  $ chkconfig --add pbs_mom
  $ chkconfig pbs_mom on
</pre>

=== Custom init script for <code>pbs_mom</code> ===

You can make a copy of <code>/etc/init.d/pbs_mom</code> called something like <code>my_pbs_mom</code> in order to specify your own <code>pbs_mom</code> flags, for example if you need to specify the local hostname with <code>-H headnode.hostname</code>. If you do this, the above <code>chkconfig</code> commands should be issued with <code>my_pbs_mom</code> instead of <code>pbs_mom</code>.

Note: init scripts should have permissions 0755.
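Putting that together, a sketch for the custom-script case (<code>my_pbs_mom</code> is the copy described above):

<pre class="text">cp /etc/init.d/pbs_mom /etc/init.d/my_pbs_mom
# edit my_pbs_mom so that it starts pbs_mom with -H headnode.hostname
chmod 0755 /etc/init.d/my_pbs_mom
chkconfig --add my_pbs_mom
chkconfig my_pbs_mom on
</pre>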
= Infiniband =

== Installing ==

=== Mellanox vs. RedHat Open Fabrics distributions (OFED) ===

You can either get the required Infiniband packages from the RHEL package manager, or directly from Mellanox.

When making your choice, keep in mind the following:

<blockquote>We had oddities with our IB network until we started using the Mellanox OFED. One of the joys of OFED as an industry standard is that every IB vendor has their own perversion of it. What makes it especially frustrating is that RHEL/CentOS ship their own OFED and disentangling them in an automated way can be challenging. Mellanox OFED will uninstall RHEL OFED during its installation, but woe be unto the one who tries to do a &quot;yum upgrade&quot; at some point in the future.

-Skylar Thompson
</blockquote>

We opted for installing the latest '''Mellanox OFED'''. Mellanox OFED will remove a previous installation of RHEL OFED. After installation we have to separate the Mellanox OFED <code>"infiniband support"</code> yum group as its own separate entity, as <code>yum upgrade</code> will cause the files to be overwritten by RHEL's packages.

We can do this by editing the <code>/etc/yum.conf</code> file and adding an exclusion. View the packages in the yum group "Infiniband Support" using the command <pre class="text">yum groupinfo "Infiniband Support"</pre> Using a text editor, open up <code>/etc/yum.conf</code> and add the line <pre class="text">exclude=</pre> followed by all packages you don't want upgraded (found in the Infiniband Support group list). No spaces should be added; use commas to separate the different packages.
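A sketch of what the resulting exclusion might look like; the package names here are placeholders, so use the names reported by <code>yum groupinfo</code>:

<pre class="text"># /etc/yum.conf
exclude=libibverbs*,librdmacm*,opensm*,infiniband-diags*
</pre>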
Once the appropriate [http://www.mellanox.com/page/products_dyn?product_family=26 CentOS ISO] is downloaded, execute the following:

<pre class="text">tar -xvzf /path/to/MLNX.tgz
cd MLNX
cd MLNX_OFED_LINUX-2.4-1.0.0-rhel6.5-x86_64/
ls
./mlnxofedinstall [OPTIONS]
</pre>

Then reboot for good measure.
== IPoIB vs. native IB, or NFS / RDMA ==

IPoIB implements a TCP/IP layer on top of Infiniband and adds the Host Channel Adapter (HCA) as a Network Interface Card (NIC) to the system (e.g. ib0).

Using Infiniband &quot;natively&quot; with NFS / RDMA potentially allows for sending messages (packets) with greater bandwidth and significantly less CPU usage / involvement, as long as you have RDMA-compatible hardware of course.
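As a hedged sketch based on the kernel nfs-rdma documentation listed in the References (the server name and export path are placeholders), an RDMA-backed NFS mount looks like:

<pre class="text">mount -o rdma,port=20049 lo0.layout.ib:/scratch /scratch
</pre>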
=== Unreliable Datagram vs. Connected Mode ===

I have read that connected mode is comparable to using jumbo frames (thus favorable), but recently it seems datagram has become more stable and is preferred. In any case you can switch between modes at run-time with:

<pre class="text">echo datagram &gt; /sys/class/net/ibX/mode
echo connected &gt; /sys/class/net/ibX/mode
</pre>
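To see which mode an interface is currently using, read the same sysfs file:

<pre class="text">cat /sys/class/net/ibX/mode
</pre>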
=== Set up IPoIB ===

Installing the Mellanox Infiniband drivers with the <code>--all</code> flag should configure much of <code>IPoIB</code> already. There is a network configuration file in <code>/etc/sysconfig/network-scripts/ifcfg-ib0</code>.

You can configure <code>IPoIB</code> to use its own static IP address, or use the network configuration for an existing Ethernet configuration.

Here is an example <code>ifcfg-ib&lt;n&gt;</code> taken from the Mellanox OFED user manual (see References):

<pre class="text"># Static settings; all values provided by this file
IPADDR_ib0=11.4.3.175
NETMASK_ib0=255.255.0.0
NETWORK_ib0=11.4.0.0
BROADCAST_ib0=11.4.255.255
ONBOOT_ib0=1
# Based on eth0; each '*' will be replaced with a corresponding octet
# from eth0.
LAN_INTERFACE_ib0=eth0
IPADDR_ib0=11.4.'*'.'*'
NETMASK_ib0=255.255.0.0
NETWORK_ib0=11.4.0.0
BROADCAST_ib0=11.4.255.255
ONBOOT_ib0=1
# Based on the first eth&lt;n&gt; interface that is found (for n=0,1,...);
# each '*' will be replaced with a corresponding octet from eth&lt;n&gt;.
LAN_INTERFACE_ib0=
IPADDR_ib0=11.4.'*'.'*'
NETMASK_ib0=255.255.0.0
NETWORK_ib0=11.4.0.0
BROADCAST_ib0=11.4.255.255
ONBOOT_ib0=1
</pre>
== Subnet Manager (OpenSM) ==

=== OpenSM setup ===

If your Infiniband switch does not support a subnet manager on the hardware, you will need to set up opensm to be run by the head node.

Upon installation the opensm daemon will be found in <code>/etc/init.d/opensmd</code>. In order to streamline things, add the daemon (as well as any others not found in services) to your services using: <pre class="text">complete -W "$(ls /etc/init.d/)" service</pre>

Next, attempt to start opensm by using <pre class="text">service opensmd start</pre>

Make sure that opensmd is set to start on boot-up. Check with <pre class="text">chkconfig --list opensmd</pre> and enable it with <pre class="text">chkconfig opensmd on</pre>

=== Troubleshooting ===

Should the service not start for any reason, use <code>lsmod | grep ^ib</code> to check which Infiniband modules are loaded. Here is an example of what you should see:

<pre class="text">
ib_ucm                12120  0
ib_ipoib              122881  0
ib_cm                  42214  3 ib_ucm,rdma_cm,ib_ipoib
ib_uverbs              61976  2 rdma_ucm,ib_ucm
ib_umad                12562  0
ib_sa                  35753  5 rdma_ucm,rdma_cm,ib_ipoib,ib_cm,mlx4_ib
ib_mad                43632  4 ib_cm,ib_umad,mlx4_ib,ib_sa
ib_core              117605  12 rdma_ucm,ib_ucm,rdma_cm,iw_cm,ib_ipoib,ib_cm,ib_uverbs,ib_umad,mlx5_ib,mlx4_ib,ib_sa,ib_mad
ib_addr                7796  3 rdma_cm,ib_uverbs,ib_core
</pre>

I found that the <code>ib_umad</code> module is directly related to opensm. If it or any other modules aren't loaded, you will need to add them to the <code>rc.modules</code> file:

<pre class="text">echo modprobe "module name" >> /etc/rc.modules</pre> and then update permissions <pre class="text">chmod +x /etc/rc.modules</pre>

Example:

<pre class="text">
echo modprobe ib_umad >> /etc/rc.modules
chmod +x /etc/rc.modules
</pre>
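As a quick check (not from the original notes, and the values below are illustrative), the <code>sminfo</code> utility from infiniband-diags reports the LID, GUID, and state of the master subnet manager once one is running:

<pre class="text">sminfo
sminfo: sm lid 1 sm guid 0x0002c90300a1b2c3, activity count 12345 priority 15 state 3 SMINFO_MASTER
</pre>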
=== Subnet Manager Failover ===

<blockquote>Setting up failover for opensm isn't challenging, but it is good to document which nodes are the subnet managers, as the behavior of the network will be strange without any of the managers running. We discovered that with our GPFS cluster when we accidentally rebooted both managers at the same time - no nodes could join the network, including the subnet managers, until we took some manual action.

-Skylar Thompson
</blockquote>

Failover is necessary when running a subnet manager (SM) on your Infiniband machines (rather than on a switch). Essentially, failover is a configuration that ensures that if one machine goes down, there is guaranteed to be an SM running on another machine. With Infiniband, you need an SM to be active, otherwise the machines will not be able to communicate with each other.
== Switch Configuration ==

=== Initial Setup ===

Certain Infiniband switches can run a subnet manager. This is ideal, and in this situation failover is not necessary. To configure our switch, the Mellanox SX6018, you need to connect the console port to the serial port of an Infiniband machine. Next, install and run the serial terminal program <code>minicom</code> and log in with username: <code>admin</code> and password: <code>admin</code>. Go through the configuration wizard (the defaults are fine). We did not enable IPv6.

Installing <code>minicom</code>:

<pre class="text">yum install minicom
</pre>

And set the port to <code>/dev/ttyS0</code>:

<pre class="text">minicom -s
</pre>

Running the switch setup wizard: run <code>minicom</code> and log in, then run the following commands (the commands are what follows the <code>&gt;</code> or <code>#</code> prompt).

<pre class="text">switch &gt; enable
switch # configure terminal
switch (config) # jump-start
</pre>

From the Mellanox switch manual:

<blockquote>Before attempting a remote (for example, SSH) connection to the switch, check the <code>mgmt0</code> interface configuration. Specifically, verify the existence of an IP address. To check the current mgmt0 configuration, enter the following commands:
</blockquote>

Note that the commands start after the <code>&gt;</code> or <code>#</code>.

<pre class="text">switch &gt; enable
switch # configure terminal
switch (config) # show interfaces mgmt0
</pre>
=== Enabling / Running the Subnet Manager ===

You can enable, manage, configure, and run the subnet manager (along with many other things) through the Mellanox switch web interface control (management) panel. However, if you don't want to bother with getting it working, you can simply enable the subnet manager straight from a logged-in <code>minicom</code> switch prompt.

Again, the commands start after the <code>&gt;</code> and <code>#</code>.

<pre class="text">switch &gt; enable
switch # configure terminal
switch (config) # ib sm
</pre>
== OpenMPI testing ==

We tested OpenMPI using a prime number generator script found in <code>/cluster/home/charliep/cvs-hopper/primes/</code>. We ran <code>primes_batch</code> with <code>mpirun</code>, specifying the desired number of machines using a <code>machinefile</code> / <code>hostfile</code>.

=== Creating a Machine / Hosts File ===

A machinefile, or hostfile, lists information about the nodes for <code>mpirun</code> to use. You should be able to <code>make</code> the program, create an appropriate hostfile, and run it on the connected machines over Infiniband.

<pre class="text">make primes_batch
mpirun -np 4 --hostfile hostfile primes_batch
</pre>
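A minimal hostfile sketch (the node names and slot counts are placeholders; list your own machines):

<pre class="text"># hostfile
as0 slots=4
as1 slots=4
</pre>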
== General testing ==

The OFED comes with loads of testing programs.

* <code>ibping</code>
* <code>ibdiagnet</code>
* <code>ibstatus</code>
* <code>ibstat</code>
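For example, a hedged sketch of using <code>ibping</code> between two nodes (the port GUID is a placeholder; read the real one from <code>ibstat</code> on the server node):

<pre class="text"># on the server node
ibping -S

# on the client node
ibping -G 0x0002c90300a1b2c3
</pre>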
== Testing with CHARMM ==
== References ==

* http://www.mellanox.com/page/products_dyn?product_family=26
* http://www.mellanox.com/related-docs/prod_software/Mellanox_OFED_Linux_User_Manual_v2.2-1.0.1.pdf
* http://www.shocksolution.com/2012/12/installing-and-configuring-infiniband-on-a-red-hat-system/
* https://access.redhat.com/solutions/301643
* https://niktips.wordpress.com/2011/02/02/activating-infiniband-stack-in-linux/
* https://software.intel.com/en-us/articles/understanding-the-infiniband-subnet-manager/
* https://docs.oracle.com/cd/E19802-01/820-2189-10/ib-nem-sw-overview.html
* http://people.redhat.com/dledford/infiniband_get_started.html
* http://pkg-ofed.alioth.debian.org/howto/infiniband-howto.html
* https://www.kernel.org/doc/Documentation/infiniband/ipoib.txt
* http://www.mellanox.com/pdf/whitepapers/InfiniBandFAQ_FQ_100.pdf
* http://www.mcs.anl.gov/~balaji/pubs/2010/ispass/ispass10.ipoib.pdf
* http://www.bctes.com/nat-linux-iptables.html
* http://www.mellanox.com/page/products_dyn?product_family=150&mtag=sx6015_sx6018
* https://thegeekinthecorner.wordpress.com/category/infiniband-verbs-rdma/
* http://www.mellanox.com/related-docs/user_manuals/SX60XX_User_Manual.pdf
* http://www.cyberciti.biz/tips/connect-soekris-single-board-computer-using-minicom.html
* http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi/linux/bks/SGI_Admin/books/ICEX_Admin_Guide/sgi_html/ch04.html#Z1226348317tls
* https://www.kernel.org/doc/Documentation/filesystems/nfs/nfs-rdma.txt


New Software

For a lot of the software we install on the clusters, we install it as modules. Environment modules are an easy way of installing multiple versions of software and letting users trivially change their environment variables (PATH, LD_LIBRARY_PATH, C_INCLUDE_PATH, etc.) to point to the version they want to use. If I want to use gcc version 5.1.0 instead of the system default 4.4.7, all I have to type is:

    module load gcc/5.1.0  

If gcc 4.4.7 were also a module and I wanted to swap them to make sure I'm using gcc 5.1.0, all I would type is:

    module swap gcc/4.4.7 gcc/5.1.0 

Before installing the software, figure out whether it should be a yum package or a source kit, whether it should be installed into the system space or into modules, and what library dependencies it has. Proceed as appropriate.

Building software packages

Needs to be Updated

Installing a yum package into Modules structure

Enabling built software packages within Modules structure

On the head node of each cluster, modules are installed into /mounts/machine/software, where machine is the name of the actual machine. This directory is visible to all of the nodes of the machine. In that directory there should be subdirectories for the different modules that are available, and within those each version has its own directory (as of July 15 2015 the modules setup and organization is messy on many of the clusters; the notes here are how it should be set up from here on out).

So, for example, if we have gcc versions 5.1.0, 4.7.1, and 4.9.0 installed on fatboy, it would look like this:

   $ ls -1 /mounts/fatboy/software/gcc
      5.1.0 
      4.7.1 
      4.9.0 

If you think your new package is important enough to be loaded by default, then add it to the list in /mounts/al-salam/software/Modules/3.2.7/init/al-salam.{sh,csh}
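For reference, a minimal modulefile sketch, assuming the Tcl-based environment-modules package that the module command comes from; the paths are illustrative and should point at the package's --prefix directory:

    #%Module1.0
    ## hypothetical modulefile for gcc 5.1.0
    set prefix /mounts/fatboy/software/gcc/5.1.0
    prepend-path PATH            $prefix/bin
    prepend-path LD_LIBRARY_PATH $prefix/lib64
    prepend-path C_INCLUDE_PATH  $prefix/include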

DNS/DHCP for a single host

Find an IP that's not in use. The easiest way to do that is to look in the file /var/named/master/cluster.zone. Add the name and IP following the pattern in the file, like below. Be sure to change the serial number at the top of the file to represent the year, month, day, and version.

    dali.cluster.earlham.edu.	IN	A	159.28.234.126 
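The serial follows a year-month-day-version pattern; for example, the first change made on a hypothetical date of 13 October 2017 would use:

    2017101301    ; serial: YYYYMMDD plus a two-digit revision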

Save the file. Every time you add an entry to the zone file, you have to edit the reverse zone file. The reverse zone file is /var/named/master/159.28.234.zone. Add an entry for the host you added in the zone file. Notice the first number there is the last octet of the IP that you gave the host.

    126	IN	PTR	dali.cluster.earlham.edu. 

Next you'll want to stop DNS and then start DNS with the following commands.

   service named stop
   service named start

Now that DNS is updated, we have to update DHCP. The file you want to edit is /etc/dhcp/dhcpd.conf. Towards the bottom of the file you'll add

    host <hostname> { hardware ethernet <MACaddress>; fixed-address <hostname>.cluster.earlham.edu; }
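A filled-in sketch using the dali host from the zone example above (the MAC address is a placeholder):

    host dali { hardware ethernet 00:25:90:aa:bb:cc; fixed-address dali.cluster.earlham.edu; }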

Save the file. Just like we did for the DNS config file, we need to stop and then start DHCP.

   service dhcpd stop 
   service dhcpd start

As a test, reboot the client.

Setting up LDAP

Installing and configuring LDAP can be tedious and frustrating, but no worries! I went through the trouble and took notes as I went, so no one else would have to suffer like I did! These notes are pretty detailed, but I would suggest using one of the other servers with a newer CentOS version (layout, fatboy) as a resource when installing and configuring, especially if you are configuring it for a cluster.

Packages that need to be installed (both head and compute nodes):

We use NSS and NSLCD in conjunction with PAM for ldap authentication. It may be older than SSSD, but we already know how to do it. So, we want to turn off SSSD. If sssd is not running, then great, that'll make your life a lot easier!

   service sssd stop
   chkconfig sssd off #so it doesn't restart if the machines reboots
   chkconfig --del sssd #delete it as a service because we don't want it

There are a lot of files that need to be modified in order for ldap to work correctly.

   URI ldap://cluster.earlham.edu/
   BASE dc=cluster, dc=loc
   TLS_CACERTDIR /etc/openldap/cacerts
    passwd:  ldap files
    group:  ldap files
    shadow: ldap files

    ethers:     files
    netmasks:   files
    networks:   files
    protocols:  files
    rpc:        files
    services:   files ldap

    netgroup:   ldap files

    publickey:  nisplus

    automount:  files ldap
    aliases:    files

    sudoers:    ldap files
   rootbinddn cn=Manager,dc=cluster,dc=loc
   nss_base_passwd ou=people,dc=cluster,dc=loc?one
   nss_base_shadow ou=people,dc=cluster,dc=loc?one
   nss_base_group ou=group,dc=cluster,dc=loc?one
   nss_map_objectclass posixAccount User
   nss_map_objectclass shadowAccount User
   nss_map_objectclass posixGroup Group
   nss_map_attribute uid userName
   nss_map_attribute gidNumber gid
   nss_map_attribute uidNumber uid
   nss_map_attribute cn groupName
   base dc=cluster,dc=loc
   pam_password crypt
   uri ldap://cluster.earlham.edu/
   ssl no
   tls_cacertdir /etc/openldap/cacert
   uri ldap://cluster.earlham.edu/
   instead of pam_sss.so, it should be pam_ldap.so
   base   group  ou=group,dc=cluster,dc=loc
   base   passwd ou=people,dc=cluster,dc=loc
   base   shadow ou=people,dc=cluster,dc=loc
   uid nslcd
   gid ldap
   uri ldap://cluster.earlham.edu/
   base dc=cluster, dc=loc
   ssl no
   tls_cacertdir /etc/openldap/cacerts
   UsePAM yes
   USESSSDAUTH=no 
   USESHADOW=yes 
   USESSSD=no 
   USELDAPAUTH=yes 
   USELDAP=yes
   USECRACKLIB=yes 
   PASSWDALGORITHM=descrypt

Since we deleted sssd, we need to start the alternative, and make sure it starts on boot up.

   service nslcd start
   chkconfig nslcd on

Check to make sure nscd is turned off. That is a caching service for ldap. Since we're so small here, we don't really need that.

   service nscd stop
   chkconfig nscd off

Users and Groups

Users are authenticated using an LDAP (Lightweight Directory Access Protocol) server running on Hopper. This is how users are authenticated throughout the entire cluster realm. We use LDAP for all users and groups except for the ccg-admin user, the root user, and the wheel group. Those users and that group are local to each cluster. Every user is a part of the users LDAP group, which is group number 115, and all clusters should look at ldap first and then files. This is specified in the /etc/nsswitch.conf file.

A user can change their password with the passwd command while on Hopper. It will prompt them for their current password and then their desired new password. If it is successful it will say so at the end, with something like 'All LDAP tokens successfully changed.'

Creating New Users

Because creating users in LDAP is somewhat confusing, sample files and a python script were written to help. The script is addusers.py and lives in ~root/ldap-files/ on hopper. I'll explain things here, but there's a README file in that directory that explains everything as well.

To create a user in LDAP, you must create an .ldif file for that user. This is what addusers.py does for you. addusers.py takes a file of new users as a command line argument. The file must specify First Name:username:email for each user, and each user should be on a separate line. The file add-user.ldif is an example of what the file should look like.
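For example, a hypothetical input file with two new users (the names and addresses are made up) would contain:

    Ada:alovelace:alovelace@earlham.edu
    Grace:ghopper:ghopper@earlham.edu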

sudo python addusers.py add-user.ldif will create an .ldif file for each user, and use ldapadd to add them to the LDAP database. The contents of the .ldif file for each user added will be printed to the screen, and each user will be sent an email with their username and password. addusers.py is set to clean up after itself, so you don't need to worry about that. There's one more thing that has to be done after this step. We need to setup the ssh keys for each user. For each user created:

Become that user:

    su - user 

SSH to as0:

    ssh as0  

It will ask you for the password of the user. Then it will prompt you for information about where to save the public key file and for a passphrase. For all of these, just hit enter. That will set it to the default. Go back to hopper and do the same thing for all the new users.

It is VERY important that you use UID and GID numbers that have not already been taken. If new users and groups have been added correctly, then there shouldn't be a problem with that. maxuid is a file that specifies the next UID to use when creating a new user. addusers.py reads from that file when creating the .ldif files for each user, and at the end overwrites that file with the new maxuid. If you're nervous about the UID numbers, it is OK to double check. Doing ldapsearch -x should output everything in the database with the latest entry at the bottom. Look for the UID in that entry and compare it with maxuid. If maxuid is one above that number then all is golden. It's also safe to look in /etc/passwd to make sure no one is using that number either.

Other modifications to the DB

Other modifications to the database, like adding a new group, adding users to that group, or deleting users, all use .ldif files similar to adding users. In the same directory as the above files, there are sample .ldif files that do these operations. Each file's name says what it does: add-group.ldif will add a group, using the ldapadd command; add-user-to-group.ldif can be used to add a user to a group; del-from-grp.ldif can delete users from groups; and chg-pw shows an example of changing the password of a user. The last three use the ldapmodify command. Make sure you modify the files to what you need; especially, don't forget to change the gid if you add a group, and make sure it's a GID that's not already in use.

The command for modifying the database, if I were adding a user to a group, is below. You'll need to specify the Manager password at the end of this command. It has been redacted here, but it can be found in the README file, only with root privileges.

    ldapmodify -f add-user-to-group.ldif -D "cn=Manager,dc=cluster,dc=loc" 

If you're adding a group, change ldapmodify to ldapadd. The cn=Manager stuff is just specifying the manager of the database, which will allow you to change it.

To delete a user from the ldap database, it's a little simpler. Use the command below. Again, the Manager password must be specified and has been redacted here, but it will be the same as the password used in the other commands, and can be found in the README file. You will need to change the uid to be equal to the uid of the person you want to delete. The uid here is just the person's username.

    ldapdelete "uid=sbsp,ou=people,dc=cluster,dc=loc" -D "cn=Manager,dc=cluster,dc=loc" 

Monitoring

Disable graphical booting screen in CentOS

To enable verbose booting and remove the loading bar graphic, simply remove rhgb quiet from the file /boot/grub/grub.conf.

rhgb stands for Red Hat graphical boot; the quiet option tells CentOS to suppress even more boot information.

Grub2 works a little differently. There is no grub.conf file. Instead, the configuration is generated from a set of files living inside /etc/grub.d/.

To disable graphical boot on grub2:

1. Back up the default grub file
cp /etc/default/grub /etc/default/grub.bak
2. Remove rhgb quiet from the line GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet"
vim /etc/default/grub
3. Generate a new grub config
grub2-mkconfig --output=/boot/grub2/grub.cfg

Reference: https://www.redhat.com/archives/rhl-list/2004-May/msg07775.html
Grub2 reference: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/system_administrators_guide/ch-working_with_the_grub_2_boot_loader

Rebuilding and Creating RAID arrays with mdadm

mdadm RAID Documentation

Torque PBS

Torque PBS Documentation

Infiniband

Infiniband Documentation

Installing CHARMM

Load latest openMPI

module load modules
module load gcc/4.9.0
module load openmpi

Install libquadmath.so.0

./install.com gnu M

Clean (if needed)

  ./install.com gnu M distclean

Bacula Backup Management

2016-06-21

Fluorite (Machine)

The jail quartz is the CS bacula director which lives on the machine fluorite, fluorite.earlham.edu.

The configuration files on BSD (i.e. quartz) can be found at /usr/local/etc/bacula/bacula-dir.conf and /usr/local/etc/bacula/bacula-fd.conf. Each bacula client has its own bacula-fd.conf configuration file that points back to quartz.

Helpful Commands for working with jails

    • jls: lists jails
    • jexec <JID> <some command>: execute a command through a jail
    • jexec <JID> bash: "connect" to a jail

Bacula Commands (on Quartz)

    • jexec 1 bconsole
    • jexec 1 /usr/local/etc/rc.d/bacula-dir restart
    • jexec 1 /usr/local/etc/rc.d/bacula-fd restart


Using Infiniband for Layout's NFS

Internal NFS mounts on Layout are now done over Infiniband

  1. add to /etc/hosts of all layout nodes
    • 192.168.50.100 lo0.layout.ib
  2. update /etc/exports on lo0 with infiniband IP address
    • then run exportfs -a to update nfs
    • use showmount to check mounted machines
  3. change nfs mounts to ib fabric on layout nodes
  4. un-mount then mount under lo0.layout.ib

    # umount /scratch
    # mount lo0.layout.ib:/scratch /scratch
    # umount /mounts
    # mount lo0.layout.ib:/mounts /mounts
    # umount /var/www/
    # mount lo0.layout.ib:/var/www/ /var/www/
    
    • if it says "device is busy", then you can try umount -l /some/mount
  5. update /etc/fstab to use lo0.layout.ib hostname
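A sketch of the corresponding entries; the export and mount options below are placeholders, so match whatever the existing ethernet-based entries already use:

    # /etc/exports on lo0: export to the IB subnet
    /scratch    192.168.50.0/24(rw,no_root_squash,sync)

    # /etc/fstab on a compute node: mount via the IB hostname
    lo0.layout.ib:/scratch    /scratch    nfs    defaults    0 0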

NFS Reference: https://unix.stackexchange.com/questions/106122/mount-nfs-access-denied-by-server-while-mounting-on-ubuntu-machines

Routing between 10Gb and IB in Layout

In order to route traffic over the 10Gb network to layout's compute nodes, we need lo0 to mediate the exchange.

To temporarily establish routing between servers we can use the ip route add command:

    ip route add 10.10.10.0/24 via 192.168.50.100 dev ib0

In order to make this persistent, we must create the file /etc/sysconfig/network-scripts/route-ib0 with the static routing rule inside it.

Static routing with CentOS: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/5/html/Deployment_Guide/s1-networkscripts-static-routes.html
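A sketch of what route-ib0 could contain, using the ip-command format with the same network and gateway as the example above:

    # /etc/sysconfig/network-scripts/route-ib0
    10.10.10.0/24 via 192.168.50.100 dev ib0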
