Lab - Build a Cluster: Compute Node
Objective: Set up SSH host-based authentication and Ansible cluster management. Learn how to use Ansible to set up AutoFS and deploy software packages on the nodes.
Steps:
Set up SSH host-based authentication.
Install Ansible and configure the Ansible environment for the cluster.
Use Ansible playbooks to:
Install packages.
Set time synchronization.
Configure NFS exports and AutoFS maps.
SSH host-based authentication setup
Host-based authentication lets users ssh to the compute nodes without a password. An alternative would be private/public SSH keys, but those would have to be set up for every user on the cluster.
To enable SSH host-based authentication on a client, one needs to modify ssh_config.
On the server side, modify sshd_config.
Collect the public SSH host keys of the nodes in the file /etc/ssh/ssh_known_hosts.
Restart the sshd daemon.
A sketch of the client- and server-side settings follows this list.
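The following is a minimal illustration of the settings involved, not the exact lab files; the actual drop-ins (ssh_lci.conf, sshd_lci.conf) and hosts.equiv ship in the tarball downloaded below:
# Client side, e.g. a drop-in like /etc/ssh/ssh_config.d/ssh_lci.conf:
Host head compute*
    HostbasedAuthentication yes
    EnableSSHKeysign yes

# Server side, e.g. a drop-in like /etc/ssh/sshd_config.d/sshd_lci.conf:
HostbasedAuthentication yes

# The server must also trust the client hosts (e.g. via /etc/hosts.equiv)
# and have their public host keys in /etc/ssh/ssh_known_hosts.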
To accomplish the tasks above on all the nodes, we’ll download a tarball with scripts that will loop over the nodes and run the tasks:
sudo dnf install wget
wget https://linuxcourse.rutgers.edu/LCI_2023/SSH_hostbased.tgz
tar -zxvf SSH_hostbased.tgz
cd SSH_hostbased
Create a public/private SSH key pair and copy the public key to authorized_keys on all the compute nodes by running the script below. Provide the ssh password when prompted:
./ssh_root_key.sh
Try ssh to any of the nodes to verify if the private/public key authentication works:
ssh compute1
Set up host-based SSH authentication by running the script below:
./node_loop.sh
The content of the ssh_key_scan.sh script:
#!/bin/bash
# Collect the RSA host keys of all the nodes into ssh_known_hosts
hosts='head compute1 compute2 compute3 compute4
192.168.0.4 192.168.0.5 192.168.0.6 192.168.0.7 192.168.0.8'
for i in $hosts
do
    ssh-keyscan -t rsa $i >> ssh_known_hosts
done
The content of the ssh_root_key.sh script for private/public key authentication:
#!/bin/bash
# Generate an SSH key pair and distribute authorized_keys to the compute nodes
echo 'Input password for ssh:'
read password
hosts='compute1 compute2 compute3 compute4'
# Create the key pair non-interactively
ssh-keygen -q -t rsa -N '' -f ~/.ssh/id_rsa <<<y >/dev/null 2>&1
cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
# Collect the node host keys and trust them locally
./ssh_key_scan.sh
cp ssh_known_hosts ~/.ssh/known_hosts
if [ -f /etc/lsb-release ]
then
    # Ubuntu
    cmd='sudo apt install sshpass --yes'
else
    # Rocky
    cmd='sudo dnf install sshpass -y'
fi
eval ${cmd}
for i in $hosts
do
    echo $i
    sshpass -p $password scp -r ~/.ssh/authorized_keys $i:.ssh
done
The content of node_loop.sh for host-based authentication:
#!/bin/bash
# Host-based SSH authentication setup on compute nodes
# Define the compute node list:
hosts='compute1 compute2 compute3 compute4'
# Create directory /etc/ssh/sshd_config.d if it doesn't exist:
cmd1='[ -d /etc/ssh/sshd_config.d ] || mkdir /etc/ssh/sshd_config.d'
# Reference the directory in /etc/ssh/sshd_config:
str='Include /etc/ssh/sshd_config.d/*.conf'
cmd2="grep -iF \"${str}\" /etc/ssh/sshd_config > /dev/null || echo \"${str}\" >> /etc/ssh/sshd_config"
# Loop over the compute nodes
for i in ${hosts}
do
    echo $i
    scp {hosts.equiv,ssh_lci.conf,ssh_known_hosts,sshd_lci.conf} $i:/tmp
    ssh $i "sudo cp /tmp/hosts.equiv /etc"
    ssh $i "sudo cp /tmp/ssh_lci.conf /etc/ssh/ssh_config.d"
    ssh $i "sudo cp /tmp/ssh_known_hosts /etc/ssh"
    echo ${cmd1} | ssh $i sudo /bin/bash -s
    ssh $i "sudo cp /tmp/sshd_lci.conf /etc/ssh/sshd_config.d"
    echo ${cmd2} | ssh $i sudo /bin/bash -s
    ssh $i "sudo systemctl restart sshd"
done
# Configure the ssh client on the head node:
sudo cp ssh_lci.conf /etc/ssh/ssh_config.d
sudo cp ssh_known_hosts /etc/ssh
Check that you can ssh as user instructor with host-based authentication:
ssh -v compute1
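In the verbose output, look for a line reporting that host-based authentication was used, roughly like the following (exact wording varies between OpenSSH versions):
debug1: Authentication succeeded (hostbased).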
Install Ansible
Install Ansible on the head node:
sudo dnf install -y ansible-core
Install Ansible on all the compute nodes:
cd SSH_hostbased
./ansible_install.sh
The content of ansible_install.sh:
#!/bin/bash
# Pick the install command based on the distribution
if [ -f /etc/lsb-release ]
then
    # Ubuntu
    cmd='sudo apt-add-repository ppa:ansible/ansible --yes; sudo apt update; sudo apt install ansible --yes'
else
    # Rocky
    cmd='sudo dnf install -y ansible-core'
fi
eval ${cmd}
hosts='compute1 compute2 compute3 compute4'
for i in $hosts
do
    ssh $i ${cmd}
done
Verify the installation:
ansible --version
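The first line of the output reports the installed core version, for example (your version may differ):
ansible [core 2.14.2]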
Set up the Ansible environment
If you are using Rocky Linux:
Download lci-labs-rocky.tgz into the home directory of user instructor:
cd
wget https://linuxcourse.rutgers.edu/LCI_2023/lci-labs-rocky.tgz
tar -zxvf lci-labs-rocky.tgz
cp -a lci-labs-rocky lci-labs
If you are using Ubuntu Linux:
Download lci-labs-ubuntu.tgz into the home directory of user instructor:
cd
wget https://linuxcourse.rutgers.edu/LCI_2023/lci-labs-ubuntu.tgz
tar -zxvf lci-labs-ubuntu.tgz
cp -a lci-labs-ubuntu lci-labs
The directory lci-labs contains the configuration files, packages, and playbooks you need to build a cluster.
The configuration file, ansible.cfg:
[defaults]
inventory = hosts.ini
remote_user = instructor
host_key_checking = false
remote_tmp = /tmp/.ansible/tmp
interpreter_python = /bin/python3
forks = 4
[privilege_escalation]
become = true
become_method = sudo
become_user = root
become_ask_pass = false
The inventory file, hosts.ini:
[all_nodes]
compute1
compute2
compute3
compute4
[head]
localhost ansible_connection=local ansible_python_interpreter=/usr/bin/python3.9
Now we can test Ansible ad-hoc commands by running a basic command on the hosts. Let's run a command on the compute nodes and see what that looks like:
cd lci-labs
ansible all_nodes -m ansible.builtin.shell -a "uname -n"
Did the command execute on all of the compute nodes?
Above we ran the ansible command on the host group all_nodes, with:
-m for the module name, ansible.builtin.shell
-a for the module arguments
uname -n as the command we want to execute on the hosts defined as all_nodes
We will use the ansible.builtin.shell module, plus many other modules, in the coming labs. We also have tasks and plays directories provisioned for you, which we use to execute individual plays. In addition, we use a playbook to run all of the tasks at once.
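As another quick connectivity test, the ansible.builtin.ping module (included with ansible-core) verifies both SSH access and a working Python interpreter on each node:
ansible all_nodes -m ansible.builtin.ping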
The top Ansible playbook for the cluster
We’ll be using Ansible for all cluster installations.
In the directory lci-labs, create a new file build.yml with the following content:
---
############## Head node play ##################################
- name: Head node configuration
  become: yes
  hosts: head
  connection: local
  roles:

############## Compute node play ###############################
- name: Compute node configuration
  become: yes
  hosts: all_nodes
  roles:
This is your first Ansible playbook. First, check it for syntactic errors:
ansible-playbook --syntax-check build.yml
Do a dry run:
ansible-playbook --check build.yml
Now run it:
ansible-playbook build.yml
You should see the connection (Gathering Facts) response from the head and compute nodes; there are no roles or tasks to run yet.
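The output should look roughly as follows (a sketch; exact formatting depends on the Ansible version):
PLAY [Head node configuration] ************************************

TASK [Gathering Facts] ********************************************
ok: [localhost]

PLAY [Compute node configuration] *********************************

TASK [Gathering Facts] ********************************************
ok: [compute1]
ok: [compute2]
ok: [compute3]
ok: [compute4]

PLAY RECAP ********************************************************
compute1 : ok=1 changed=0 unreachable=0 failed=0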
Disregard the files playbook.yml and destroy.yml for now.
Installation roles and tasks on the Rocky head node
By performing the roles below on the head node, we:
Enable the EPEL repo and powertools in dnf
Ensure the timezone is in sync with America/New_York
Install head node packages, including the compiler, development tools, rpmbuild tools, and mariadb (the MySQL server)
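For orientation, a package-installation task inside such a role might look like the following. This is a hypothetical sketch with illustrative package names; the actual tasks are provisioned in the lci-labs roles directory:
# roles/head-node_pkg_inst/tasks/main.yml (hypothetical sketch)
- name: Install head node packages
  ansible.builtin.dnf:
    name:
      - gcc
      - make
      - rpm-build
      - mariadb-server
    state: present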
Include the following roles in the head play in your build.yml under roles:
- powertools
- timesync
- head-node_pkg_inst
Check for syntactic errors:
ansible-playbook --syntax-check build.yml
Run the playbook:
ansible-playbook build.yml
Installation roles and tasks on the Rocky compute nodes
By performing the tasks below on the compute nodes, we:
Enable the EPEL repo and powertools in dnf
Ensure the timezone is in sync with America/New_York
Install perl and perl-DBI
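The timesync role is likewise provisioned for you. A minimal sketch of such a task, assuming it simply calls timedatectl (the real role may use a different approach):
# roles/timesync/tasks/main.yml (hypothetical sketch)
- name: Set the timezone to America/New_York
  ansible.builtin.command: timedatectl set-timezone America/New_York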
Include the following roles in the compute node play in build.yml:
- powertools
- timesync
- compute-node_pkg_inst
Check for syntactic errors:
ansible-playbook --syntax-check build.yml
Run the playbook:
ansible-playbook build.yml
Installation roles and tasks on the Ubuntu head node
By performing the roles below on the head node, we:
Ensure the timezone is in sync with America/New_York
Install head node packages, including the compiler, development tools, mariadb (the MySQL server), etc.
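On Ubuntu, the package role would use apt rather than dnf. A hypothetical sketch with illustrative package names:
# roles/head-node_pkg_inst/tasks/main.yml (hypothetical sketch, Ubuntu)
- name: Install head node packages
  ansible.builtin.apt:
    name:
      - build-essential
      - mariadb-server
    state: present
    update_cache: yes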
Include the following roles in the head play in your build.yml under roles:
- timesync
- head-node_pkg_inst
Check for syntactic errors:
ansible-playbook --syntax-check build.yml
Run the playbook:
ansible-playbook build.yml
Installation roles and tasks on the Ubuntu compute nodes
By performing the tasks below on the compute nodes, we:
Ensure timezone is in sync with America/New_York
Install compilers
Include the following roles in the compute node play in build.yml:
- timesync
- compute-node_pkg_inst
Check for syntactic errors:
ansible-playbook --syntax-check build.yml
Run the playbook:
ansible-playbook build.yml
NFS setup roles and tasks on the cluster
We need to create the directory /head/NFS on the head node, share it with the compute nodes, and then automount it on the compute nodes. Below, we add Ansible NFS roles to accomplish these tasks.
Include the NFS server role in the head node play in build.yml:
- head-node_nfs_server
Include the autofs role in the compute node play in build.yml:
- compute-node_autofs
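For orientation, the pieces these roles put in place look roughly like the following. This is a hypothetical sketch; the actual export options and map names come from the lci-labs roles:
# Head node: an export for the shared directory, e.g. in /etc/exports:
/head/NFS 192.168.0.0/24(rw,sync,no_root_squash)

# Compute nodes: an autofs master entry, e.g. /etc/auto.master.d/head.autofs:
/head /etc/auto.head

# ...and the corresponding map /etc/auto.head:
NFS -rw head:/head/NFS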
At the end of the lab, build.yml on Rocky should look as follows:
---
####### head node play ############
- name: Head node configuration
  become: yes
  hosts: head
  connection: local
  tags: head_node_play
  roles:
    - powertools
    - timesync
    - head-node_pkg_inst
    - head-node_nfs_server

####### compute node play ########
- name: Compute node configuration
  become: yes
  hosts: all_nodes
  tags: compute_node_play
  roles:
    - powertools
    - timesync
    - compute-node_pkg_inst
    - compute-node_autofs
On Ubuntu, build.yml should look as follows:
---
####### head node play ############
- name: Head node configuration
  become: yes
  hosts: head
  connection: local
  tags: head_node_play
  roles:
    - timesync
    - head-node_pkg_inst
    - head-node_nfs_server

####### compute node play ########
- name: Compute node configuration
  become: yes
  hosts: all_nodes
  tags: compute_node_play
  roles:
    - timesync
    - compute-node_pkg_inst
    - compute-node_autofs
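Since each play now carries a tag, you can rerun a single play when needed, for example:
ansible-playbook build.yml --tags compute_node_play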
Check for syntactic errors:
ansible-playbook --syntax-check build.yml
Run the playbook:
ansible-playbook build.yml
At this point, the head and compute node installation is essentially complete, except for the scheduler part.