Lab - Build a Cluster: Compute Node

Click on the link for the updated version of the lab.

Objective: Set up SSH host-based authentication and Ansible cluster management. Learn how to use Ansible to set up AutoFS and deploy software packages on the nodes.

Steps:

  • Set up SSH host-based authentication.

  • Install Ansible and configure Ansible environment for the cluster.

  • Use Ansible playbooks to:

    • Install packages.

    • Set time synchronization.

    • Configure NFS exports and AutoFS maps.


SSH host-based authentication setup

Host-based authentication lets users ssh to the compute nodes without a password. An alternative would be private/public SSH keys, but those would have to be set up for every user on the cluster.

  • To enable SSH host-based authentication on a client, modify ssh_config.

  • On the server side, modify sshd_config.

  • Collect the public SSH host keys of the nodes in the file /etc/ssh/ssh_known_hosts.

  • Restart the sshd daemon.

  • To accomplish the tasks above on all the nodes, we’ll download a tarball with scripts that will loop over the nodes and run the tasks:

sudo dnf install wget
wget https://linuxcourse.rutgers.edu/LCI_2023/SSH_hostbased.tgz
tar -zxvf SSH_hostbased.tgz
cd SSH_hostbased
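
The scripts in the tarball are not reproduced in this lab, so the following is only a sketch of the kind of settings that host-based authentication usually involves, not the contents of SSH_hostbased. It writes example fragments into a scratch directory instead of touching /etc/ssh. The host names head and compute2 and the use of /etc/ssh/shosts.equiv are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: example fragments go to a scratch directory, not to /etc/ssh.
scratch=$(mktemp -d)

# Client side, for /etc/ssh/ssh_config: offer host-based authentication.
# EnableSSHKeysign belongs in the global (non host-specific) part of the file.
cat > "$scratch/ssh_config.fragment" <<'EOF'
EnableSSHKeysign yes
HostbasedAuthentication yes
EOF

# Server side, for /etc/ssh/sshd_config: accept host-based authentication.
cat > "$scratch/sshd_config.fragment" <<'EOF'
HostbasedAuthentication yes
EOF

# Server side, for /etc/ssh/shosts.equiv (assumption): hosts that are trusted.
# compute1 appears in this lab; head and compute2 are hypothetical host names.
printf '%s\n' head compute1 compute2 > "$scratch/shosts.equiv"

# Host keys would be collected with something like:
#   ssh-keyscan compute1 compute2 >> /etc/ssh/ssh_known_hosts
# followed by a restart of sshd (on Rocky: systemctl restart sshd).

for f in "$scratch"/*; do
    echo "== $f"
    cat "$f"
done
```

Comparing these fragments with what node_loop.sh actually changes on your nodes is a useful exercise.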

Create a public/private SSH key pair and copy id_rsa.pub to authorized_keys on all the compute nodes by running the script below. Provide the ssh password when prompted:

./ssh_root_key.sh

Try to ssh to any of the nodes to verify that the private/public key authentication works:

ssh compute1

Set up host-based SSH authentication by running the script below:

./node_loop.sh

Check that you can ssh as user instructor with host-based authentication:

ssh -v compute1
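
The verbose output is long, so it can help to filter it for the authentication method that succeeded. The helper below is a minimal sketch: used_hostbased is a hypothetical name, and the two patterns are assumptions about how different OpenSSH releases word the success message, so adjust them if your output differs.

```shell
#!/bin/sh
# Reads "ssh -v" output on stdin and succeeds only if it reports that
# host-based authentication was the method that succeeded.
# Both patterns are assumptions about the wording used by OpenSSH releases.
used_hostbased() {
    grep -Eq 'Authenticated to .* using "hostbased"|Authentication succeeded \(hostbased\)'
}

# Usage on the head node, with compute1 as in this lab:
#   ssh -v compute1 true 2>&1 | used_hostbased && echo "host-based authentication in use"
```

Another way to rule out a silent fallback to keys or passwords is to allow only one method: ssh -o PreferredAuthentications=hostbased -o BatchMode=yes compute1 true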

Install Ansible

Install Ansible on the head node:

sudo dnf install -y ansible-core

Install Ansible on all the compute nodes:

cd SSH_hostbased
./ansible_install.sh

Run the command below to confirm that Ansible is installed:

ansible --version

Set up the Ansible environment.

If you are using Rocky Linux, download lci-labs-rocky.tgz into the home directory of user instructor:

cd
wget https://linuxcourse.rutgers.edu/LCI_2023/lci-labs-rocky.tgz
tar -zxvf lci-labs-rocky.tgz
cp -a lci-labs-rocky lci-labs

If you are using Ubuntu Linux, download lci-labs-ubuntu.tgz into the home directory of user instructor:

cd
wget https://linuxcourse.rutgers.edu/LCI_2023/lci-labs-ubuntu.tgz
tar -zxvf lci-labs-ubuntu.tgz
cp -a lci-labs-ubuntu lci-labs

The directory lci-labs contains the configuration files, packages, and playbooks you need to build a cluster.

Now we can test Ansible by running a basic shell command on the hosts. Let's run a command on the compute nodes and see how that looks:

cd lci-labs
ansible all_nodes -m ansible.builtin.shell -a "uname -n"

Did the command execute on all of the compute nodes?

Above, we run the ansible command against the host group all_nodes:

-m specifies the module name, here ansible.builtin.shell

-a specifies the module arguments

uname -n is the command we want to execute on the hosts defined as all_nodes

We will use the ansible.builtin.shell module and many other modules in the coming labs. A task directory and a plays directory are also provisioned for you; we use them to execute individual plays. In addition, we use a playbook to run all of the tasks at once.
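
To see how the ad hoc command above maps onto a play, the sketch below writes an equivalent one-task playbook into a scratch directory. The file name uname.yml and the task names are hypothetical, and the sketch assumes that the Ansible configuration in lci-labs defines the all_nodes group, as the ad hoc command suggests.

```shell
#!/bin/sh
# Write a playbook that does the same as:
#   ansible all_nodes -m ansible.builtin.shell -a "uname -n"
workdir=$(mktemp -d)

cat > "$workdir/uname.yml" <<'EOF'
---
- name: Print the node name on every host in all_nodes
  hosts: all_nodes
  gather_facts: false
  tasks:
    - name: Run uname -n
      ansible.builtin.shell: uname -n
      register: result
      changed_when: false   # a read-only command never changes the host

    - name: Show the output
      ansible.builtin.debug:
        var: result.stdout
EOF

echo "From the lci-labs directory, run: ansible-playbook $workdir/uname.yml"
```

The -m value becomes the module key of the task, the -a value becomes its argument, and the host group moves to the hosts line of the play.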


The top Ansible playbook for the cluster

We’ll be using Ansible for all cluster installations.

In the directory lci-labs, create a new file build.yml with the following content:

---
############## Head node play ##################################
- name: Head node configuration
  become: yes
  hosts: head
  connection: local
  roles:
  
############## Compute node play ###############################
- name: Compute node configuration
  become: yes
  hosts: all_nodes
  roles:

This is your first Ansible playbook. First, check it for syntax errors:

ansible-playbook --syntax-check build.yml

Do a dry run:

ansible-playbook --check build.yml

Now run it:

ansible-playbook build.yml

You should see the connection (Gathering Facts) response from the head and compute nodes. There are no roles or tasks to run yet.

Disregard the files playbook.yml and destroy.yml for now.
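
The sequence of syntax check, dry run, and real run comes back after every change to build.yml in this lab. A small wrapper can chain the three steps so that a later step runs only if the earlier one succeeded. This is a sketch: run_stages and the PLAYBOOK_CMD variable are hypothetical names, introduced so that the command can be swapped out for testing.

```shell
#!/bin/sh
# Run the three stages in order; stop at the first one that fails.
# PLAYBOOK_CMD (hypothetical) defaults to the real ansible-playbook command.
run_stages() {
    playbook=$1
    runner=${PLAYBOOK_CMD:-ansible-playbook}
    "$runner" --syntax-check "$playbook" &&
    "$runner" --check "$playbook" &&
    "$runner" "$playbook"
}

# Usage from the lci-labs directory:
#   run_stages build.yml
```

Keep in mind that a dry run with --check can report differences from a real run, because some modules do not support check mode.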


Installation roles and tasks on the Rocky head node

By applying the roles below on the head node, we

  • Enable the EPEL repo and powertools in dnf

  • Ensure the time zone is set to America/New_York

  • Install head node packages, including compiler, development tools, rpmbuild tools, mariadb (MySQL server)

Include the following roles in the head play in your build.yml under roles:

    - powertools
    - timesync
    - head-node_pkg_inst

Check for syntactic errors:

ansible-playbook --syntax-check build.yml

Run the playbook:

ansible-playbook build.yml

Installation roles and tasks on the Rocky compute nodes

By performing the tasks below on the compute nodes, we

  • Enable the EPEL repo and powertools in dnf

  • Ensure the time zone is set to America/New_York

  • Install perl, perl-DBI

Include the following roles in the compute node play in build.yml under roles:

    - powertools
    - timesync
    - compute-node_pkg_inst

Check for syntactic errors:

ansible-playbook --syntax-check build.yml

Run the playbook:

ansible-playbook build.yml

Installation roles and tasks on the Ubuntu head node

By applying the roles below on the head node, we

  • Ensure the time zone is set to America/New_York

  • Install head node packages, including compiler, development tools, mariadb (MySQL server), etc.

Include the following roles in the head play in your build.yml under roles:

    - timesync
    - head-node_pkg_inst

Check for syntactic errors:

ansible-playbook --syntax-check build.yml

Run the playbook:

ansible-playbook build.yml

Installation roles and tasks on the Ubuntu compute nodes

By performing the tasks below on the compute nodes, we

  • Ensure the time zone is set to America/New_York

  • Install compilers

Include the following roles in the compute node play in build.yml under roles:

    - timesync
    - compute-node_pkg_inst

Check for syntactic errors:

ansible-playbook --syntax-check build.yml

Run the playbook:

ansible-playbook build.yml
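
After the playbook has run, an ad hoc command is a quick way to confirm that the time zone setting reached every compute node. The helper below is a sketch: check_timezone and the ANSIBLE_CMD variable are hypothetical names, and it assumes timedatectl is available on the nodes.

```shell
#!/bin/sh
# Ask every host in all_nodes for its configured time zone.
# ANSIBLE_CMD (hypothetical) defaults to the real ansible command.
check_timezone() {
    runner=${ANSIBLE_CMD:-ansible}
    "$runner" all_nodes -m ansible.builtin.shell -a "timedatectl | grep 'Time zone'"
}

# Usage from the lci-labs directory:
#   check_timezone
```

Each node should report America/New_York; on the head node, running timedatectl directly gives the same information.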

NFS setup roles and tasks on the cluster

We need to create the directory /head/NFS on the head node, export it to the compute nodes, and then automount it on the compute nodes.

Below, we add Ansible NFS roles to accomplish these tasks.

Include the NFS server role in the head node play in build.yml:

    - head-node_nfs_server

Include the autofs role in the compute node play in build.yml:

    - compute-node_autofs
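
The contents of the two roles are not shown in this lab. As a rough picture of what an NFS export plus an AutoFS map involves, the sketch below writes example files into a scratch directory. Everything except the path /head/NFS is an assumption: the server host name head, the compute* host pattern, the mount options, and the map file name /etc/auto.head.

```shell
#!/bin/sh
# Sketch only: example files go to a scratch directory, not to /etc.
scratch=$(mktemp -d)

# Head node, /etc/exports: export /head/NFS to the compute nodes.
cat > "$scratch/exports" <<'EOF'
/head/NFS compute*(rw,sync,no_subtree_check)
EOF

# Compute nodes, /etc/auto.master: manage /head with an indirect map.
cat > "$scratch/auto.master" <<'EOF'
/head /etc/auto.head
EOF

# Compute nodes, /etc/auto.head: mount head:/head/NFS on demand at /head/NFS.
cat > "$scratch/auto.head" <<'EOF'
NFS -fstype=nfs,rw head:/head/NFS
EOF

# Checks one might run once the real roles have been applied:
#   showmount -e head     (on a compute node: list the exports of the server)
#   ls /head/NFS          (on a compute node: triggers the automount)

for f in exports auto.master auto.head; do
    echo "== $f"
    cat "$scratch/$f"
done
```

With an indirect map like this, the directory is mounted on first access rather than at boot, which is why listing it is a sufficient test.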

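At the end of the lab, build.yml on Rocky should look as follows:

---
############## Head node play ##################################
- name: Head node configuration
  become: yes
  hosts: head
  connection: local
  roles:
    - powertools
    - timesync
    - head-node_pkg_inst
    - head-node_nfs_server

############## Compute node play ###############################
- name: Compute node configuration
  become: yes
  hosts: all_nodes
  roles:
    - powertools
    - timesync
    - compute-node_pkg_inst
    - compute-node_autofs

On Ubuntu, build.yml should look as follows:

---
############## Head node play ##################################
- name: Head node configuration
  become: yes
  hosts: head
  connection: local
  roles:
    - timesync
    - head-node_pkg_inst
    - head-node_nfs_server

############## Compute node play ###############################
- name: Compute node configuration
  become: yes
  hosts: all_nodes
  roles:
    - timesync
    - compute-node_pkg_inst
    - compute-node_autofs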

Check for syntactic errors:

ansible-playbook --syntax-check build.yml

Run the playbook:

ansible-playbook build.yml

At this point, the head and compute node installation is essentially complete, except for the scheduler part.