14. Ansible Configuration Management

14.1. Outline of the topic:


  • Major server installation types and configuration management challenges

  • Detail what configuration management is and why it is useful

  • Review the current landscape of available tools

  • Ansible basics and examples:

    • configuration files

    • playbooks

    • commands for remote tasks

14.2. Major automated installation types and management challenges

Traditional diskful (“stateful”) compute nodes

  • The operating system and applications reside on the local system drive.

  • They are preserved at system reboot.

  • Configuration management challenge:

    • keeping the OS and apps identical across the cluster nodes.

  • Configuration tools: Ansible, Puppet, Chef, Salt, CFEngine.

14.3. Defining Configuration Management


At its broadest level, configuration management is the management and maintenance of operating system and service configuration via code instead of manual intervention.

More formally:

  • Declaring the system state in a repeatable and auditable fashion, then using tools to impose that state and prevent systems from deviating from it.

14.4. State


All systems have a ‘state’ consisting of:

  • Files on Disk

  • Running services

State can be supplied by:

  • Installation / provisioning systems

  • Golden Images

  • Manual steps including direct configuration changes and setup scripts

14.5. Modern Configuration Management Features


  • Idempotency

    • Declaration and management of files and services to reach a ‘desired state’

  • Revision Control

    • Systems are managed with an ‘Infrastructure as code’ model

  • Composable and flexible
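
As an illustration of the ‘desired state’ idea, the sketch below declares what should exist on a node rather than scripting the steps to create it. The path /etc/example.d, the file example.conf, and the setting max_jobs are hypothetical, chosen only for illustration. On a second run both tasks should report no change, because the declared state is already satisfied.

```yaml
---
# Sketch only: paths and values are hypothetical, not taken from a real cluster.
- name: Declare a desired state instead of scripting steps
  hosts: all
  become: true

  tasks:
    - name: Ensure the directory exists with the given ownership and mode
      ansible.builtin.file:
        path: /etc/example.d
        state: directory
        owner: root
        group: root
        mode: "0755"

    - name: Ensure the setting is present exactly once
      ansible.builtin.lineinfile:
        path: /etc/example.d/example.conf
        line: "max_jobs = 10"
        create: true        # create the file if it does not exist yet
        mode: "0644"
```

Because the playbook is plain text, it can be kept under revision control like any other code.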

14.6. Benefits of configuration management


  • Centralized catalog of all system configuration

  • Automated enforcement of system state from an authoritative source

  • Ensured consistency between systems

  • Rapid system provisioning from easily-composed components

  • Preflight tests to ensure deployments generate expected results

  • Collection of system ‘ground truths’ for better decision making

14.7. Modern configuration-management systems


  • Puppet

    • Ruby based

  • Chef Infra

    • Ruby based

  • CFEngine

    • C based

  • Salt

    • Python based

  • Ansible

    • Python based

14.8. How Ansible works


  • Ansible connects to compute nodes via ssh as a regular user.

    • Needs either an ssh key pair for the user running Ansible, or host-based authentication configured.

  • Forks several instances to ssh concurrently to multiple nodes.

  • Elevates to root privileges via sudo.

    • The user running Ansible needs sudo privileges on the nodes.

  • Runs configuration/management tasks via python modules.

    • Needs python3 on the managed nodes; Ansible itself (ansible-core) needs to be installed only on the control node.

  • The tasks are defined in Ansible playbooks (yaml files).

    • The admin needs to understand yaml syntax for ansible tasks.

  • Leaves the system and its configuration untouched if they are already in the desired target state.
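
The connection and privilege model described above can be sketched as a small playbook. This is a minimal illustration, assuming the regular login user is hostadm (the remote_user set in this chapter's ansible.cfg) and that this user may run sudo on the nodes; the registered variable name whoami_result is arbitrary.

```yaml
---
# Sketch only: assumes ssh login as hostadm and sudo rights for that user.
- name: Show the connection and privilege model
  hosts: all
  remote_user: hostadm      # regular user used for the ssh login
  become: true              # escalate privileges for the tasks
  become_method: sudo

  tasks:
    - name: Check that the ssh login and the remote python3 work
      ansible.builtin.ping:

    - name: Report the effective user after privilege escalation
      ansible.builtin.command: whoami
      register: whoami_result
      changed_when: false   # read-only command, never counts as a change

    - name: Print the effective user
      ansible.builtin.debug:
        var: whoami_result.stdout
```

The same settings can instead be placed once in ansible.cfg, in which case the play-level keywords are unnecessary.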

14.9. A simple Ansible setup example


  • Ansible file structure for host name change on the nodes. All the files are under:

Ansible/
        |-- ansible.cfg
        |-- hosts.ini
        |-- fix_hostname_tasks.yml
        \-- files

14.10. The main config file, ansible.cfg on our cluster



[defaults]
inventory = hosts.ini
remote_user = hostadm
host_key_checking = false
remote_tmp = /tmp/.ansible/tmp
interpreter_python = /bin/python3
forks = 1

[privilege_escalation]
become = true
become_method = sudo
become_user = root
become_ask_pass = false

14.11. Inventory (hosts) file hosts.ini



[ceph_nodes]
node03
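
Ansible also accepts inventories written in YAML. As a hedged sketch, the same group could be expressed as follows; the file name hosts.yml is hypothetical, and using it would require pointing the inventory setting in ansible.cfg at that file instead of hosts.ini.

```yaml
# hosts.yml (hypothetical file name): same content as the INI inventory
all:
  children:
    ceph_nodes:
      hosts:
        node03:
```

The INI form is shorter for a flat list of hosts; the YAML form tends to be easier to read once groups are nested or carry variables.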

14.12. Playbook fix_hostname_tasks.yml


  • The file can have any name.

  • The extension is .yml or .yaml.

  • The syntax is YAML.

  • All the work is done by the tasks:

---
- name: fix hostname
  hosts: ceph_nodes

  tasks:

    - name: fix the hostname
      ansible.builtin.hostname:
        name: "{{ inventory_hostname }}"

    - name: fix hosts file
      ansible.builtin.copy:
        src: hosts
        dest: /etc/hosts

14.13. Ansible organization: playbook, play, role, and task.



  • A playbook is a YAML file containing one or more plays.

  • Plays are associated with groups of hosts in the inventory.

  • Roles contain collections of reusable tasks.

  • Tasks perform all the work by utilizing modules.
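
How these pieces nest can be sketched in a single file. The playbook name site.yml and the role name common are hypothetical; the role would have to exist under roles/common/tasks/main.yml for the first play to run.

```yaml
---
# site.yml (hypothetical name): one playbook containing two plays
- name: Play 1, applied to the ceph_nodes group
  hosts: ceph_nodes
  roles:
    - common        # hypothetical role, expected at roles/common/tasks/main.yml

- name: Play 2, applied to every host in the inventory
  hosts: all
  become: true
  tasks:
    - name: Ensure sshd is running and enabled at boot
      ansible.builtin.service:    # the task does its work through a module
        name: sshd
        state: started
        enabled: true
```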

14.14. Ansible modules


  • The modules are used by tasks to do work.

  • The most commonly used modules:

    • copy files: ansible.builtin.copy

    • set file attributes: ansible.builtin.file

    • install packages: ansible.builtin.dnf and ansible.builtin.apt

    • execute shell commands: ansible.builtin.shell

    • restart a service: ansible.builtin.service

    • modify a config file: ansible.builtin.lineinfile

    To see all modules installed on your system:

    ansible-doc -l
    

    Read the info on a specific module, for example, ansible.builtin.file:

    ansible-doc   ansible.builtin.file
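
A minimal sketch of how several of these modules might be combined in one play. It assumes a dnf-based distribution; the chrony package, the motd file in the files directory, the /opt/scratch directory, and the makestep value are examples chosen for illustration, not part of the cluster setup described here.

```yaml
---
# Sketch only: package, file names, and values are illustrative assumptions.
- name: Typical use of the commonly used modules
  hosts: all
  become: true

  tasks:
    - name: Install a package (use ansible.builtin.apt on Debian or Ubuntu)
      ansible.builtin.dnf:
        name: chrony
        state: present

    - name: Copy a file from the files directory to the node
      ansible.builtin.copy:
        src: motd
        dest: /etc/motd
        owner: root
        group: root
        mode: "0644"

    - name: Ensure a directory exists with specific attributes
      ansible.builtin.file:
        path: /opt/scratch
        state: directory
        mode: "1777"

    - name: Modify one line of a config file
      ansible.builtin.lineinfile:
        path: /etc/chrony.conf
        regexp: '^makestep'
        line: makestep 1.0 3
      notify: Restart chronyd     # runs the handler only if the line changed

  handlers:
    - name: Restart chronyd
      ansible.builtin.service:
        name: chronyd
        state: restarted
```

Using a handler means the service is restarted only when the configuration actually changed, which keeps repeated runs free of side effects.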
    

14.15. Examples of utilizing ansible.builtin.shell module for a remote command



  • Use module ansible.builtin.shell for a remote command.

  • For example, check the status of sshd on all the nodes:

ansible all -m ansible.builtin.shell -a "systemctl status sshd"

  • Restart slurmd on compute1:

ansible compute1 -m ansible.builtin.shell -a "systemctl restart slurmd"
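
The same two operations can also be written as a playbook, which is easier to keep under revision control than one-off commands. This is a sketch, assuming compute1 is defined in the inventory; the variable name sshd_status is arbitrary. The restart uses ansible.builtin.service in place of a shell command, which is a substitution made here for illustration.

```yaml
---
# Sketch only: playbook form of the two ad-hoc commands.
- name: Check sshd on all nodes
  hosts: all
  tasks:
    - name: Query the sshd status
      ansible.builtin.shell: systemctl status sshd
      register: sshd_status
      changed_when: false    # a status query never changes the node
      failed_when: false     # systemctl returns non-zero if the unit is not active

    - name: Show the command output
      ansible.builtin.debug:
        var: sshd_status.stdout_lines

- name: Restart slurmd on compute1
  hosts: compute1
  become: true
  tasks:
    - name: Restart the slurmd service
      ansible.builtin.service:
        name: slurmd
        state: restarted
```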

14.16. Ansible playbook development steps



  • Set up the configuration files: ansible.cfg and hosts.ini.

  • Identify groups of hosts to execute identical tasks on (plays)

  • Define the top-level playbook tasks (roles).

  • Add features you need in new yaml files.

  • Place configuration files and packages in the files directories for the roles.

  • Define variables (parameters) and put them in a variables file.

  • Tag the tasks for debugging purposes.

  • Check the playbooks for syntax errors:

ansible-playbook playbook.yml --syntax-check

  • Perform a dry run:

ansible-playbook playbook.yml --check

  • List the available tags:

ansible-playbook --list-tags playbook.yml

  • Run only the tasks tagged OSD, for example:

ansible-playbook --tags OSD playbook.yml

  • Run the playbook:

ansible-playbook playbook.yml
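
The variable and tag steps above might look like the following in playbook.yml. This is a sketch under stated assumptions: the variable names osd_scratch_dir and time_package, their values, and the tag packages are hypothetical; only the tag OSD and the group ceph_nodes come from the text.

```yaml
---
# Sketch only: variable names and values are hypothetical.
- name: Tagged tasks driven by variables
  hosts: ceph_nodes
  become: true

  vars:
    osd_scratch_dir: /srv/example_osd
    time_package: chrony

  tasks:
    - name: Ensure the scratch directory exists
      ansible.builtin.file:
        path: "{{ osd_scratch_dir }}"
        state: directory
        mode: "0755"
      tags:
        - OSD

    - name: Ensure the time service package is installed
      ansible.builtin.dnf:
        name: "{{ time_package }}"
        state: present
      tags:
        - packages
```

With --tags OSD only the first task runs. The vars block could equally be moved into a separate variables file and loaded with vars_files.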

14.18. Questions and discussion: