Agent Provisioning

Facts

Git Repo: https://github.com/erigrid2/riasc-provisioning
State: implementation finished

Introduction

The provisioning component is responsible for adding new nodes to the RIasC cluster quickly and with minimal effort. It does so by generating preconfigured operating system images (e.g. Raspberry Pi SD card images) for various environments.

New nodes can be provisioned automatically by booting them with such an image. Users only provide a minimal configuration, such as IP addresses, host names or access tokens for registration with a RIasC cluster.

Employed technologies

Architecture

The following figure shows the flow for generating and booting a pre-generated SD card image for Raspberry Pis.

User Roles

We distinguish two groups of users or roles within a RIasC cluster:

Cluster operator: The cluster operator oversees the administrative tasks of setting up a RIasC cluster and its central components (K3s server, Ansible repository, access control).
Lab provider: The lab provider participates in the RIasC cluster by providing one or more agent nodes which join the cluster.

Provisioning Process

There are three steps required for provisioning new nodes:

Generation of a new system image

This step is usually performed only once by the cluster operator. It generates the new system image by embedding a special script named riasc-update.sh, which is executed on every system boot.

Configuration

Once the system image has been generated by the cluster operator and distributed to the lab providers, each lab provider customizes the image with a few configuration settings, such as the agent host name, IP address or access tokens required for joining the cluster.

These settings are provided in a YAML configuration file. In the case of a Raspberry Pi agent, this configuration resides on the FAT32-formatted boot partition and is therefore easily editable by Windows users as well.
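Before the first boot, it can help to check that the edited settings are complete. The following is a minimal sketch, assuming the YAML file has already been parsed into a dictionary; the key names `hostname`, `ip_address` and `token` are hypothetical and chosen only for illustration, so the actual schema used by the provisioning image may differ.

```python
import ipaddress

# Hypothetical key names; the real configuration schema may differ.
REQUIRED_KEYS = ("hostname", "ip_address", "token")


def validate_config(config: dict) -> list:
    """Return a list of problems found; an empty list means the config looks usable."""
    problems = []

    # Every required setting must be present and non-empty.
    for key in REQUIRED_KEYS:
        if not str(config.get(key, "")).strip():
            problems.append(f"missing or empty setting: {key}")

    # If an address was given, it must parse as IPv4 or IPv6.
    address = config.get("ip_address")
    if address:
        try:
            ipaddress.ip_address(str(address))
        except ValueError:
            problems.append(f"not a valid IP address: {address}")

    return problems
```

A lab provider could run such a check on the parsed file and fix every reported problem before inserting the SD card into the agent.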

First boot

Once the configuration file has been edited, the lab provider can boot the new agent for the first time. During this initial boot and all subsequent boots of the node, a shell script named riasc-update.sh is executed. This script performs some initial checks, installs required software packages and carries out basic configuration tasks such as setting the system host name.

All further configuration is then handled by Ansible. Ansible is an open-source software provisioning, configuration management and application deployment tool enabling infrastructure as code. It runs on many Unix-like systems and can configure both Unix-like systems and Microsoft Windows. It includes its own declarative language to describe system configuration.

The riasc-update.sh script uses the ansible-pull command to fetch the system configuration from a remote Git repository. Once fetched, the system configuration is applied and the new agent node joins the K3s Kubernetes cluster.
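To make the pull step more concrete, the sketch below assembles an ansible-pull command line as an argument list. It is not taken from riasc-update.sh: the repository URL, branch and playbook name are placeholders, and only the ansible-pull options `--url`, `--checkout` and `--verify-commit` are used.

```python
def build_ansible_pull_command(repo_url, playbook="local.yml",
                               branch="master", verify_commit=True):
    """Build the argument list for an ansible-pull run (illustrative only)."""
    command = [
        "ansible-pull",
        "--url", repo_url,      # Git repository holding the playbooks
        "--checkout", branch,   # branch, tag or commit to check out
    ]
    if verify_commit:
        # Abort the run if the checked-out commit has no valid GPG signature.
        command.append("--verify-commit")
    command.append(playbook)    # playbook to apply after the checkout
    return command


# Placeholder URL; the real repository is configured by the cluster operator.
print(build_ansible_pull_command("https://example.com/ansible-config.git"))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a node that has Ansible and Git installed.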

Subsequent boots

Once the initial boot has finished, the provisioning process is complete. The cluster operator can now apply further updates to the agent by updating the Ansible repository and triggering a reboot of the node. The reboot causes the riasc-update.sh script to run again and reapply the configuration, which may have changed in the meantime.

Implementation details

Security considerations

Embedded devices in a securely guarded lab environment that automatically execute code fetched from a remote repository are undoubtedly a security risk that needs to be mitigated. Hence, the ansible-pull command checks the fetched Git commit for an embedded PGP signature, which is verified against a static whitelist of PGP keys belonging to the cluster operator.
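The idea of a key whitelist can be illustrated independently of Ansible. The sketch below is an assumption-laden example, not the project's mechanism: it parses GnuPG status output, such as the text `git verify-commit --raw` writes to standard error, extracts the fingerprint from the `VALIDSIG` line and compares it with a list of allowed fingerprints. The function names and the sample fingerprint are hypothetical.

```python
def signer_fingerprint(gpg_status):
    """Return the fingerprint from a GnuPG VALIDSIG status line, or None."""
    for line in gpg_status.splitlines():
        parts = line.split()
        # VALIDSIG is only emitted for signatures that verified successfully.
        # Its first field is the fingerprint of the key that made the
        # signature, which may be a subkey rather than the primary key.
        if len(parts) >= 3 and parts[0] == "[GNUPG:]" and parts[1] == "VALIDSIG":
            return parts[2].upper()
    return None


def is_trusted_commit(gpg_status, allowed_fingerprints):
    """True only if a valid signature was made by a whitelisted key."""
    allowed = {fp.replace(" ", "").upper() for fp in allowed_fingerprints}
    fingerprint = signer_fingerprint(gpg_status)
    return fingerprint is not None and fingerprint in allowed
```

Rejecting the commit whenever no `VALIDSIG` line is present keeps the check fail-closed: an unsigned commit, or one with a bad signature, is never applied.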
