Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
220 changes: 220 additions & 0 deletions technotes/templates.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,220 @@
# Cluster templates
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these template docs include instructions for actually creating cluster request with that new template? Like here: https://gist.github.com/larsks/d90063a98cb7733a5997058d768eed48#creating-a-cluster. Or will that be located in different section of the docs?


## What is a cluster template?

A cluster template is an [Ansible role] that is responsible for installing and configuring an OpenShift cluster. Cluster templates are distributed in Ansible [collections]. A collection may contain multiple templates.

[collections]: https://docs.ansible.com/ansible/latest/dev_guide/developing_collections.html
[ansible role]: https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_reuse_roles.html

## How to create a cluster template

### Create a collection

If you don't already have a collection for distributing your cluster templates, you can create a new one using the `ansible-galaxy collection init` command. Collection names consist of two parts, a namespace (often a company name) and a name, so if we wanted to create a new template named `redhat.rhoai`, we would run:

```
ansible-galaxy collection init redhat.rhoai
```

That creates a directory tree that looks like:

```
redhat/
└── rhoai
├── docs
├── galaxy.yml
├── meta
│   └── runtime.yml
├── plugins
│   └── README.md
├── README.md
└── roles
```

### Create a role

A cluster template role consists of at least two playbooks:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't delete.yaml and install.yaml also required?


- `tasks/main.yaml` -- this playbook runs with privileges on the management cluster. It is responsible for creating the cluster when `cluster_template_state` is `present` and for deleting the cluster when `cluster_template_state` is `absent`.

- `tasks/post-install.yaml` -- this playbook runs with privileges on the tenant cluster. It is responsible for performing any software installation and configuration tasks on the tenant cluster.

A template role also contains two metadata files:

- `meta/argument_specs.yaml` -- this file describes all the parameters accepted by the role.

- `meta/cloudkit.yaml` -- this file contains metadata such as a friendly title and a description.

### Writing argument_specs.yaml

The syntax of the `argument_spec.yaml` file is fully documented in the [Ansible documentation][argspec]. Your role will need to accept a `cluster_order` parameter with the values from the cluster request, and a `template_parameters` parameter with any values required by your role (or by roles to which you call out). A basic example looks like this:

```yaml
argument_specs:
main:
options:
cluster_order:
type: dict
required: true
cluster_working_namespace:
type: str
required: true
node_requests:
type: list
elements: dict
options:
numberOfNodes:
type: int
required: true
resourceClass:
type: str
required: true
choices:
- fc430
template_parameters:
type: dict
options:
pull_secret:
description: >
The pull secret contains credentials for authenticating to image repositories.
type: str
required: true
ssh_public_key:
description: >
A public ssh key that will be installed into the `authorized_keys` file
of the `core` user on cluster worker nodes.
type: str
```

[argspec]: https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_reuse_roles.html#role-argument-validation

### Writing cloudkit.yaml

The `cloudkit.yaml` file contains metadata about your template role that isn't part of the standard Ansible metadata. Here is a sample `cloudkit.yaml`:

```yaml
# Define the display name and description of the template
title: Simple OpenShift 4.17 Cluster
description: >
This template will build a minimally configured OpenShift 4.17 cluster.

# This sets the nodeRequest field in the ClusterOrder manifest. There must be at
# least one entry in this list.
default_node_request:
- resourceClass: fc430
numberOfNodes: 2

# This declares that for a cluster built from this template, only the resource
# classes listed here may be used when requesting additional nodes. If this
# list is empty or unspecified, a ClusterOrder may request nodes from any
# available resource class.
allowed_resource_classes:
- fc430
```

### Example main.yaml

In most cases, your `main.yaml` will simply include roles distributed as part of the OSAC installer. A typical example would be:

```
- name: Create a new instance of this template
when: cluster_template_state == 'present'
ansible.builtin.import_role:
name: cloudkit.templates.ocp_4_17_small

- name: Destroy an instance of this template
when: cluster_template_state == 'absent'
ansible.builtin.import_role:
name: cloudkit.templates.ocp_4_17_small
```

### Example post-install.yaml

The `post-install.yaml` task list runs with credentials for the tenant cluster and will typically consist of many calls to the [`kubernetes.core.k8s`](https://docs.ansible.com/ansible/latest/collections/kubernetes/core/k8s_module.html) module:

```
- name: Create grafana namespace
kubernetes.core.k8s:
state: present
definition:
apiVersion: v1
kind: Namespace
metadata:
name: grafana
```

You can leverage [Kustomize] to deploy resources by using the [`kubernetes.core.kustomize`](https://docs.ansible.com/ansible/latest/collections/kubernetes/core/kustomize_lookup.html) lookup:

```
- name: Install Operators
kubernetes.core.k8s:
state: present
definition: "{{ lookup('kubernetes.core.kustomize', dir = kustomize_dir ~ '/operators') }}"
validate_certs: false
```

## Using a cluster template

There are two steps to making your collection available to the OSAC installer:

1. Build a new [execution environment] that includes your collection.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think this step should include the steps for building execution environment from github actions. Building from the github actions that was already constructed, to me, was the most straightforward way to build the execution environment. Mostly because it handles building AND pushing the image.

2. Configure the `cloudkit_template_collections` variable to include the name of your collection

[execution environment]: https://docs.ansible.com/ansible/latest/getting_started_ee/index.html

### Build a new execution environment

1. Check out the `cloudkit-aap` repository:

```
git clone https://github.com/innabox/cloudkit-aap && cd cloudkit-aap
```

2. Edit `collections/requirements.yml` to include your new collection. If your collection has been published to Ansible Galaxy, you can use the fully qualified collection name:

```
- name: redhat.rhoai
version: 1.0.0
```

If your collection is hosted in a Git repository, provide the URL to the repository:

```
- source: https://github.com/innabox/redhat-rhoai-collection.git
type: git
version: 1.0.0
```

If you are not tagging versions in your repository, you can use a commit hash instead of a tag.

3. Build a new execution environment:

```
cd execution-environment
ansible-builder build --tag cloudkit-aap-ee
```

4. Publish the container image to a container registry:

```
podman push cloudkit-aap-ee quay.io/myusername/cloudkit-aap-ee:latest
```

### Configure the installer to use the new image
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part isn't up to date, it should look like this now:


secretGenerator:
  - name: config-as-code-ig
    options:
      disableNameSuffixHash: true
    literals:
      - AAP_EE_IMAGE=ghcr.io/isaiahstapleton/cloudkit-aap:678b9a1514ee800b0429569cbec30a928b167f7e

configMapGenerator:
  - name: publish-templates-ig
    options:
      disableNameSuffixHash: true
    literals:
      - CLOUDKIT_TEMPLATE_COLLECTIONS=cloudkit.templates,isaiahstapleton.ai_cluster_ansible



Configure AAP to use the new image as the execution environment and to look for templates in your Ansible collection.. You will need to set the `AAP_EE_IMAGE` key in the `config-as-code-ig` Secret to the fully qualified image name. For example, if you are deploying using the installer repository, you could add the following to your `kustomization.yaml`:

```
secretGenerator:
- name: config-as-code-ig
options:
disableNameSuffixHash: true
literals:
- AAP_EE_IMAGE=quay.io/myname/cloudkit-aap-ee@sha256:ddc3dec0315de244c14df9d2ee11cc242a60dabd0b0b43b2c48fd1a0de2e61da
- CLOUDKIT_TEMPLATE_COLLECTIONS=cloudkit.templates,redhat.rhoai
```

Now, re-deploy the installation manifests.