Add SDD for Commodore Compile Pipeline #181

Merged
merged 11 commits into from
Jun 21, 2024
207 changes: 207 additions & 0 deletions docs/modules/SDDs/pages/0032-compile-pipeline.adoc
@@ -0,0 +1,207 @@
= SDD 0032 - Commodore Compile Pipeline

:sdd_author: Aline Abler
:sdd_owner: Project Syn IG
:sdd_reviewers: Simon Gerber
:sdd_date: 2024-06-13
:sdd_status: draft

include::partial$meta-info-table.adoc[]

[NOTE]
.Summary
====
This describes how we want to implement CI/CD for Project Syn using a Commodore compile pipeline, which enables automatic compilation of cluster catalogs whenever the corresponding tenant repository is modified.

Furthermore, it explains how we want to extend Lieutenant so that it can automatically configure the compile pipeline on tenant repositories.
====

== Motivation

Having a continuous integration solution unlocks a number of benefits:
It solves the problem of configuration drift, where changes to the tenant repository are not reflected in every cluster catalog because some catalogs have not been recompiled since the change.
It also lessens the burden on catalog maintainers, who would otherwise need to compile each cluster locally.

It is already fairly straightforward to manually set up basic auto-compilation for individual tenant repositories without special support from Project Syn itself.
At VSHN, such a solution has been in use for several years.

However, certain features (such as automatic configuration of the compile pipeline) are hard to implement in a standalone fashion.
As such features are now desired, it makes sense to fully integrate the compile pipeline into Project Syn.

By making Project Syn "CI-aware", we can implement more seamless management of compile pipeline configuration on the tenant repositories, including automated setup and automated token rotation.
This will go hand-in-hand with the existing repository management features in Lieutenant.

=== Goals

* Provide pipeline definitions for GitLab CI/CD to automatically compile and push cluster catalogs on a tenant repository.
* Enable Lieutenant to autonomously manage the configuration required to set up the compile pipeline for a cluster catalog.

=== Non-Goals

* Support for CI solutions other than GitLab CI/CD.

== Design Proposal

=== Requirements for Pipeline Configuration

Lieutenant makes certain assumptions about how the pipeline is configured:
the pipeline is set up on the tenant repository by adding (arbitrary) files to the repository, and it is configured by setting CI/CD variables on the repository.

In particular, Lieutenant configures the following CI/CD variables:

* `ACCESS_TOKEN_CLUSTERNAME`, where `CLUSTERNAME` is the name of a specific cluster, with `-` replaced by `_`.
This contains a GitLab Project Access Token, which must have read-write access to the corresponding cluster's catalog repository.
* `COMMODORE_API_URL`. This contains the URL at which the Lieutenant API can be accessed.
* `COMMODORE_API_TOKEN`. This contains an access token for the Lieutenant API.
* `CLUSTERS`. This contains a space-separated list of cluster IDs which should be compiled and pushed automatically.
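
The naming rule for the per-cluster token variable and the format of `CLUSTERS` can be illustrated with a short Python sketch.
This is not code from Lieutenant or from the pipeline definitions; the helper names `access_token_variable` and `catalog_tokens` are hypothetical, and failing on a missing token variable is an assumption.

[source,python]
----
from typing import Dict, Mapping


def access_token_variable(cluster_id: str) -> str:
    """Name of the CI/CD variable holding the catalog token for one cluster."""
    # Only `-` is replaced by `_`; the case of the cluster ID is left untouched.
    return "ACCESS_TOKEN_" + cluster_id.replace("-", "_")


def catalog_tokens(env: Mapping[str, str]) -> Dict[str, str]:
    """Map each cluster ID listed in CLUSTERS to its catalog access token."""
    # CLUSTERS is a space-separated list of cluster IDs.
    cluster_ids = env.get("CLUSTERS", "").split()
    # Assumption: a missing token variable is an error (raises KeyError).
    return {cid: env[access_token_variable(cid)] for cid in cluster_ids}
----

A pipeline job could call `catalog_tokens(os.environ)` to find out which clusters to compile and which token to use when pushing each catalog.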

=== GitRepo CRD

We add two new fields to the `GitRepoTemplate` (and, by extension, the `GitRepo`) CRD, under the `.spec` key, called `accessTokenSecretName` and `ciVariables`.

The `accessTokenSecretName` field contains a reference to a secret.
If it is set, the Lieutenant operator will store an access token into this secret, which can be used to access the Git repository.
In the case of GitLab, this would be a Project Access Token with read-write access to the repository.

The `ciVariables` field contains a list of entries, each specifying a variable name and either a literal value or a reference to a secret key that holds the value.
These variables are added to the git repository as CI/CD variables.

[source,yaml]
----
apiVersion: syn.tools/v1alpha1
kind: GitRepo
metadata:
  name: my-repo
spec:
  accessTokenSecretName: my-repo-access-token
  ciVariables:
    - name: COMMODORE_API_URL
      value: ...
    - name: COMMODORE_API_TOKEN
      valueFrom:
        secretKeyRef:
          name: api-token-secret
          key: token
----

=== Cluster CRD

We add a new field to the `Cluster` CRD, under the `.spec` key, called `enableCompilePipeline`.

The field contains a boolean flag, which controls whether the compile pipeline should be enabled or disabled for this cluster.

It is optional; not specifying it is equivalent to setting it to `false`.

[source,yaml]
----
apiVersion: syn.tools/v1alpha1
kind: Cluster
metadata:
  name: c-my-cluster
spec:
  enableCompilePipeline: true
----

=== Tenant CRD

We add a new field to the `Tenant` CRD, under the `.spec` key, called `compilePipeline`.

The `compilePipeline` field contains configuration pertaining to the automatic setup of the compile pipeline on the tenant repository.
It is optional.
Absence of the field disables automatic setup and management of the compile pipeline.

The `compilePipeline` field contains a dict with the following fields:

* `clusters`: List of cluster IDs of clusters for which the compile pipeline should be executed.
This field is managed by the operator.

Review thread on the `clusters` field:

Member: Would it make sense to move this to status.compilePipeline now that it's intended to be fully operator-managed?

Contributor Author: Since we're going back to having the pipelineFiles, it doesn't make sense to fully move this into status anymore. But we might want to split it into configurable vs. operator managed fields and only put the latter in status. I'm not sure what's better practice - keep them together, or have a strict split between managed and user-defined?

Member (@simu, Jun 18, 2024): I'm not sure what the best practice is here, maybe @bastjan can chime in?

Contributor: I prefer fully operator managed fields in .status it removes a source of misunderstandings. Not a super firm position since Kubernetes also fully manages a lot of fields in .spec in the core manifests.

Contributor Author: I've split it up now. This prompted me to also add an enabled field in the spec.


[source,yaml]
----
apiVersion: syn.tools/v1alpha1
kind: Tenant
metadata:
  name: t-my-tenant
spec:
  compilePipeline:
    clusters:
      - c-my-cluster
----

=== In-Repo CI/CD pipeline configuration

Configuring the CI pipeline usually happens through files committed to the corresponding repository.
For a Lieutenant-managed pipeline configuration, these files should be managed by Lieutenant.
To achieve this, we can leverage the existing mechanism to commit template files to git repositories:

[source,yaml]
----
apiVersion: syn.tools/v1alpha1
kind: Tenant
metadata:
  name: t-my-tenant
spec:
  gitRepoTemplate:
    templateFiles:
      .gitlab-ci.yml: |
        include:
          - project: syn/commodore-compile-pipeline
            ref: master
            file: /.gitlab/commodore-common.yml
----


=== Operator

The Lieutenant Operator will be extended to automatically manage the compile pipeline for repositories where this is enabled (by way of configuring the `compilePipeline` field on the tenant and the `enableCompilePipeline` field on the cluster).

Since the compile pipeline has to interact with both the tenant repository and the cluster catalog repositories, it must be enabled on both corresponding resources for the configuration to be functional.
This way, it is possible to enable auto-compilation for some, but not all clusters on a tenant.
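
As an illustration of this rule, the check could be sketched in Python as follows.
This is a hypothetical helper, not code from the Lieutenant operator; `compile_pipeline_active` and the plain-dict representation of the two `.spec` sections are assumptions made for the example.

[source,python]
----
from typing import Any, Mapping


def compile_pipeline_active(
    tenant_spec: Mapping[str, Any], cluster_spec: Mapping[str, Any]
) -> bool:
    """True only if the pipeline is enabled on both the tenant and the cluster."""
    # Absence of `compilePipeline` on the tenant disables the pipeline.
    tenant_enabled = tenant_spec.get("compilePipeline") is not None
    # A missing `enableCompilePipeline` is equivalent to `false`.
    cluster_enabled = cluster_spec.get("enableCompilePipeline", False) is True
    return tenant_enabled and cluster_enabled
----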

The operator will reconcile *GitRepos* as follows:

* When `.spec.accessTokenSecretName` is set, the operator generates an access token for the corresponding repository (via the repository host's API, using the API secret in `.spec.apiSecretRef`), and writes this token into a secret with the given name.
In the case of GitLab, this is a Project Access Token.
The operator also runs a scheduled job which refreshes these tokens when they are close to expiring, or when they no longer exist on the repository host.
* The content of `.spec.ciVariables` is written to the repository's configuration on the git host.
In the case of GitLab, it is written as CI/CD variables.

NOTE: If the GitRepo is of type `unmanaged`, none of these steps will be executed.
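
The refresh decision described above can be sketched in Python.
This is only an illustration under stated assumptions: the SDD does not define what "close to expiring" means, so the 7-day `REFRESH_THRESHOLD` is a placeholder, and the function name and signature are hypothetical.
A token that no longer exists on the repository host is modelled as `expires_at=None`.

[source,python]
----
from datetime import datetime, timedelta
from typing import Optional

# Assumption: placeholder threshold, not a value taken from the SDD.
REFRESH_THRESHOLD = timedelta(days=7)


def token_needs_refresh(
    repo_type: str,
    expires_at: Optional[datetime],
    now: datetime,
    threshold: timedelta = REFRESH_THRESHOLD,
) -> bool:
    """Decide whether the scheduled job should issue a new access token."""
    # GitRepos of type `unmanaged` are never touched.
    if repo_type == "unmanaged":
        return False
    # The token no longer exists on the repository host.
    if expires_at is None:
        return True
    # The token is close to expiring (or already expired).
    return expires_at - now <= threshold
----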

The operator will reconcile *Clusters* as follows:

* When `.spec.enableCompilePipeline` is set to `true`, the tenant's `.spec.compilePipeline.clusters` is updated to contain the cluster ID.
* Similarly, when the field is set to `false` or missing, the tenant's `.spec.compilePipeline.clusters` is updated to no longer contain the cluster ID.
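
The two rules above amount to an idempotent add-or-remove on the tenant's cluster list, which could be sketched in Python as follows.
The function name `reconcile_pipeline_clusters` is hypothetical, and preserving the existing order of the list is an assumption; the actual operator may behave differently in such details.

[source,python]
----
from typing import List, Optional


def reconcile_pipeline_clusters(
    clusters: List[str], cluster_id: str, enable_compile_pipeline: Optional[bool]
) -> List[str]:
    """Return the tenant's updated cluster list after reconciling one cluster."""
    if enable_compile_pipeline:
        # Add the cluster ID once; reconciling again changes nothing.
        if cluster_id in clusters:
            return list(clusters)
        return list(clusters) + [cluster_id]
    # `false` or missing: make sure the cluster ID is not listed.
    return [c for c in clusters if c != cluster_id]
----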

The operator will reconcile *Tenants* as follows:

* When `.spec.compilePipeline` exists and isn't empty, the following entries are added to the tenant repository GitRepo's `.spec.ciVariables`:
** `COMMODORE_API_URL`, containing the URL at which the Lieutenant API can be accessed.
** `COMMODORE_API_TOKEN`, containing a reference to the secret which contains the tenant's access token for the Lieutenant API.
** `CLUSTERS`, containing a space-separated list of cluster IDs taken directly from `.spec.compilePipeline.clusters`.
* For each entry in `.spec.compilePipeline.clusters`, another entry is added to the tenant repository GitRepo's `.spec.ciVariables`.
The key is `ACCESS_TOKEN_CLUSTERNAME`, where `CLUSTERNAME` is the ID of a specific cluster, with `-` replaced by `_`.
The value is a reference to the secret containing the access token for that cluster's catalog repository, that is, the secret named in the catalog GitRepo's `.spec.accessTokenSecretName`.
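
Put together, the entries the operator derives for the tenant repository could look like the output of the following Python sketch.
It mirrors the `name` / `value` / `valueFrom.secretKeyRef` shape of the GitRepo example, but the function names, the parameters, and the secret key `token` are assumptions for illustration only.

[source,python]
----
from typing import Any, Dict, List, Mapping, Optional


def secret_ref(secret_name: str, key: str = "token") -> Dict[str, Any]:
    """Build a `valueFrom` reference; the key name `token` is an assumption."""
    return {"valueFrom": {"secretKeyRef": {"name": secret_name, "key": key}}}


def tenant_ci_variables(
    compile_pipeline: Optional[Mapping[str, Any]],
    api_url: str,
    api_token_secret: str,
    catalog_token_secrets: Mapping[str, str],
) -> List[Dict[str, Any]]:
    """Derive the ciVariables entries for the tenant repository's GitRepo."""
    # Nothing is added when `.spec.compilePipeline` is missing or empty.
    if not compile_pipeline:
        return []
    clusters = list(compile_pipeline.get("clusters", []))
    variables: List[Dict[str, Any]] = [
        {"name": "COMMODORE_API_URL", "value": api_url},
        {"name": "COMMODORE_API_TOKEN", **secret_ref(api_token_secret)},
        {"name": "CLUSTERS", "value": " ".join(clusters)},
    ]
    for cluster_id in clusters:
        name = "ACCESS_TOKEN_" + cluster_id.replace("-", "_")
        # `catalog_token_secrets` maps a cluster ID to the secret named in the
        # catalog GitRepo's `.spec.accessTokenSecretName`.
        # Assumption: a cluster without such a secret is an error (KeyError).
        variables.append({"name": name, **secret_ref(catalog_token_secrets[cluster_id])})
    return variables
----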

=== Implementation Details/Notes/Constraints

Currently, we're looking at a solution that is specific to GitLab CI/CD, and to tenant and catalog repositories that are stored on GitLab.
Other CI solutions and other repository hosts might be supported in the future.

Existing compile pipeline configuration::
If a setup already includes a number of tenant repositories with manually configured CI/CD, some care has to be taken to ensure the new implementation can "adopt" this configuration.
+
In particular, these repositories would already have a working `.gitlab-ci.yml` that probably can be left as-is.
+
Any existing manually created Project Access Tokens will be superseded by new auto-generated ones.
This leaves a number of now-unused tokens that need to be cleaned up.

External Catalog Repositories::
There may be cases where the catalog repositories are not hosted on the same repository host as the tenant repository, in which case API access for the purpose of creating Project Access Tokens is unavailable.
The Commodore Compile Pipeline can still be used against such catalog repositories by specifying an SSH key to access them.
+
This can still be configured manually, and the automated configuration would not interfere.

== References

* https://docs.gitlab.com/ee/ci/variables/[GitLab CI Variables]
* https://docs.gitlab.com/ee/user/project/settings/project_access_tokens.html[GitLab Project Access Tokens]
1 change: 1 addition & 0 deletions docs/modules/SDDs/pages/index.adoc
@@ -26,3 +26,4 @@ The all are using the xref:sdd-template.adoc[SDD Template].
* xref:0028-reusable-config-packages.adoc[0028 - Reusable Commodore Component Configuration Packages]
* xref:0030-argocd-multitenancy.adoc[0030 - Project Syn ArgoCD Multi-Tenant Support]
* xref:0031-component-version-tracking.adoc[0031 - Central Component Version tracking]
* xref:0032-compile-pipeline.adoc[0032 - Commodore Compile Pipeline]