RFC: Enable gitops usage on management clusters #2

Merged: 4 commits merged on Jul 12, 2021. Showing changes from 2 commits.
27 changes: 27 additions & 0 deletions 0002-gitops-management-clusters/0002-gitops-management-clusters.md
@@ -0,0 +1,27 @@
# RFC 0002 - Enable customers to use gitops in management clusters

We currently give customers access to the management clusters but do not support them in using this access effectively for GitOps-related management (e.g. for apps).

Context:
Some customers might be interested in using mainstream GitOps tooling to manage their apps across clusters and installations.

Review comment:

Main question for me:

What does it help our customers achieve? Is it, "Manage stacks of apps on fleets of clusters" and challenges related to that?

(If it is, then this is the thing Batman wants to help customers achieve. And I'd like to explore this opportunity to take this next step in that direction.)

Review comment (Contributor, Author):

I would say this is one approach which could achieve the goal of "Manage stacks of apps on fleets of clusters".

I would appreciate Batman's input & opinions here :)

Review comment:

@cokiengchiara Yes, this tooling would help customers manage the app CRs and the user values configmaps and secrets for their apps. With stacks of apps on fleets of clusters, that becomes more complex.

There is a lot of content out there on the benefits the GitOps approach provides. I don't want to repeat that but maybe https://www.gitops.tech/ helps as an overview / starting point.

Review comment (Member):

I agree with "Manage stacks (or fleets) of apps on fleets of clusters", and to be clear that means it is not just app CRs that get managed, but all kinds of YAML that a customer can and should create on the MCs. So I would generalize this a bit in two directions:

  1. it's not about apps, it's about the Management API (MAPI)
  2. it's not about gitops per se, it is about enabling tooling against the MAPI that needs "an agent" in the MC and cannot just externally work against the MAPI.


## Open Questions

### 1. General direction
1.1 We would like to allow our customers to choose their management tooling freely. Do we want to support some tools with e.g. a managed app first though?

Review comment:

We want to let customers choose their tooling freely but I think we need some validation so only a subset of apps can be installed in management clusters.

Starting with a managed app first makes sense to me, especially if it's a tool we choose to use ourselves.

Review comment (Contributor, Author):

I agree that managed apps make sense. It's also a question of how much freedom we give customers / how much we allow them to "fuck up".

Review comment (Member):

In the beginning we could limit this to a selection of apps (at the get-go just one) that you can install on your MC using our interface. Very controlled.

Later, it could be extended to nicely isolated namespaces in the MC where we can give somewhat more freedom. But even there, because of the limits of isolation, we need to be very careful, and we might never get there completely.

Review comment (Member):

Because it's not only about "fucking up": there might also be cases of actual multi-tenancy within an MC, where there's an untrusted org.


1.2 Do we own the gitops tooling or is it purely owned by the customer?

Review comment (Member):

I would say this could be similar to other managed apps: they might start at a low support level and move up if we feel confident. Some might never be managed.


### 2. Technical issues
2.1 What does the setup for an in-cluster agent look like? Currently customers will struggle to set up an in-cluster agent with appropriate permissions.

Review comment:

In the app CR the customer just needs to set .spec.kubeConfig.inCluster to true.
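
As a rough illustration of that setting, an App CR installing an agent into the management cluster itself might look like the sketch below. The app name, organization namespace, catalog, and version are hypothetical placeholders, not values taken from this RFC; only `.spec.kubeConfig.inCluster` is the field being discussed.

```yaml
# Sketch only: all names and versions below are hypothetical placeholders.
apiVersion: application.giantswarm.io/v1alpha1
kind: App
metadata:
  name: gitops-agent          # hypothetical app name
  namespace: org-example      # hypothetical organization namespace
spec:
  catalog: example-catalog    # hypothetical catalog
  name: gitops-agent          # hypothetical chart name
  namespace: gitops-system    # hypothetical target namespace for the agent
  version: 0.1.0              # placeholder version
  kubeConfig:
    inCluster: true           # deploy into the management cluster itself
```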

We should also make our kubeconfigs work with GitOps tooling. They don't currently work with Flux because of how we name the secrets we generate in cluster-operator.

Review comment (Contributor, Author):

Sorry this was bad phrasing on my part:
This more refers to the initial agent set-up (at least as long as we don't have a managed app). We will currently struggle to not give the agent full permissions to all namespaces, right?

Review comment:

> This more refers to the initial agent set-up (at least as long as we don't have a managed app). We will currently struggle to not give the agent full permissions to all namespaces, right?

Ah, got it. Yes, I think we'll need to restrict the permissions. Taking Flux as an example, AIUI there are multi-tenancy patterns with a system Flux for platform admins and then multiple Flux instances for teams.

My proposal would be we follow that approach and offer a managed app with the locked down setup. This is another argument for offering managed apps for this IMO.
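
To make "locked down" a bit more concrete, here is a minimal sketch using plain Kubernetes RBAC rather than any Flux-specific configuration. It assumes a per-team agent that may only manage App CRs, configmaps, and secrets inside a single organization namespace; the service account, role, and namespace names are hypothetical.

```yaml
# Sketch only: names and namespace are hypothetical; adjust the rules to
# whatever resources the agent is actually meant to manage.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gitops-agent
  namespace: org-example
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: gitops-agent
  namespace: org-example
rules:
  # App CRs in this namespace only
  - apiGroups: ["application.giantswarm.io"]
    resources: ["apps"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  # User values configmaps and secrets in this namespace only
  - apiGroups: [""]
    resources: ["configmaps", "secrets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: gitops-agent
  namespace: org-example
subjects:
  - kind: ServiceAccount
    name: gitops-agent
    namespace: org-example
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: gitops-agent
```

Using a namespaced `Role` and `RoleBinding` instead of a `ClusterRoleBinding` is what keeps the agent from getting full permissions to all namespaces.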


2.2 How do we ensure security requirements when customer interaction increases?

Review comment:

There is a multi-tenancy problem with chart CRs and their related configmaps and secrets.

These are currently stored in the giantswarm namespace. Before enabling customers to create apps in management clusters I think we should move this to their organization or cluster namespace.


2.3 Do we foresee issues when introducing gitops tooling to already existing resources?

Review comment (Member):

Depends on how invasive the tooling is: does it delete and recreate resources, or does it "adopt" them and apply/update?

Review comment (Member):

Yes! For each GitOps tool we need to check separately whether resources changed by happa or gsctl will be reverted to the state the GitOps tool knows.

Random idea: At least Flux applies labels to resources managed by it, so we could display (for example) clusters installed through the GitOps tool as read-only.
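
For reference, a resource applied by Flux's kustomize-controller typically carries labels like the ones in this fragment, which a UI could check before allowing edits. The label values are hypothetical, and other tools (or Flux's Helm-based workflows) mark resources differently, so this would need verifying per tool.

```yaml
# Sketch only: metadata fragment of a Flux-managed resource.
# Label keys are the ones set by Flux's kustomize-controller;
# the values are hypothetical.
metadata:
  labels:
    kustomize.toolkit.fluxcd.io/name: customer-clusters    # name of the owning Kustomization
    kustomize.toolkit.fluxcd.io/namespace: org-example     # namespace of the owning Kustomization
```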


### 3. Guidance
3.1 How do we support customers in making sensible decisions in terms of tooling choice with gitops?

Review comment (Member):

@piontec was proposing a SPIKE within balo to look at the alternatives and form a first opinion, as it feels like GS currently has no opinions here, or only poorly informed ones.


3.2 Do we aid customers with repository structure for their gitops approach?

Review comment (Member):

If this is something we believe should be used, we could, or rather should, accompany the offering with docs as well as tutorials or workshops that educate the users. This is really nice in the context of moving Room 1 customers to Room 2, for example.


3.3 Do we offer customers the option to give us shared access to their repository for additional review?

Review comment (Member):

Interesting question. For me it is especially interesting if we can find a mutual interest there, as we often already have between us and customers: we also hate waking up because of misconfiguration, so we have an incentive to help out and avoid bad config. Also interesting is how that would play into our new Solutions Engineering approaches. cc @giantswarm/se

Review comment (Contributor):

If we publish some guidelines (previous point), I think it is fine to check the existing or future repo structure and the overall design as part of SE consultancy work.