Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changed the operator to gracefully degrade when not on cluster-wide scope #916

Merged
merged 1 commit into from
Feb 26, 2020

Conversation

jpkrohling
Copy link
Contributor

@jpkrohling jpkrohling commented Feb 19, 2020

This is a first draft in getting the operator to work in a single namespace, without cluster roles. This supersedes #830 and fixes #791. It also fixes #905.

This PR also includes instrumentation of the reconciliation for deployments and namespaces, as I added them in order to better understand what needed fixing.

Signed-off-by: Juraci Paixão Kröhling juraci@kroehling.de

pkg/autodetect/main.go Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
@jpkrohling jpkrohling changed the title WIP - Changed the operator to gracefully degrade when not on cluster-wide scope Changed the operator to gracefully degrade when not on cluster-wide scope Feb 20, 2020
@jpkrohling jpkrohling force-pushed the StreamlinePermissions branch 2 times, most recently from e3d9307 to af1d9c6 Compare February 20, 2020 14:57
@codecov
Copy link

codecov bot commented Feb 20, 2020

Codecov Report

Merging #916 into master will decrease coverage by 0.02%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #916      +/-   ##
==========================================
- Coverage   64.04%   64.01%   -0.03%     
==========================================
  Files          79       79              
  Lines        6427     6423       -4     
==========================================
- Hits         4116     4112       -4     
  Misses       2173     2173              
  Partials      138      138
Impacted Files Coverage Δ
pkg/inject/sidecar.go 96.75% <0%> (-0.09%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d80bd24...b1d31e1. Read the comment docs.

@jpkrohling jpkrohling changed the title Changed the operator to gracefully degrade when not on cluster-wide scope WIP - Changed the operator to gracefully degrade when not on cluster-wide scope Feb 20, 2020
@jpkrohling
Copy link
Contributor Author

I'm marking this as WIP, as I want to confirm whether the ClusterRole as we have it here is compatible with what OLM expects.

@jpkrohling
Copy link
Contributor Author

Testing this, I found out something interesting that I kinda mentioned on #791 / #830 but wasn't able to find definitive documentation about it: the WATCH_NAMESPACE can be a comma-separated list of values:

Excerpt:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: jaeger-operator
  namespace: other-observability
spec:
  template:
    metadata:
      annotations:
        olm.operatorGroup: other-observability-ntvzv
        olm.operatorNamespace: other-observability
        olm.targetNamespaces: observability,other-observability
    spec:
      containers:
      - args:
        - start
        env:
        - name: WATCH_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.annotations['olm.targetNamespaces']

I couldn't find a way to do it via the web interface, but I can provision an operator via OLM's UI and then later manually change the operator group (kubectl edit operatorgroup other-observability-ntvzv) so that it looks like this:

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  annotations:
    olm.providedAPIs: Jaeger.v1.jaegertracing.io
  creationTimestamp: "2020-02-20T16:27:24Z"
  generateName: other-observability-
  generation: 2
  name: other-observability-ntvzv
  namespace: other-observability
  resourceVersion: "129022"
  selfLink: /apis/operators.coreos.com/v1/namespaces/other-observability/operatorgroups/other-observability-ntvzv
  uid: a05ae72f-e405-4728-b01c-45deb2387e9a
spec:
  targetNamespaces:
  - other-observability
  - observability
status:
  lastUpdated: "2020-02-20T16:29:23Z"
  namespaces:
  - other-observability
  - observability

@jpkrohling
Copy link
Contributor Author

Wanted to record another finding here: OLM will create a new ClusterRoleBinding between the operator's service account and every ClusterRole defined in the CSV. The Role and ClusterRole present in the deploy/role.yaml are the sources for this. So, we really can't use ClusterRole definitions as @pavolloffay mentioned before in this PR. This also means that, in the current iteration, we can't have an operator instance installed via OLM optionally without ClusterRole permissions: we either remove all cluster roles, or we keep it and they will always be requested.

image

I also tried to create an operator with a regular user, but looks like not all ordinary users can even list operators. I need to do some more research on this part.

image

@jpkrohling jpkrohling force-pushed the StreamlinePermissions branch 2 times, most recently from 7d1729d to f6abd51 Compare February 21, 2020 16:38
@pavolloffay
Copy link
Member

So, we really can't use ClusterRole definitions as @pavolloffay mentioned before in this PR

do you mean in #916 (comment)?

I would be interested to know the motivation behind this PR. There might be two user bases. The OpenShift user base using the OLM might not care much about the permission as it seems OLM is using the ClusterRoles anyways. However the vanilla k8s user base might care more about the RBAC as they are not using the OLM and they install the operator explicitly.

@jpkrohling
Copy link
Contributor Author

Yes, that's the one I meant.

There might be two user bases. The OpenShift user base using the OLM might not care much about the permission as it seems OLM is using the ClusterRoles anyways. However the vanilla k8s user base might care more about the RBAC as they are not using the OLM and they install the operator explicitly.

#791 is an example of this second case, but there's a third as well: corporations that are using OpenShift+OLM but are hesitant in using operators that require cluster-wide permissions.

@jpkrohling jpkrohling force-pushed the StreamlinePermissions branch 4 times, most recently from 6d8eb73 to 50b3621 Compare February 24, 2020 14:15
@jpkrohling jpkrohling changed the title WIP - Changed the operator to gracefully degrade when not on cluster-wide scope Changed the operator to gracefully degrade when not on cluster-wide scope Feb 24, 2020
@jpkrohling
Copy link
Contributor Author

This is now ready for the next review.

cc @objectiser, @pavolloffay, @rubenvp8510

deploy/operator.yaml Show resolved Hide resolved
go.sum Outdated Show resolved Hide resolved
pkg/apis/jaegertracing/v1/const.go Show resolved Hide resolved
@objectiser
Copy link
Contributor

@jpkrohling Would it be possible to summarise the implications of this PR, e.g. in the context of the CR used by Service Mesh with oauth proxy config for Kiali?

README.md Show resolved Hide resolved
deploy/cluster_role.yaml Show resolved Hide resolved
pkg/autodetect/main_test.go Show resolved Hide resolved
@jpkrohling
Copy link
Contributor Author

Would it be possible to summarise the implications of this PR, e.g. in the context of the CR used by Service Mesh with oauth proxy config for Kiali?

There should be no changes to consumers of the operator via OLM, as it will install the cluster roles as well. A separate CSV could be provided without the cluster permissions, but it's not part of this PR.

There are also no changes to the CR, everything that worked before should work now.

…cope

Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>
@jpkrohling jpkrohling merged commit 7531c6b into jaegertracing:master Feb 26, 2020
@jpkrohling
Copy link
Contributor Author

Merged, as both @objectiser and @pavolloffay were OK, with restrictions only around the CI passing (it did) and docs (jaegertracing/documentation#370)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Autoinject routines scan all namespaces is clusterrole required for Jaeger operator?
7 participants