bump rancher-monitoring chart to 102.0.0+up40.1.2 #492

Merged on Jun 6, 2023 (1 commit)

@w13915984028 w13915984028 commented May 17, 2023

bump monitoring to 102.0.0+up40.1.2

  patch nginx-config file with values from 102.0.1+up40.1.2

  fix the tar time checking warning

  fetch rancher/shell v0.1.18, both rancher-monitoring-crd and rancher-monitoring are using it

Signed-off-by: Jian Wang <w13915984028@gmail.com>

Problem:

The rancher-monitoring chart embedded in Harvester is outdated.

Solution:

Bump the rancher-monitoring chart to 102.0.0+up40.1.2.

Related Issue:
harvester/harvester#3780

Test plan:

Note: there is a UI issue to solve; DO NOT MERGE before it is solved. ( https://github.com/harvester/harvester/issues/3780#issuecomment-1550889891 )

cc @DaiYuzeng @WuJun2016 @guangbochen

update at 20230522:

https://charts.rancher.io/index.yaml lists assets/rancher-monitoring/rancher-monitoring-102.0.0+up40.1.2.tgz but not 102.0.1+up40.1.2, which has not been formally released yet.

However, 102.0.0+up40.1.2 has a bug in the Grafana nginx-config template that is fixed in 102.0.1+up40.1.2:

https://github.com/rancher/charts/blob/dev-v2.7/charts/rancher-monitoring/102.0.1%2Bup40.1.2/charts/grafana/templates/nginx-config.yaml

          {{- if eq .Values.global.cattle.clusterId "local" -}}
          sub_filter '"appSubUrl":""' '"appSubUrl":"/api/v1/namespaces/{{ template "grafana.namespace" . }}/services/http:{{ template "grafana.fullname" . }}:{{ .Values.service.port }}/proxy"';
          {{- else -}}
          sub_filter '"appSubUrl":""' '"appSubUrl":"/k8s/clusters/{{ .Values.global.cattle.clusterId }}/api/v1/namespaces/{{ template "grafana.namespace" . }}/services/http:{{ template "grafana.fullname" . }}:{{ .Values.service.port }}/proxy"';
          {{- end -}}
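
The branch in the template above picks the Grafana `appSubUrl` depending on whether the cluster ID is `local`. A minimal Python sketch of the same decision follows; the function name and the namespace, service name, and port values are hypothetical and only serve as illustration, they are not taken from the chart.

```python
def grafana_app_sub_url(cluster_id: str, namespace: str, fullname: str, port: int) -> str:
    """Mirror the if/else of the nginx-config template quoted above."""
    # Service proxy path used directly when the cluster ID is "local".
    service_proxy = f"/api/v1/namespaces/{namespace}/services/http:{fullname}:{port}/proxy"
    if cluster_id == "local":
        return service_proxy
    # Any other cluster ID is routed through the /k8s/clusters/<id> prefix.
    return f"/k8s/clusters/{cluster_id}{service_proxy}"


# Hypothetical values, for illustration only.
print(grafana_app_sub_url("local", "cattle-monitoring-system", "rancher-monitoring-grafana", 80))
print(grafana_app_sub_url("c-m-example", "cattle-monitoring-system", "rancher-monitoring-grafana", 80))
```

In embedded mode Harvester uses the `local` cluster ID, so the first form is the one the patched template is expected to emit.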

Upgrade test: refer to harvester/harvester#3380

update at 20230522:

The nginx-config is patched, and rancher-monitoring now works in embedded mode in Harvester.

Test plan:

  1. install new harvester with this PR
  2. enable rancher-monitoring addon from Harvester UI
  3. wait a few minutes, then check that cluster metrics and VM metrics are available on the dashboard, and that clicking Grafana opens the embedded Grafana page

@w13915984028
Member Author

w13915984028 commented May 17, 2023

A Vagrant failure, "xorriso : FAILURE : Image size 2671088s exceeds free space on media 1129470s", is waiting to be solved.

solved.

Contributor

@guangbochen guangbochen left a comment

Please check that PSP is disabled in the monitoring charts; otherwise, after we bump RKE2 to v1.25, we will hit the following issue:

 mcc-rancher-monitoring-crd                    0/1                       ErrApplied(1) [Cluster fleet-local/local: unable to build kubernetes objects from release manifest: resource mapping not found for name: "rancher-monitoring-crd-manager" namespace: "cattle-monitoring-system" from "": no matches for kind "PodSecurityPolicy" in v…

@w13915984028
Member Author

w13915984028 commented May 25, 2023

Please check that PSP is disabled in the monitoring charts; otherwise, after we bump RKE2 to v1.25, we will hit the following issue:

 mcc-rancher-monitoring-crd                    0/1                       ErrApplied(1) [Cluster fleet-local/local: unable to build kubernetes objects from release manifest: resource mapping not found for name: "rancher-monitoring-crd-manager" namespace: "cattle-monitoring-system" from "": no matches for kind "PodSecurityPolicy" in v…

PSP is disabled by default in rancher-monitoring-crd 102.0.0+up40.1.2:

https://github.com/rancher/charts/blob/69decabd4eb77c77d008d3a33c6c19d2da73ef5d/charts/rancher-monitoring-crd/102.0.0%2Bup40.1.2/values.yaml#L8

# Default values for rancher-monitoring-crd.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

global:
  cattle:
    psp:
      enabled: false
    systemDefaultRegistry: ""

But in the current 100.1.0+up19.0.3, templates/rbac.yaml does not check global.cattle.psp.enabled, which causes the error above.
[image: 100.1.0+up19.0.3 vs 102.0.0+up40.1.2]

@guangbochen
Conclusion:
With RKE2 v1.25, we need to bump rancher-monitoring-crd to 102.0.0+up40.1.2. With that version the error no matches for kind "PodSecurityPolicy" no longer occurs, because PSP is disabled by default in the new version.
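
The reasoning in this comment can be sketched in Python. PodSecurityPolicy (`policy/v1beta1`) was removed in Kubernetes v1.25, so a chart that still renders a PSP object cannot be applied there. The function names and the version strings below are hypothetical; the two chart behaviours (no check of `global.cattle.psp.enabled` in 100.1.0+up19.0.3, a check with default `false` in 102.0.0+up40.1.2) are the ones described above.

```python
def renders_psp(chart_checks_psp_flag: bool, psp_enabled: bool) -> bool:
    """A chart without the flag check always renders the PSP object."""
    return (not chart_checks_psp_flag) or psp_enabled


def kubernetes_serves_psp(version: str) -> bool:
    """PodSecurityPolicy (policy/v1beta1) is served only before v1.25."""
    major, minor = version.lstrip("v").split(".")[:2]
    return (int(major), int(minor)) < (1, 25)


def apply_succeeds(version: str, chart_checks_psp_flag: bool, psp_enabled: bool) -> bool:
    """The manifest applies if no PSP is rendered or the API still exists."""
    return kubernetes_serves_psp(version) or not renders_psp(chart_checks_psp_flag, psp_enabled)


# Hypothetical RKE2 version strings, for illustration only.
print(apply_succeeds("v1.25.9+rke2r1", chart_checks_psp_flag=False, psp_enabled=False))  # old chart
print(apply_succeeds("v1.25.9+rke2r1", chart_checks_psp_flag=True, psp_enabled=False))   # new chart
```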

@guangbochen
Contributor

guangbochen commented May 25, 2023

That's great. I think we are aiming to bump the monitoring chart to the latest version, 102.0.1+up40.1.2, which is currently pending release from the Rancher side. cc @bk201

@w13915984028
Member Author

That's great. I think we are aiming to bump the monitoring chart to the latest version, 102.0.1+up40.1.2, which is currently pending release from the Rancher side. cc @bk201

Until Rancher releases a new one, we can use this PR to validate earlier. I compared 102.0.1+up40.1.2 and 102.0.0+up40.1.2; the other changes do not affect us. We only need the fix for the broken nginx-config.

@w13915984028
Member Author

upgrade rancher-monitoring to 102.0.0+up40.1.2:

harvester/harvester#3380 (comment)

@w13915984028
Member Author

Since 52cdabc is merged, the following error now occurs; we need this PR #492 ASAP.

get bundle -A
NAMESPACE     NAME                                          BUNDLEDEPLOYMENTS-READY   STATUS
..                    
fleet-local   mcc-rancher-logging-crd                       1/1                       
fleet-local   mcc-rancher-monitoring-crd                    0/1                       ErrApplied(1) [Cluster fleet-local/local: unable to build kubernetes objects from release manifest: resource mapping not found for name: "rancher-monitoring-crd-manager" namespace: "cattle-monitoring-system" from "": no matches for kind "PodSecurityPolicy" in version "policy/v1beta1"...
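
A listing like the one above can be scanned for bundles whose BUNDLEDEPLOYMENTS-READY count is incomplete. The following is a minimal Python sketch; the function name is hypothetical, and the sample text is copied from this comment, including its truncated status message.

```python
def not_ready_bundles(listing: str) -> list:
    """Return (namespace, name, status) for rows where ready != desired."""
    problems = []
    for line in listing.splitlines():
        fields = line.split(None, 3)
        # Skip the header, elision markers, and anything without an "x/y" column.
        if len(fields) < 3 or "/" not in fields[2]:
            continue
        ready, desired = fields[2].split("/", 1)
        if ready != desired:
            status = fields[3].strip() if len(fields) > 3 else ""
            problems.append((fields[0], fields[1], status))
    return problems


SAMPLE = """\
NAMESPACE     NAME                                          BUNDLEDEPLOYMENTS-READY   STATUS
..
fleet-local   mcc-rancher-logging-crd                       1/1
fleet-local   mcc-rancher-monitoring-crd                    0/1                       ErrApplied(1) [Cluster fleet-local/local: unable to build kubernetes objects from release manifest: resource mapping not found for name: "rancher-monitoring-crd-manager" namespace: "cattle-monitoring-system" from "": no matches for kind "PodSecurityPolicy" in version "policy/v1beta1"...
"""

for namespace, name, status in not_ready_bundles(SAMPLE):
    print(f"{namespace}/{name}: {status[:40]}")
```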

Contributor

@guangbochen guangbochen left a comment

LGTM and verified that on the new cluster, dashboards are all showing correctly, thanks.
image

@bk201
Member

bk201 commented Jun 6, 2023

See ImagePullBackOff on this pod in an air-gapped env:

  • rancher-monitoring-crd-create-f9sch
    • rancher/shell:v0.1.18

@w13915984028
Member Author

See ImagePullBackOff on this pod in an air-gapped env:

  • rancher-monitoring-crd-create-f9sch

    • rancher/shell:v0.1.18

Oops, it was updated in 74675d6 and later:

image

monitoring in 100.1 and 102. are using the same shell version

let me update the shell version

  patch nginx-config file with values from 102.0.1+up40.1.2

  fix the tar time check warning

  fetch rancher/shell v0.1.18, both rancher-monitoring-crd and rancher-monitoring are using it

Signed-off-by: Jian Wang <w13915984028@gmail.com>
@w13915984028
Member Author

Added rancher/shell:v0.1.18 back; both rancher-monitoring-crd and rancher-monitoring use this version. We could change it to v0.1.19 in values.yaml, but I do not have time now to check the difference between v0.1.18 and v0.1.19, so this PR sticks to the version defined by the chart. @bk201

make log:

docker image pull --quiet docker.io/rancher/shell:v0.1.8
docker.io/rancher/shell:v0.1.8
docker image pull --quiet docker.io/rancher/shell:v0.1.18
docker.io/rancher/shell:v0.1.18
docker image pull --quiet docker.io/rancher/shell:v0.1.19
docker.io/rancher/shell:v0.1.19
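
In an air-gapped environment, every image a chart references has to be among the images pulled at build time, otherwise the pod ends in ImagePullBackOff as reported above. A minimal Python sketch of that check against a make log like this one follows; the function names are hypothetical, and `rancher/shell:v0.1.20` in the usage is an invented tag used only to show the failing case.

```python
PULL_PREFIX = "docker image pull --quiet "


def normalize(image: str) -> str:
    """Add the default Docker Hub registry when the reference has none."""
    if "/" not in image:
        return "docker.io/library/" + image
    first = image.split("/", 1)[0]
    # A first component with "." or ":" (or "localhost") is a registry host.
    if "." in first or ":" in first or first == "localhost":
        return image
    return "docker.io/" + image


def pulled_images(make_log: str) -> set:
    """Collect the image references named on the pull command lines."""
    return {
        normalize(line[len(PULL_PREFIX):].strip())
        for line in make_log.splitlines()
        if line.startswith(PULL_PREFIX)
    }


def missing_images(required, make_log: str) -> list:
    """Images the charts need that the build did not pull."""
    return sorted({normalize(i) for i in required} - pulled_images(make_log))


LOG = """\
docker image pull --quiet docker.io/rancher/shell:v0.1.8
docker.io/rancher/shell:v0.1.8
docker image pull --quiet docker.io/rancher/shell:v0.1.18
docker.io/rancher/shell:v0.1.18
docker image pull --quiet docker.io/rancher/shell:v0.1.19
docker.io/rancher/shell:v0.1.19
"""

print(missing_images(["rancher/shell:v0.1.18"], LOG))
print(missing_images(["rancher/shell:v0.1.20"], LOG))  # hypothetical tag
```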

Member

@bk201 bk201 left a comment

lgtm

@bk201 bk201 merged commit 9d9576c into harvester:master Jun 6, 2023
@guangbochen
Contributor

@mergify backport v1.2

@mergify

mergify bot commented Jun 6, 2023

backport v1.2

✅ Backports have been created
