Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: delete tenant with prometheus's config been deleted #1773

Merged
merged 1 commit into from
Oct 7, 2023

Conversation

jiuker
Copy link
Contributor

@jiuker jiuker commented Sep 13, 2023

delete tenant with prometheus's config been deleted.
Now if tenant enable the prometheus. Config will be left if the tenant have been deleted.

@shtripat
Copy link
Contributor

@jiuker if any steps for verification, can you please add here?

@jiuker
Copy link
Contributor Author

jiuker commented Sep 14, 2023

@jiuker if any steps for verification, can you please add here?

@shtripat here the steps:

  1. Start tentant with .spec.prometheusOperator: true .
  2. Wait the tenant ready.
  3. Delete the tenant.
  4. check the secret minio-prom-additional-scrape-config, the data prometheus-additional.yaml will remove the tenant job.

Copy link
Contributor

@shtripat shtripat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Verified following below steps

  1. Deployed a kind cluster
  2. Deployed prometheus operator
  3. Deployed minio opertaor with this branch build
  4. Created a tenant and verified the content of secret as k get secrets/minio-prom-additional-scrape-config -oyaml | yq '.data."prometheus-additional.yaml"' | base64. Shows the prometheus scrape job details
  5. Deleted the tenant now
  6. Re-verified the secret again using k get secrets/minio-prom-additional-scrape-config -oyaml | yq '.data."prometheus-additional.yaml"' | base64. And it shows empty value

@pjuarezd
Copy link
Member

pjuarezd commented Oct 7, 2023

Can you describe a little bit more of why is it needed and when is it needed please? that will make easier to document in release notes @jiuker

@jiuker
Copy link
Contributor Author

jiuker commented Oct 7, 2023

Can you describe a little bit more of why is it needed and when is it needed please? that will make easier to document in release notes @jiuker

When someone delete the tenant with prometheus enable. It will left the prometheus-job without this pr. Prometheus will get a offline target to get data always. @pjuarezd

@pjuarezd
Copy link
Member

pjuarezd commented Oct 7, 2023

Can you describe a little bit more of why is it needed and when is it needed please? that will make easier to document in release notes @jiuker

When someone delete the tenant with prometheus enable. It will left the prometheus-job without this pr. Prometheus will get a offline target to get data always. @pjuarezd

allright!, I get it now, cc @feorlen

@pjuarezd pjuarezd merged commit 11ccc22 into minio:master Oct 7, 2023
24 checks passed
@harshavardhana
Copy link
Member

@pjuarezd the bug here is that we leave a stale configuration in Prometheus operator even after tenant deletion, this causes the next attempt to create leads to stale information to be picked up.

That ends up not talking to Prometheus.

@jiuker
Copy link
Contributor Author

jiuker commented Oct 7, 2023

@pjuarezd the bug here is that we leave a stale configuration in Prometheus operator even after tenant deletion, this causes the next attempt to create leads to stale information to be picked up.

That ends up not talking to Prometheus.

Wow. Yeah. We didn't compare them

func (c *Controller) checkAndCreatePrometheusAddlConfig(ctx context.Context, tenant *miniov2.Tenant, accessKey, secretKey string) error {
ns := miniov2.GetPrometheusNamespace()
p, err := c.getPrometheus(ctx)
if err != nil {
return err
}
// If the additional scrape config is set to something else, we will error out
if p.Spec.AdditionalScrapeConfigs != nil && p.Spec.AdditionalScrapeConfigs.Name != miniov2.PrometheusAddlScrapeConfigSecret {
return errors.New(p.Spec.AdditionalScrapeConfigs.Name + " is alreay set as additional scrape config in prometheus")
}
secret, err := c.kubeClientSet.CoreV1().Secrets(ns).Get(ctx, miniov2.PrometheusAddlScrapeConfigSecret, metav1.GetOptions{})
if err != nil && !k8serrors.IsNotFound(err) {
return err
}
promCfg := configmaps.GetPrometheusConfig(tenant, accessKey, secretKey)
// If the secret is not found, create the secret
if k8serrors.IsNotFound(err) {
klog.Infof("Adding MinIO tenant Prometheus scrape config")
scrapeCfgYaml, err := yaml.Marshal(&promCfg.ScrapeConfigs)
if err != nil {
return err
}
secret = &corev1.Secret{
Type: "Opaque",
ObjectMeta: metav1.ObjectMeta{
Name: miniov2.PrometheusAddlScrapeConfigSecret,
Namespace: ns,
},
Data: map[string][]byte{
miniov2.PrometheusAddlScrapeConfigKey: scrapeCfgYaml,
},
}
_, err = c.kubeClientSet.CoreV1().Secrets(ns).Create(ctx, secret, metav1.CreateOptions{})
if err != nil {
return err
}
} else {
var scrapeConfigs []configmaps.ScrapeConfig
err := yaml.Unmarshal(secret.Data[miniov2.PrometheusAddlScrapeConfigKey], &scrapeConfigs)
if err != nil {
return err
}
// Check if the scrape config is already present
hasScrapeConfig := false
for _, sc := range scrapeConfigs {
if sc.JobName == tenant.PrometheusOperatorAddlConfigJobName() {
hasScrapeConfig = true
break
}
}
if !hasScrapeConfig {
klog.Infof("Adding MinIO tenant Prometheus scrape config")
scrapeConfigs = append(scrapeConfigs, promCfg.ScrapeConfigs...)
scrapeCfgYaml, err := yaml.Marshal(scrapeConfigs)
if err != nil {
return err
}
secret.Data[miniov2.PrometheusAddlScrapeConfigKey] = scrapeCfgYaml
_, err = c.kubeClientSet.CoreV1().Secrets(ns).Update(ctx, secret, metav1.UpdateOptions{})
if err != nil {
return err
}
}
}
// Update prometheus if it's not done alreay
if p.Spec.AdditionalScrapeConfigs == nil {
p.Spec.AdditionalScrapeConfigs = &corev1.SecretKeySelector{
LocalObjectReference: corev1.LocalObjectReference{Name: miniov2.PrometheusAddlScrapeConfigSecret},
Key: miniov2.PrometheusAddlScrapeConfigKey,
}
_, err = c.promClient.MonitoringV1().Prometheuses(ns).Update(ctx, p, metav1.UpdateOptions{})
if err != nil {
return err
}
}
return nil
}

@bh4t bh4t added the bug fix label Oct 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants