Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecated cadvisor metrics #747

Closed
pierluigilenoci opened this issue Jan 22, 2021 · 30 comments
Closed

Deprecated cadvisor metrics #747

pierluigilenoci opened this issue Jan 22, 2021 · 30 comments
Labels
bug Something isn't working

Comments

@pierluigilenoci
Copy link

pierluigilenoci commented Jan 22, 2021

Describe the bug
According to Kubernetes instrumentation guidelines the pod_name and container_name metrics are deprecated since almost 2 years and not available since K8s 1.16.

Ref:

In your integration guide, you suggest changing the default Prometheus configuration to generate them again. https://github.com/kubecost/docs/blob/master/custom-prom.md#cadvisor-metric-labels

I find it wrong to have to force Prometheus configuration to generate the old metrics otherwise the software won't work. It makes more sense that the software uses the official metrics and, in case someone uses a legacy version of Kubernetes (<1.16), they introduce this change in their Prometheus installation.

Doing so forces those who use a separate installation of Prometheus to use a non-default configuration compared to the official chart.

Expected behavior
Do not force the change of third party software configurations to make Kubecost work

@pierluigilenoci pierluigilenoci added the bug Something isn't working label Jan 22, 2021
@dwbrown2
Copy link
Contributor

dwbrown2 commented Jan 25, 2021

We introduced these changes to continue supporting legacy versions. I agree we should drop this requirement / move away from this practice. There is an easy workaround available by adding two labels.

@nikovacevic @AjayTripathy

@AjayTripathy
Copy link
Contributor

Hi @pierluigilenoci -- we will look into this. The main complexity is we want to support both versions seamlessly, as not all of our users can upgrade and it would also be a bad experience to upgrade and for these to all stop working.

@pierluigilenoci
Copy link
Author

Hi @AjayTripathy, I can understand that you want to keep backward compatibility but it shouldn't be done at the expense of new versions. In the charts, it is the norm, when some changes break the compatibility to announce the breaking changes and a migration guide. Anyone who wants to upgrade the software but still remains at the legacy version of the cluster can apply the workarounds.

In your chart instead, you impose workarounds and over-configurations on those who keep their clusters updated to favor legacy versions. In this way, all those who want to start using the software in updated clusters make a bad experience.

@AjayTripathy
Copy link
Contributor

Yes, we are ok to take this on, just trying to give context on why this change isn't a simple one-liner to relabel.

@sergeyshevch
Copy link

Hi! Is there any progress?

@AjayTripathy
Copy link
Contributor

Hi @SergeyKons we're making some major changes in our prometheus queries for improved accuracy and performance: opencost/opencost#690

This simplifies a bunch of our queries and will have the effect of making the discussed changes easier, but it does mean that these changes are blocked on that.

@pierluigilenoci
Copy link
Author

@AjayTripathy so is closed opencost/opencost#690 in favor of opencost/opencost#719 .
What about the deprecated cadvisor metrics?

@pierluigilenoci
Copy link
Author

@dwbrown2 any news about this?

@dwbrown2
Copy link
Contributor

hi @pierluigilenoci, thanks for following up. we will provide updates when they are available.

@pierluigilenoci
Copy link
Author

@dwbrown2 any update?

@kubecost kubecost deleted a comment from pierluigilenoci Apr 9, 2021
@dwbrown2
Copy link
Contributor

dwbrown2 commented Apr 9, 2021

@pierluigilenoci we have started investigating as shared and will provide updates on this issue when available.

@pierluigilenoci
Copy link
Author

Hi @dwbrown2, I noticed only now that you have deleted my comment. Why?

@pierluigilenoci
Copy link
Author

Just to put more on the table... K8s v1.16 is EOL since 2020-09-02, it is time to move on. ;)
Ref: kubernetes/sig-release@master/releases/patch-releases.md#non-active-branch-history

@dwbrown2
Copy link
Contributor

@pierluigilenoci plenty of users still run k8s versions less than v1.16.

For all tracking this issue, we've completed our discussion as a team. An engineer is starting to explore solutions on our end now. We will share further updates as they become available!

@pierluigilenoci
Copy link
Author

Now in addition to the deprecated cadvisor metrics, there are also those of kube-state-metrics.

It's been 115 days since I opened this ticket... 🕐 is it time to hurry up? 🚀 ❤️ or 🙉 👎🏻

@pierluigilenoci
Copy link
Author

KSM deprecated metrics kubecost/docs#74

@dwbrown2
Copy link
Contributor

This is in process and we will share updates when they are available.

@pierluigilenoci
Copy link
Author

@AjayTripathy I saw that the KSM metrics have been fixed. Is there any hope of having the CAdvisor metrics fixed as well?

Is there also an update of the page for the integration?

@dwbrown2
Copy link
Contributor

Yep, we tackled ksm first and will be looking at cadvisor next. It's already underway.

@pierluigilenoci
Copy link
Author

Yep, we tackled ksm first and will be looking at cadvisor next. It's already underway.

@dwbrown2 I LOVE IT ❤️

@pierluigilenoci
Copy link
Author

@dwbrown2 another month has passed, is there any news?

@AjayTripathy
Copy link
Contributor

opencost/opencost#881

Here's the start of a PR with backwards compatibility for pre-k8s1.16 users.

@dwbrown2
Copy link
Contributor

PR now merged so that core Kubecost handled current + legacy cadvisor metrics.

Docs PR out for review -- kubecost/docs#119

@pierluigilenoci
Copy link
Author

So I presume we can close this. 🚀

@pierluigilenoci
Copy link
Author

@kirbsauce @AjayTripathy I'm sorry to inform you that the problem is not really solved.
With v1.86.0 you get this warning on the diagnostics page:

Screenshot 2021-09-09 at 12 02 04

This happens because the check is done with container_name and pod_name which are the deprecated metrics.

@nikovacevic
Copy link
Contributor

Thanks @pierluigilenoci, based on this I will reopen until we complete the fix.

@nikovacevic nikovacevic reopened this Sep 9, 2021
@AjayTripathy
Copy link
Contributor

Just a quick note, this looks just like we regressed the diagnostic test when moving it to the backend. The actual cost-model is using the new queries.

@AjayTripathy
Copy link
Contributor

opencost/opencost#931 Merged and released in 1.86.1, apologies.

@pierluigilenoci
Copy link
Author

@AjayTripathy the new version fix this issue.

@kirbsauce
Copy link
Contributor

Thanks for the confirmation, @pierluigilenoci ! Closing this one out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

6 participants