
feat(backend): Added metrics to be collected from failed/successful workflows #9576

Merged: 4 commits into kubeflow:master on Oct 10, 2023

Conversation

MGSousa
Contributor

@MGSousa MGSousa commented Jun 6, 2023

Description of your changes:

  • Added metrics for failed and successful workflow runs, exported to Prometheus from the ResourceManager.
  • The new metrics can be filtered by Namespace (profile) and by DisplayName (workflow).
  • Improved how the gauge registers the count by using a Collector.
    • When deleting a run, checks whether the corresponding metric has already reached its minimum value of 0.
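
To illustrate the last two points, here is a minimal sketch of a per-label gauge that refuses to drop below 0. It uses only the standard library and is not the code from this PR or the Prometheus client API; `RunGauge`, `runKey`, and the method names are hypothetical, and the two labels are assumed from the description above.

```go
package main

import "sync"

// runKey identifies one gauge series by its two labels.
// The label names are assumptions based on the PR description.
type runKey struct {
	Namespace   string // profile
	DisplayName string // workflow
}

// RunGauge is a hypothetical stand-in for a labeled gauge.
// It keeps one count per (Namespace, DisplayName) pair.
type RunGauge struct {
	mu     sync.Mutex
	counts map[runKey]int
}

// NewRunGauge returns an empty gauge ready for use.
func NewRunGauge() *RunGauge {
	return &RunGauge{counts: make(map[runKey]int)}
}

// Inc raises the series for the given labels by one.
func (g *RunGauge) Inc(namespace, displayName string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.counts[runKey{Namespace: namespace, DisplayName: displayName}]++
}

// Dec lowers the series by one, unless it is already at 0.
// It reports whether a decrement actually happened.
func (g *RunGauge) Dec(namespace, displayName string) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	k := runKey{Namespace: namespace, DisplayName: displayName}
	if g.counts[k] <= 0 {
		return false // already at the minimum; leave it untouched
	}
	g.counts[k]--
	return true
}

// Value returns the current count for the given labels.
func (g *RunGauge) Value(namespace, displayName string) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.counts[runKey{Namespace: namespace, DisplayName: displayName}]
}
```

In this sketch a delete handler would call `Dec` with the run's namespace and display name; the returned bool tells the caller whether the series was already at 0.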

Tests have also been updated to reflect the new changes.


@google-oss-prow

Hi @MGSousa. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.


@connor-mccarthy
Member

/assign @Linchin

@Linchin
Contributor

Linchin commented Jun 13, 2023

/ok-to-test

@MGSousa
Contributor Author

MGSousa commented Jul 31, 2023

Do you have any feedback related to this feature?

@Linchin Linchin removed their assignment Aug 28, 2023
@zijianjoy
Collaborator

/assign @chensun

Member

@chensun chensun left a comment


/lgtm

Just a couple questions.

backend/src/apiserver/resource/resource_manager.go (outdated)
@google-oss-prow google-oss-prow bot added the lgtm label Sep 1, 2023
@google-oss-prow google-oss-prow bot removed the lgtm label Sep 5, 2023
Member

@chensun chensun left a comment


/lgtm
/approve

Thanks!

@google-oss-prow google-oss-prow bot added the lgtm label Oct 10, 2023
@google-oss-prow

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chensun


Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit 5835824 into kubeflow:master Oct 10, 2023
5 checks passed
stijntratsaertit pushed a commit to stijntratsaertit/kfp that referenced this pull request Feb 16, 2024
…orkflows (kubeflow#9576)

* feat(backend): Allow more metrics to be collected from Workflows

* Fixed remaining tests

* Updated licenses dependencies

* FIX comment in resource_manager.go