-
Notifications
You must be signed in to change notification settings - Fork 873
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: Failed analysis to degrade rollout when multiple metrics are analyzed #1535
Conversation
Signed-off-by: hari rongali <hari_rongali@intuit.com>
Signed-off-by: hari rongali <hari_rongali@intuit.com>
Kudos, SonarCloud Quality Gate passed! 0 Bugs No Coverage information |
Codecov Report
@@ Coverage Diff @@
## master #1535 +/- ##
==========================================
+ Coverage 81.67% 81.77% +0.09%
==========================================
Files 110 112 +2
Lines 14798 15071 +273
==========================================
+ Hits 12086 12324 +238
- Misses 2078 2103 +25
- Partials 634 644 +10
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…lyzed (#1535) * fix: analysis fail for inline multi metric analysis Signed-off-by: hari rongali <hari_rongali@intuit.com> * fix: cleanup Signed-off-by: hari rongali <hari_rongali@intuit.com>
@harikrongali I dismissed my earlier review because I remember there was a reason the code was written this way and I want to make sure we dont break it. I recall that we did not want to mark the AnalysisRun completed until everything which needed to do was completed. This includes stopping in-flight jobs. Can you verify the following scenario:
Will the AnalysisRun stop the in-flight job? I am concerned that by marking it Successful immediately, we will leave the job running forever. |
@jessesuen i will validate asap |
Nevermind. I think we are good. This is the scenario I was concerned about: kind: AnalysisRun
apiVersion: argoproj.io/v1alpha1
metadata:
generateName: analysis-run-job-
spec:
metrics:
- name: dont-finish
provider:
job:
spec:
template:
spec:
containers:
- name: sleep
image: nginx:1.19-alpine
command: [sleep, "999999"]
restartPolicy: Never
backoffLimit: 0
- name: fail
provider:
job:
spec:
template:
spec:
containers:
- name: sleep
image: nginx:1.19-alpine
command: [sh, -c, "sleep 10 && exit 1"]
restartPolicy: Never
backoffLimit: 0
- name: dont-start
initialDelay: 24h
provider:
job:
spec:
template:
spec:
containers:
- name: sleep
image: nginx:1.19-alpine
command: [sh, -c, "exit 0"]
restartPolicy: Never
backoffLimit: 0 I verified nothing is left running when the second metric fails. |
Checklist:
Fixes behavior #1411 although issue is reproducible in 1.0.6 release because of fix #1407
"fix(controller): Updates such and such. Fixes #1234"
.