-
Notifications
You must be signed in to change notification settings - Fork 16
Sort trigger-arch build queue to push previously failed jobs to the end #23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, a few questions
jq '.' <<<"$newFailedJobsJson" | tee pastFailedJobs.json | ||
''' | ||
archiveArtifacts( | ||
artifacts: '*.json', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why all .json
files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤷 we know the directory is definitely 100% empty (deleteDir()
above), so I'm not too worried about it either way
archiveArtifacts( | ||
artifacts: '*.json', | ||
fingerprint: true, | ||
onlyIfSuccessful: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean something other than es.result == 'SUCCESS'
? I'm wondering why the files are only archived on success when it seems these changes are focused on failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This confusingly refers to the success of this job, not the success of all the jobs this job is triggering (nor all the stages of this job, which each have their own status). This is a nice way to give us a safety against losing our data since we're doing a very poor-man's linked list here and one hiccup means we lose our intermediate data. 😄
(Hence pulling the previous version of this file from the lastSuccessful
instead of just the most recent job.)
Jenkinsfile.trigger
Outdated
sh '''#!/usr/bin/env bash | ||
set -Eeuo pipefail -x | ||
|
||
jq '.' <<<"$newFailedJobsJson" | tee pastFailedJobs.json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the string pastFailedJobs.json
be abstracted into a string since it is used several places?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that could help cut down on typos, but it does get a little annoying to inject properly (we'd probably need to do something like env.PAST_FAILED_JOBS_FILENAME = ...
or something so that it can be accessed from all the shell invocations without dealing with quoting hell).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few really minor nits (none of which actually change the functionality so shouldn't need rigorous re-testing), but overall really solid and seems like it'll work just fine 👍
Jenkinsfile.trigger
Outdated
@@ -13,6 +13,10 @@ env.BASHBREW_ARCH = env.JOB_NAME.split('/')[-1].minus('trigger-') // "windows-am | |||
def queue = [] | |||
def breakEarly = false // thanks Jenkins... | |||
|
|||
// this Json includes the number of attempts per failing buildId |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// this Json includes the number of attempts per failing buildId | |
// this JSON includes the number of attempts per failing buildId |
or even
// this Json includes the number of attempts per failing buildId | |
// this includes the number of attempts per failing buildId |
Jenkinsfile.trigger
Outdated
(env.pastFailedJobsJson | fromjson) as $pastFailedJobs | | ||
[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(env.pastFailedJobsJson | fromjson) as $pastFailedJobs | | |
[ | |
(env.pastFailedJobsJson | fromjson) as $pastFailedJobs | |
| [ |
Jenkinsfile.trigger
Outdated
] | ||
' builds.json | ||
''').trim() | ||
] | sort_by($pastFailedJobs[.buildId].count // 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to make sure:
$ jq -n '{} | .foo.bar // "baz"'
"baz"
❤️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, but maybe we should include some bit of my old TODO
comment here, to remind us what this is doing / for?
] | sort_by($pastFailedJobs[.buildId].count // 0) | |
] | |
# this Jenkins job exports a JSON file that includes the number of attempts so far per failing buildId so that this can sort by attempts which means failing builds always live at the bottom of the queue (sorted by the number of times they have failed, so the most failing is always last) | |
| sort_by($pastFailedJobs[.buildId].count // 0) |
Jenkinsfile.trigger
Outdated
count: c, | ||
identifier: identifier, | ||
url: res.absoluteUrl, | ||
endtime: (res.startTimeInMillis + res.duration) / 1000.0, // convert to seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
endtime: (res.startTimeInMillis + res.duration) / 1000.0, // convert to seconds | |
endTime: (res.startTimeInMillis + res.duration) / 1000.0, // convert to seconds |
Jenkinsfile.trigger
Outdated
if (pastFailedJobs[buildObj.buildId]) { | ||
c += pastFailedJobs[buildObj.buildId].count | ||
} | ||
// TODO maybe implement some amount of backoff? keep first url/endtime? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// TODO maybe implement some amount of backoff? keep first url/endtime? | |
// TODO maybe implement some amount of backoff? keep first url/endTime? |
Jenkinsfile.trigger
Outdated
sh '''#!/usr/bin/env bash | ||
set -Eeuo pipefail -x | ||
|
||
jq '.' <<<"$newFailedJobsJson" | tee pastFailedJobs.json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jq '.' <<<"$newFailedJobsJson" | tee pastFailedJobs.json | |
jq <<<"$newFailedJobsJson" '.' | tee pastFailedJobs.json |
echo(res.result) | ||
if (res.result != 'SUCCESS') { | ||
def c = 1 | ||
if (pastFailedJobs[buildObj.buildId]) { | ||
c += pastFailedJobs[buildObj.buildId].count |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something to think about in the future / next round: if count
isn't set for some reason (but the object exists), this will be null
and we'll get an execption from java.lang.Integer#plus
(shouldn't matter for this because we're the only ones creating this file and it should always be well-formed, but this will be a good reminder for if/when something goes wrong that this line might need to be more defensive against the inputs).
f01bbf4
to
3eae8a9
Compare
3eae8a9
to
7d61be7
Compare
This will give unbuilt items a chance to build before others get a retry.
(did a few successful run tests via Jenkins
trigger-arm32v6
job)