-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-1095: Batch submit workflows to Cromwell #272
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! I wonder if you can share some of the logic with the xx
module and move code into the batch
module?
api/src/wfl/service/cromwell.clj
Outdated
options | ||
labels) | ||
(post-workflow (str (api environment) "/batch")) | ||
(mapv #(:id %)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just want to check that that the order of the UUIDs returned by submit-workflows
matches the order of workflows passed in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I noticed from testing, but I couldn't find anything about that specifically in the Cromwell docs. I'll double check with them in slack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that is the behavior that the Cromwell team expects
(defn ^:private mock-really-submit-one-workflow [& _] | ||
(UUID/randomUUID)) | ||
(defn ^:private mock-submit-workflows [& _] | ||
[(UUID/randomUUID)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think this really needs to return as many UUIDs as sets of workflow inputs that are passed in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point!
There's definitely a lot of similar code between the two modules! I was getting caught up on WGS "skipping" some of the workflows but I can probably move that outside of |
api/src/wfl/service/cromwell.clj
Outdated
@@ -189,9 +189,8 @@ | |||
|
|||
(defn make-workflow-labels | |||
"Return the workflow labels from ENVIRONMENT, WDL, and INPUTS." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Return workflow labels for WDL."
api/src/wfl/service/cromwell.clj
Outdated
options | ||
labels) | ||
(post-workflow (str (api environment) "/batch")) | ||
(mapv #(:id %)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't that just (mapv :id)
?
"The WDL label applied to Cromwell metadata." | ||
{(keyword wfl/the-name) | ||
pipeline}) | ||
{(keyword wfl/the-name) pipeline}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete those docstrings. They have been out of date for many moons.
api/src/wfl/module/wgs.clj
Outdated
@@ -96,30 +96,13 @@ | |||
|
|||
(defn ^:private make-cromwell-inputs | |||
"Return inputs for reprocessing IN-GS into OUT-GS in ENVIRONMENT." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another dead docstring.
(-> (util/deep-merge cram-ref hack-task-level-values) | ||
(util/deep-merge {:references default-references}) | ||
(util/deep-merge (env-inputs environment)) | ||
(util/deep-merge workflow-inputs) | ||
(util/deep-merge inputs) | ||
(util/prefix-keys (keyword pipeline)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call deep-merge
once?
api/src/wfl/module/wgs.clj
Outdated
@@ -149,7 +132,7 @@ | |||
util/do-or-nil | |||
seq)) | |||
(processing? [in-gs] | |||
(->> {:label cromwell-label :status cromwell/active-statuses} | |||
(->> {:label (stringify-cromwell-label (first cromwell-labels)) :status cromwell/active-statuses} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just inline this? Or just stringify it at top-level and be done?
api/src/wfl/module/wgs.clj
Outdated
@@ -161,36 +144,46 @@ | |||
out-gs (str (all/slashify output) object)] | |||
(or (exists? out-gs) (processing? in-gs))))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make this (or (exists? out-gs) (processing? in-gs) false)
and see below.
api/src/wfl/module/wgs.clj
Outdated
#(assoc % :uuid util/uuid-nil | ||
:status "skipped" | ||
:updated (OffsetDateTime/now)) | ||
skipped-workflows) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
skipped-workflows -=> skipped
api/src/wfl/module/wgs.clj
Outdated
:status "skipped" | ||
:updated (OffsetDateTime/now)) | ||
skipped-workflows) | ||
(mapcat submit-batch! (group-by :options (get workflows nil)))))))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(get workflows nil)
-=> workflows
api/src/wfl/module/wgs.clj
Outdated
(map (partial make-cromwell-inputs env) workflows) | ||
(util/deep-merge default-options options) | ||
(merge cromwell-labels {:workload uuid}))))] | ||
(let [workflows (group-by #(skip-workflow? env workload %) (:workflows workload)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(let [{skipped? true workflows false} (group-by #(skip-workflow? env workload %) (:workflows workload)) …
or maybe just handle the skipped workflows conditionally in line?
Ah yes the |
2ff4983
to
945351f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry. Thought I'd approved this.
9fca563
to
6b45714
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, just one suggestion--since you're removing the workflow skipping, you can add ^:parallel
to test-start-wgs-workload
and test-exec-wgs-workload
in wfl.system.v1-endpoint-test to have them run in parallel
Purpose
https://broadinstitute.atlassian.net/browse/GH-1095
Changes
submit-workload!
to the batch moduleReview Instructions