Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-1095: Batch submit workflows to Cromwell #272

Merged
merged 13 commits into from
Dec 10, 2020

Conversation

samanehsan
Copy link
Contributor

@samanehsan samanehsan commented Dec 8, 2020

Purpose

https://broadinstitute.atlassian.net/browse/GH-1095

Changes

  • Start/exec all items in a XX or WGS workload to Cromwell using the batch endpoint, instead of sending a request for each item. The batch endpoint will apply the same labels to every workflow, so item-specific labels were removed.
  • Move submit-workload! to the batch module
  • Remove WGS module skip workflow functionality

Review Instructions

  • No instructions.

Copy link
Member

@ehigham ehigham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! I wonder if you can share some of the logic with the xx module and move code into the batch module?

options
labels)
(post-workflow (str (api environment) "/batch"))
(mapv #(:id %))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to check that that the order of the UUIDs returned by submit-workflows matches the order of workflows passed in?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I noticed from testing, but I couldn't find anything about that specifically in the Cromwell docs. I'll double check with them in slack

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that is the behavior that the Cromwell team expects

(defn ^:private mock-really-submit-one-workflow [& _]
(UUID/randomUUID))
(defn ^:private mock-submit-workflows [& _]
[(UUID/randomUUID)])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think this really needs to return as many UUIDs as sets of workflow inputs that are passed in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!

@tbl3rd tbl3rd changed the title [1095] Batch submit workflows to Cromwell GH-1095: Batch submit workflows to Cromwell Dec 8, 2020
@samanehsan
Copy link
Contributor Author

I wonder if you can share some of the logic with the xx module and move code into the batch module?

There's definitely a lot of similar code between the two modules! I was getting caught up on WGS "skipping" some of the workflows but I can probably move that outside of submit-workload! to make that more generalized. Is that what you were thinking @ehigham?

@tbl3rd tbl3rd self-assigned this Dec 8, 2020
@@ -189,9 +189,8 @@

(defn make-workflow-labels
"Return the workflow labels from ENVIRONMENT, WDL, and INPUTS."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Return workflow labels for WDL."

options
labels)
(post-workflow (str (api environment) "/batch"))
(mapv #(:id %))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that just (mapv :id)?

"The WDL label applied to Cromwell metadata."
{(keyword wfl/the-name)
pipeline})
{(keyword wfl/the-name) pipeline})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete those docstrings. They have been out of date for many moons.

@@ -96,30 +96,13 @@

(defn ^:private make-cromwell-inputs
"Return inputs for reprocessing IN-GS into OUT-GS in ENVIRONMENT."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another dead docstring.

(-> (util/deep-merge cram-ref hack-task-level-values)
(util/deep-merge {:references default-references})
(util/deep-merge (env-inputs environment))
(util/deep-merge workflow-inputs)
(util/deep-merge inputs)
(util/prefix-keys (keyword pipeline))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call deep-merge once?

@@ -149,7 +132,7 @@
util/do-or-nil
seq))
(processing? [in-gs]
(->> {:label cromwell-label :status cromwell/active-statuses}
(->> {:label (stringify-cromwell-label (first cromwell-labels)) :status cromwell/active-statuses}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just inline this? Or just stringify it at top-level and be done?

@@ -161,36 +144,46 @@
out-gs (str (all/slashify output) object)]
(or (exists? out-gs) (processing? in-gs)))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this (or (exists? out-gs) (processing? in-gs) false) and see below.

#(assoc % :uuid util/uuid-nil
:status "skipped"
:updated (OffsetDateTime/now))
skipped-workflows)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

skipped-workflows -=> skipped

:status "skipped"
:updated (OffsetDateTime/now))
skipped-workflows)
(mapcat submit-batch! (group-by :options (get workflows nil))))))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(get workflows nil) -=> workflows

(map (partial make-cromwell-inputs env) workflows)
(util/deep-merge default-options options)
(merge cromwell-labels {:workload uuid}))))]
(let [workflows (group-by #(skip-workflow? env workload %) (:workflows workload))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(let [{skipped? true workflows false} (group-by #(skip-workflow? env workload %) (:workflows workload)) …
or maybe just handle the skipped workflows conditionally in line?

@ehigham
Copy link
Member

ehigham commented Dec 8, 2020

I wonder if you can share some of the logic with the xx module and move code into the batch module?

There's definitely a lot of similar code between the two modules! I was getting caught up on WGS "skipping" some of the workflows but I can probably move that outside of submit-workload! to make that more generalized. Is that what you were thinking @ehigham?

Ah yes the skip-workload? gift that keeps on giving. I wonder if we remove this and consider another way of tracking inputs/outputs in the future if it's something that's really required. Maybe I'll can create a jira to come back to this at another point.

Copy link
Contributor

@tbl3rd tbl3rd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. Thought I'd approved this.

Copy link
Contributor

@jack-r-warren jack-r-warren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, just one suggestion--since you're removing the workflow skipping, you can add ^:parallel to test-start-wgs-workload and test-exec-wgs-workload in wfl.system.v1-endpoint-test to have them run in parallel

@samanehsan samanehsan merged commit 064ab00 into main Dec 10, 2020
@rhiananthony rhiananthony deleted the se-batch-submit-to-cromwell branch March 17, 2021 15:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants