Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GH-1181] Ingest Workflow Outputs into TDR #321

Merged
merged 16 commits into from
Feb 17, 2021

Conversation

ehigham
Copy link
Member

@ehigham ehigham commented Feb 10, 2021

Purpose

https://broadinstitute.atlassian.net/browse/GH-1181

This change is a demonstrator for generic workflow output ingest into TDR. The main idea is:

  • Use womtool to describe the outputs of the WDL. Using this description, build a type environment (a mapping from name -> type).
  • When ingesting outputs, lookup the type of the output and dispatch on the type to a specific handler to perform any work needed to ingest that object
    • e.g. ingest a file to get its TDR resource identifier
  • Transform the result into some form that can be ingested as a row in a dataset table (rename-gather)
    • map output names onto dataset table names (rename)
    • collect together the outputs (gather)
  • Upload result and have TDR ingest it.

The main file to look at is api/test/wfl/integration/datarepo_test.clj.
In the test, I've made use of thunks to delay work that can be performed later (e.g. polling). I've tried to add comments to clarify the intent. Let me know if you have questions.

Base automatically changed from ehigham/GH-1176-assemble-refbased-outputs-schema to main February 10, 2021 18:37
@ehigham ehigham force-pushed the ehigham/GH-1181-ingest-outputs branch 4 times, most recently from 7b3c475 to a62643a Compare February 11, 2021 21:14
@ehigham ehigham force-pushed the ehigham/GH-1181-ingest-outputs branch from a62643a to 4caf1bf Compare February 13, 2021 23:02
@@ -110,6 +92,49 @@
(create-local-database name)
(setup-local-database name)))

(defmacro with-fixtures
Copy link
Member Author

@ehigham ehigham Feb 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this should be a macro or a function.
If I made it a function, it would itself be a fixture (by symmetry) and passed to other with-fixtures calls which might be kinda nice. On the other hand, we'd have to use more partial applications.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guess you can do this as a macro?

(fixtures/with-fixtures
     [(fixtures/with-fixtures 
        [(fixtures/with-fixtures 
           [(fixtures/with-fixtures [])])])]
    identity)
=> [[[[]]]]

@ehigham ehigham marked this pull request as ready for review February 14, 2021 19:52
@tbl3rd tbl3rd self-requested a review February 16, 2021 15:04
@rexwangcc rexwangcc self-requested a review February 16, 2021 15:20
Copy link
Contributor

@tbl3rd tbl3rd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for upload-content.
Not sure how with-fixtures differs from use-fixtures,
but confident using it will bring understanding.

@ehigham
Copy link
Member Author

ehigham commented Feb 17, 2021

Thank you for upload-content.
Not sure how with-fixtures differs from use-fixtures,
but confident using it will bring understanding.

The difference comes from what I've called a fixture and what clojure test calls a fixture. In clojure test, a fixture has the form [1]:

(defn my-test-fixture [f]
  (create-resource)
  (f)
  (destroy-resource))

And it's applied to the testing environment via use-fixtures:

(use-fixtures my-test-fixture)

The thunk f passed to the fixture cannot have state passed into it - it is the test itself. Thus fixtures in the clojure test sense have to modify a global variable.
I've introduced a different kind of fixture that allows values to be passed into the function. They can be applied within the test itself have the form:

(defn with-my-fixture [data use]
  (let [resource (acquire data)]
    (try
      (use resource)
      (finally
        (release resource)))))

It's used like so

(deftest my-test
  (with-my-fixture state
    (fn [resource] ... )))

with-fixtures is a combinator for using multiple of these fixtures without excess indentation:

(deftest my-test
  (with-my-fixture0 state0
    (fn [resource0]
      (with-my-fixture1 state1
        (fn [resource1]
          (with-my-fixture2 state2
            (fn [resource2] ... )))))))

becomes:

(deftest my-test
  (with-fixtures [(with-my-fixture0 state0)
                 (with-my-fixture1 state1) 
                 ...
                 (with-my-fixtureN stateN))
    (fn [[resource0 resource1 ... resourceN]] ...)))

[1] https://clojuredocs.org/clojure.test/use-fixtures

Copy link
Contributor

@rexwangcc rexwangcc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with-fixtures is really interesting, I can see macro makes sense here.

api/test/wfl/integration/datarepo_test.clj Outdated Show resolved Hide resolved
@ehigham ehigham merged commit d9163f8 into main Feb 17, 2021
@ehigham ehigham deleted the ehigham/GH-1181-ingest-outputs branch February 17, 2021 18:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants