[GH-1354] Enforce `snapshotReaders` are email addresses #428
Merged
ehigham force-pushed the ehigham/GH-1354-validate-snapshot-readers branch from 2476dbe to e72367d (June 2, 2021 16:03)
RR: https://broadinstitute.atlassian.net/browse/GH-1354 This guards against erroring while creating snapshots if a user makes a typo.
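As an illustration of the check this PR describes, here is a minimal sketch of rejecting malformed `snapshotReaders` entries before any snapshot is created. It is written in Python rather than the project's Clojure and is not the PR's implementation: the regular expression, the function names, and the request shape (a map with a `snapshotReaders` list) are assumptions made for the example.

```python
import re

# Hypothetical, deliberately simple pattern: "something@domain.tld" with no
# whitespace. It is an assumption for this sketch, not the project's rule.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def is_email_address(value):
    """Return True when value is a string that looks like an email address."""
    return isinstance(value, str) and EMAIL_PATTERN.fullmatch(value) is not None


def invalid_snapshot_readers(source_request):
    """Return the snapshotReaders entries that do not look like emails."""
    readers = source_request.get("snapshotReaders", [])
    return [reader for reader in readers if not is_email_address(reader)]


def validate_source_request(source_request):
    """Raise ValueError up front if any reader is malformed."""
    bad = invalid_snapshot_readers(source_request)
    if bad:
        raise ValueError(f"snapshotReaders must be email addresses: {bad}")
    return source_request
```

Validating at request time means a typo such as a missing `@` is reported when the workload is created, instead of surfacing later as a failed snapshot creation.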
ehigham force-pushed the ehigham/GH-1354-validate-snapshot-readers branch from e72367d to e966447 (June 2, 2021 16:03)
okotsopoulos approved these changes on Jun 2, 2021
Looks good.
ehigham added a commit that referenced this pull request on Jun 2, 2021
RR: https://broadinstitute.atlassian.net/browse/GH-1354 This guards against erroring while creating snapshots if a user makes a typo.
ehigham added a commit that referenced this pull request on Jun 9, 2021
RR: https://broadinstitute.atlassian.net/browse/GH-1354 This guards against erroring while creating snapshots if a user makes a typo.
ehigham added a commit that referenced this pull request on Jun 9, 2021
* Snapshot creation uses datetime rather than date (#355) * Snapshot creation uses datetime rather than date * Making the name of the column used for datetime interval variable Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * bump develop to 0.7.0 (#360) * util/do-or-nil-silently should move to build.clj (#367) * util/do-or-nil-silently should move to build.clj * Fixed indentation to match standard * GH-1226 Support entity set creation when creating submissions. (#353) * First pass: create-submissions generates one submission per entity. * Cleaned up TODOs for first pass attempt, reformatted tests to pass lint step in build. * Second pass: support entity set creation when specifying >1 entity for a submission. Refactored bigquery table dump to pull out now-common code. * Address PR feedback To simplify, will generate an entity set even for the singleton input. Removed firecloud/consolidate-entities-to-set as a result. Need to pass in 'expression' to submission creation payload when specifying an entity set. Supplemented integration and unit tests. * Create an interface all COVID work can build on top of. (#369) * Create an interface all hornets work on top of. * Apply suggestions from code review * Update readme.md Build Boad. * GH-1272: Lists are not imported correctly into a Terra Workspace Entity (#368) * Boad -=> Board * Still doesn't round-trip on doubles and values like "19A". * Unit test wfl.tsv. * Ensure TSV mappability. * Ensure TSV is tabulatable. * Fix bug in assert-mapulatable!. * Add wfl.service/rawls.clj for interacting directly with Rawls API (#370) * Rename rawls/create-snapshot -> create-snapshot-reference (#373) We create the snapshot in the TDR, and link to it in the workspace via Rawls. * [GH-1215] Add snapshot_reference_id to Sarscov2IlluminaFull table schema (#375) * Add snapshot_reference_id to Sarscov2IlluminaFull table schema Because schema update hasn't run, altering existing table definition in place. 
* Documentation changes from local Postgres debugging, onboarding - Clarified order of operations when installing, using local Postgres - Fixed broken intra-doc links - Added docs readme with instruction for launching local documentation site * Add instruction for recreating wfl DB to development docs * Remove references to undefined documentation files * [GH-1293] fix test-create-submissions-for-entity-set (#378) Simply getting the submission again is not sufficient to guarantee that the workflow has been queued for execution. Poll instead. * Update documentation and infrastructure for release candidates (#377) * Update documentation and infrastructure for release candidates - creating release candiates - bashing release candidates Restore version override in cli.py. Re-tag latest docker images in cli.py so that version can be overwritten. Update IMAGES target in Makefile to add the :latest tag. Clean up :latest images on distclean target. * fix whitespace * Bump ssri from 6.0.1 to 6.0.2 in /ui (#382) Bumps [ssri](https://github.com/npm/ssri) from 6.0.1 to 6.0.2. - [Release notes](https://github.com/npm/ssri/releases) - [Changelog](https://github.com/npm/ssri/blob/v6.0.2/CHANGELOG.md) - [Commits](npm/ssri@v6.0.1...v6.0.2) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add interactive script ahead of covid-19 demo (#380) * add interactive script ahead of covid-19 demo tomorrow * fix rawls test failure * reformat test * share demo remources with cdc-covid-surveillance firecloud group * corrections from review * format for consistency * clone a dev workspace instead of the prod one * fix test-import-snapshot - workspace changed * fix arrays test failures * [GH-1213] Update DB schema for COVID workload creation (#381) * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table. 
* Initial creat covid workload commit * Verifies dataset as well * Update DB schema. * Address comments. * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Resolve merge conflicts. * Fix issues generated from rebasing. * Fix the failing integration tests. Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Add multimethods for source, executor and sink operations (#383) * Add multimethods for source, executor and sink operations Add skeleton implementations for tdr-source * make use of `!` more consistent? 
* add create operations for source, executor and sink * [GH-1215] Import snapshots within COVID workload (#376) - Data model change: a snapshot_reference_id is linked to a workspace and should be associated with the executor, not the source (TDR in our use case) - Added covid/get-imported-snapshot-reference - nil or snapshot reference from Rawls for snapshot_reference_id in executor details instance (to be created from TerraExecutorDetails) - Added covid/import-snapshot! - import snapshot to workspace, writing to DB if successful - Added integration tests (incomplete coverage) * [GH-1295] Use Rawls to Import Workflow Outputs (#384) RR: https://broadinstitute.atlassian.net/browse/GH-1295 firecloud's flexibleImportEntities has a size limit on the TSV file you POST. Unfortunately, one sarscov2_illumina_full workflow's outputs exceeds this. We can work around this issue by using Rawl's batchUpsert. This is slightly different as it takes a list of operations on how to construct the entity rather than the entity serialised to TSV. In this PR, I've demonstrated how we can use Rawls to import the workflow's outputs into the workspace. I've also updated the demo to do this for a workflow that has already passed. * GH-1285: [COVID] Launch submissions through Rawls, and keep track of them (#372) * Simplify day intervals. * Document wfl.tools.snapshots. * Map name over keyword arguments. * Force the production Rawls. * Checkpoint groups getters. * Patch rebase conflict. * Implement start-covid-workload! kinda. * Checkpoint start-covid-workload! test. * Add covid-workload-request to support unit. * clojure -M:format * Move Rich comment into a unit test. * Respond to comments and tidy up a little * Remove unit test now covered by integration. * Document clojure.test/test-vars to remind me. * Bump url-parse from 1.4.7 to 1.5.1 in /ui (#388) Bumps [url-parse](https://github.com/unshiftio/url-parse) from 1.4.7 to 1.5.1. 
- [Release notes](https://github.com/unshiftio/url-parse/releases) - [Commits](unshiftio/url-parse@1.4.7...1.5.1) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update db schema again based on lasted discussions. (#387) * Update db schemas. * Make the names more clear. * Address comments. * Switch to SQL syntax from XML. * Address more comments. * one more update. * Bump y18n from 4.0.0 to 4.0.1 in /ui (#352) Bumps [y18n](https://github.com/yargs/y18n) from 4.0.0 to 4.0.1. - [Release notes](https://github.com/yargs/y18n/releases) - [Changelog](https://github.com/yargs/y18n/blob/master/CHANGELOG.md) - [Commits](https://github.com/yargs/y18n/commits) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1301] Notify Workload Watchers when "User Visible" Exceptions Occur (#386) RR: https://broadinstitute.atlassian.net/browse/GH-1301 Introduce wfl.util.UserVisibleException - a new type of exception that we should handle to throw and handle errors that users are meant to see. Re-wire the background loop to handle UserVisibleException and also be more exception safe. Start to plumb in email notifications. * Bump hosted-git-info from 2.8.8 to 2.8.9 in /ui (#392) Bumps [hosted-git-info](https://github.com/npm/hosted-git-info) from 2.8.8 to 2.8.9. - [Release notes](https://github.com/npm/hosted-git-info/releases) - [Changelog](https://github.com/npm/hosted-git-info/blob/v2.8.9/CHANGELOG.md) - [Commits](npm/hosted-git-info@v2.8.8...v2.8.9) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump lodash from 4.17.19 to 4.17.21 in /ui (#390) Bumps [lodash](https://github.com/lodash/lodash) from 4.17.19 to 4.17.21. 
- [Release notes](https://github.com/lodash/lodash/releases) - [Commits](lodash/lodash@4.17.19...4.17.21) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1311]: Create/Load COVID workloads (#389) RR: https://broadinstitute.atlassian.net/browse/GH-1311 Add support for loading and creating COVID workloads Add PR test hook for database/ directory Tweak database schema to make it a bit easier to create workloads * [GH-1313] TerraWorkspaceSink/update-sink! (#394) * Add implementation for TerraWorkspaceSink/update-sink! Rework temporary-postresql-database fixture to override wfl-db-config * integration test passes * add make-queue-from-list to be clearer * fix test * use max(id) + 1 for new output row instead of table length + 1 * [GH-1215] COVID TerraExecutor/update-executor! Added update-terra-executor and required helpers: - coerce an available snapshot from source queue to a snapshot reference - add entry point for submission creation - write new workflows to executor details instance Refined queue peek and pop for source and executor. Notably, queue peeks return a meaningful object (snapshot for source, workflow for executor) rather than a database record. Removed now-deprecated code from earlier update loop design. Added integration test. * [GH-1306] Add `workflows` multimethod for workloads (#396) RR: https://broadinstitute.atlassian.net/browse/GH-1306 Add workflows multimethod Change all locations where workflows are accessed via the :workflows keyword to use this function. Implement workflows for COVID I've added a simple test to make sure the SQL is correct Awaiting executor implementation to test with more than 0 workflows. * [GH-1222] Check for new work in TDR (#371) * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! 
implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Release v0.6.1 Patch (#365) * [GH-1284] Fix docker image generation (#363) The `Dockerfile` is being invoked in the root directory - we don't need to add the parent directory, just the current one. * [GH-1277] Prevent update loop from terminating (#364) Catch all throwables to make sure a workload failure doesn't bring down workflow-launcher. * update changelog + bump patch to v0.6.1 * Remove outdated test code. * Remove unused utilities code. * Checkpoint. * Bump timeout for system tests, as it takes longer to finish now. * Update docs to include how to override envs for local testing. * Commit what I had before surgery. * Wrap up. * Add a protocol extension for psql array. * Update DB schema to store row ids for TDR. * Also store TDR row-ids and query interval info in DB. * Move protocol test to integration. * Update utc-now util also randomize the snapshot name with request datetime. 
* Resolve merge conflicts. * Fix weirdness during rebasing... * Move the logic into the multimethod interface. * Take out the utc-now util function. * Move protocol test to jdbc_test and reformat code. * Remove duplicate code. * Remove comment blocks. * Fix broken MD during rebasing. * Crazy formats. * Fix a bug with the snapshot naming. * Add the snapshot_id back to DB schema. * Include job-status in Source Details table. * update TDR jobs that are still running as part of source update loop. * Fix timestamp formatting. * Exclude the problematic test. * Fix a hiccup. * Fix formatting and a small issue with job-result from TDR. * Minor fixes around covid module. * Add unit and integration tests. * Lint. * Let DB control the primary key auto increment. * Fix the issues identified by the tests. * Fix an issue with the jdbc protocol test. * Update unit tests. * Call the multi-method in covid test to make sure everything is plumbed correctly. * Log when TDR snapshot creation job failed. * Address comments. * Increase the interval of system test polling. Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * [GH-1314] expose workflows via `GET /api/v1/workload/{uuid}/workflows` (#397) RR: https://broadinstitute.atlassian.net/browse/GH-1314 Adding a separate workflows selector will allow us to decouple the workload representation from serialisation artefacts that aren’t useful for a user. In this PR, I'm changing the workload data model such that `GET /api/v1/workloads` to only return workload metadata (everything but the workflows) and adding `GET /api/v1/workloads/{uuid}/workflows` to fetch the workflows in the workload. * [GH-1317] start the workload source (#399) RR: https://broadinstitute.atlassian.net/browse/GH-1317 The source of the workload needs to be told that the workload has been started so it can start looking in he datarepo for new data. 
I've added a new multi-method for the source to start it, called on start-workload!. This writes last_checked to be the time when we should start listening for new data. * TerraExecutor/update-executor should update active, failed workflow statuses (#398) 1. Within a single tx, fetch executor details workflow records with active or failed status. 2. For each record, use Firecloud to fetch up-to-date workflow object. Update the record with the corresponding new workflow status. 3. Within a single tx, write new record statuses to DB. Updated integration tests. * GH-1307 GH-1312 Update method configuration using snapshot reference (#402) In new method `update-method-configuration!`: 1. GET the method configuration from Firecloud 2. If the Firecloud-derived mc version doesn't match our DB record, throw 3. Update the method configuration with the snapshot reference name and POST to Firecloud 4. Increment our DB record's mc version Supplemented integration tests. * [GH-1322] TDR Snapshots Source (#403) RR: https://broadinstitute.atlassian.net/browse/GH-1322 Adding a new source to workflow-launcher that's a list of snapshots. * [GH-1318] Stop Workload Source (#400) RR: https://broadinstitute.atlassian.net/browse/GH-1318 Add `stop-source` multimethod and implement for the TerraDataRepoSource. When stopped, the source will stop looking for new rows to snaphot in the TDR. Also added logic for when the workload is finished. In this case, once started the workload is finished when it has been stopped and both source and executor queues are empty. To do this, I added another multimethod to return the length of the queues. For the executor, the length of the queue is the number of workflows that have not been consumed or aborted. For the source, the length of the queue is the number of unconsumed records whose snapshot job is not 'failed'. Note that this is slightly at odds with peek, where peek is meant to return objects that are ready to be consumed. 
* GH-1216: [TDR New Work Detection] Extend System Tests for COVID processing (#395) * Update vault to work around NoClassDefFoundError. * Checkpoint failing with PSQLException: FATAL: database ... does not exist database "wfltest147f1cbf976046e68e6beb5eef728674" does not exist * Spec validation still losing sink and source keys. * Update most dependencies. * Restore batch or covid request. * Checkpoint "working" spec.clj. * Lift strip-internals out of endpoint API transaction. * New system test passing again. * Use Ed's strip-internals fix. * clojure -M:format * Respond to comments but now there is no :name in response. * clojure -M:format * Make ::name optional for now. * [GH-1320] add `snapshotReaders` to tdr source (#405) RR: https://broadinstitute.atlassian.net/browse/GH-1320 Adding a list of people who want to access to the snapshots we create to the source request. * Remove liquibase-core from build dependencies. (#407) * [GH-1213] Create covid workload verifications (#379) * Update the specs. * Commit pair programming results. * Commit updated implementation (WIP). * organizing the keyworrds in spec.clj * It seems spec/or requires ky-value pairs! Fix it. * Added start of workflow-request for create-covid-workload function * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table. * We cannot use table as col name in SQL. * Checkpoint. * More. * Initial creat covid workload commit * Verifies dataset as well * Verifications of the source, executor and sink * More verification of sink and executor * Verification of source, sink, executor pre review * Fixing broken test and docstrings instead of comments * Update api/src/wfl/service/rawls.clj * Fix and rewrite the test. * Fix the test part II. 
* Adding more tests for the create-covid-workload and adding messages for the cause of a thrown exception * Updated tests and create covid workload verifications * removing commente3d out code * removing get-workspace function from rawls code since we use firecloud to make this call * Adding lint changes * split up tests. Added verification of dataset table and columns * Lint changes * Fixing merge conflict and lint issue * Fixing import * Updated to base off of develop changes * Adding validation to create workflow logic * Updates to test, spec and create method * Working validation tests * linted. Also removed old Liquibase files no longer used * cleanup of testing namespace * removing try/catch since in verify source function * Moving validation to top of function * Moving docstrings for two functions * Fixing two broken integration tests for covid * Using variables bound in the TDR section for validation * Moving some code within file to group liike code together * Fixed all integration tests. Reorganized location of workload methods in covid module * Fixed all integration tests. Reorganized location of workload methods in covid module * Removing extraneous description for integration tests relating to create workload method * Removing extraneous description for integration tests relating to create workload method * linted * Converted verification methods to multimethods dispatched off name * Updated name of validate source, executor, sink functions * Fixed merge conflicts and linted * removing unnecessary changeset * reverted spec.clj to develop branch. Also cleaned up some cruft * reverted spec.clj to develop branch. Also cleaned up some cruft * Making get-workspace-json function private and adding a public function which checks the workload. Also cleaned up the create verification methods and removed some unnecessary variables. 
Also moved source/sink/executor verification functions to the source/sink/executor sections * Adding default implementations for the create validation functions * linted * fixing firecloud method and tests * Fixing merge conflicts with develop branch * Adding flags for skipping validation of source, executor and sink in requests (#404) * Adding flags for skipping validation to source, executor and sink * linted * Fixing merge conflict Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * linting fixes * linted and fixes some nits * Fixing nits * Fixing nits * lint fixes Co-authored-by: Rex <rexwangcc@gmail.com> Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * GH-1316 Create submission within TerraExecutor update (#406) GH-1316 Create submission within TerraExecutor update Fully implemented `create-submission!` stub, pulling in existing `update-method-configuration!`. Simplified code which previously passed around updated executor with incremented method configuration version. Docstrings converted to use backticked parameter names to preserve camel case. Expanded integration testing. * [GH-1323] to-edn (#409) RR: https://broadinstitute.atlassian.net/browse/GH-1323 https://broadinstitute.atlassian.net/browse/GH-1321 Added util/to-edn a general :type dispatch function to coerce an object into a user-friendly EDN view that gets returned to the user. Allow terra executor to update details table with workflow IDs between update calls Introduce `done?` to test when the workload should be marked `:finished` Fetch workflow inputs/outputs from firecloud and transform them into a more workable form Fix jdbc double exaludations * GH-1298: TerraWorkspaceSink - verify target entity columns exist (#410) * Worried that :missing will get swamped by :attributes though. 
* [GH-1327] Fix Start After Stop (#411) In this change I'm fixing the strange behaviour where we can start a covid workload after it was stopped. To do this, I've made it illegal to stop a workload before it's been started. I've done this because stop before start seems strange to me too. The purpose of stop was to stop getting new data from the source but allow all data in the workload pipeline to get flushed (i.e. consume all created snapshots and write all workflows back). Stop was not meant for aborting a workload and cancelling/aborting all active processing. If there is a use case for such a thing (and I think there is), I think we should expose a new endpoint called `POST /api/v1/abort` or something. Of course this PR is pretty small, so if the team feels strongly about stop before start I can find another solution.
* [GH-1327] update system tests after /stop changes (#412) Removing verification that we can stop before starting and no workflows are run.
* [GH-1333] Fix NullPointerException in TDR Source (#418) RR: https://broadinstitute.atlassian.net/browse/GH-1333 Caused by database changes not propagated to source code. Destructuring old keys for the source led to null values.
* [GH-1336] Fix continual snapshot creation (#419) Our BigQuery query was wrong as we were simply calling `toString` on a java.sql.Timestamp. This led to intervals like `['2021-05-26 15:31:18.287641' '2021-05-26T20:00:13']`. The space was causing the problem and BigQuery interpreted that as a date. The fix is to ensure the lower and upper bounds of our query-between interval are of the same type by coercing java.sql.Timestamp `last_checked` into a java.time.OffsetDateTime.
* [GH-1337] fix create snapshot job always failing RR: https://broadinstitute.atlassian.net/browse/GH-1337 Caused by destructuring the tdr job response with the wrong key.
* [GH-1338] Fix wfl not discovering new dataset rows RR: https://broadinstitute.atlassian.net/browse/GH-1338 This isn't a complete fix sadly but it at least solves the problem for low ingest rate projects like COVID. For now, only updating `last_checked` when we successfully found new rows in the dataset. * [GH-1339] Prevent Snapshot Job IDs from being clobbered with nils RR: https://broadinstitute.atlassian.net/browse/GH-1339 When we update the status of snapshot creation jobs, we only get the job metadata when the job is complete.As a consequence, `nil`s are propagated up the call stack and the database record is clobbered with nils, including the `snapshot_creation_job_id` column. To fix this, always return the metadata and log the failure should one occur. * [GH-1331] document `/stop` and `/workflows` (#417) [GH-1331] Document `/stop` and `/workflows` endpoints RR: https://broadinstitute.atlassian.net/browse/GH-1331 Updating documentation for changes in v0.7.0: - workload-response no longer returning workflows - `GET /api/v1/workload/{uuid}/workflows` - `POST /api/v1/stop` Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * Add test-covid-workload to parallel test group (#424) … and fix method configuration namespace. * [GH-1341] Fix accepting malformed labels in workload request RR: https://broadinstitute.atlassian.net/browse/GH-1341 Enforce labels are of the form "name:value" where - name starts with a letter followed by any combination of letters, numbers, underscores and dashes - value is any non-blank string not containing `:` I had to include labels in the batch workload request as the coercion layer got confused when you gave it bad labels. * [GH-1344] Enforce valid email addresses in workload `watchers` * [GH-1348] Remove :pipeline from COVID workload request (#426) RR: https://broadinstitute.atlassian.net/browse/GH-1348 The :pipeline attribute of the workload request is meaningless for COVID-type workloads. 
Note that these workloads aren't even specific to COVID-19 processing, rather a generalisation. In this PR, I've removed the :pipeline attribute. To do this, all workload operations on maps with a nil :pipeline will be implemented by the functions in the covid namespace. * [GH-1350] fix Terra Workspace sink validation (#427) RR: https://broadinstitute.atlassian.net/browse/GH-1350 The "Terra Workspace" sink validation did not validate that the attributes listed in fromOutputs exist in the specified entity type in the workspace. This PR fixes this. * Bump dns-packet from 1.3.1 to 1.3.4 in /ui Bumps [dns-packet](https://github.com/mafintosh/dns-packet) from 1.3.1 to 1.3.4. - [Release notes](https://github.com/mafintosh/dns-packet/releases) - [Changelog](https://github.com/mafintosh/dns-packet/blob/master/CHANGELOG.md) - [Commits](mafintosh/dns-packet@v1.3.1...v1.3.4) Signed-off-by: dependabot[bot] <support@github.com> * [GH-1325] Get or import snapshot (#422) Within rawls namespace: - get-snapshot-references: lazily paginate through all snapshot references in a workspace. - create-or-get-snapshot-reference: if importing the snapshot fails with a 409, find the first existing snapshot reference matching the snapshot id. * [GH-1354] Enforce `snapshotReaders` are email addresses (#428) RR: https://broadinstitute.atlassian.net/browse/GH-1354 This guards against erroring while creating snapshots if a user makes a typo. * [GH-1356] Fix Coercion Failure for TDR Snapshots Source (#429) RR: https://broadinstitute.atlassian.net/browse/GH-1356 Caused by metosin/reitit#494 Fix as per https://broadinstitute.atlassian.net/browse/GH-1348 Included integration test for reitit coercions to catch this sooner. * [GH-1357] Fix failure to sink workflow outputs (#431) RR: https://broadinstitute.atlassian.net/browse/GH-1357 I assumed (wrongly) the way Rawls represented workflows in a submission was the same as how cromwell represented workflows. 
I also mocked these representations incorrectly in tests. To fix this, only use the submission to fetch workflow IDs, then use firecloud's workflow and workflow/outputs endpoints so we don't have to adapt/mock multiple data models. * [GH-1359] Overwrite Workspace Entity On Sink (#432) RR: https://broadinstitute.atlassian.net/browse/GH-1359 When sinking a resubmitted workflow, the TerraWorkspaceSink would append to the existing entity in the workspace. This leaves the entity in an even more wrong state than the reason the sample was reanalysed. Our fix (as agreed with cloreth) is to clobber the entity in the workspace by deleting the entity if it exists then upserting the new one. * Fix broken requirement install command in docs readme (#434) * Move docstring and document when true. (#436) The PR check fails in the test that Ed excludes on another branch. * [GH-1361] Disable Failing Tests for v0.7.0 Release (#435) RR: https://broadinstitute.atlassian.net/browse/GH-1361 This is a low risk change for v0.7.0 as this test doess not exercise reachable product code. Disabling: - wfl.integration.modules.arrays-test/test-update-arrays-workload! * [GH-1363] Fix Sporadic Firecloud Test Failures (#437) RR: https://broadinstitute.atlassian.net/browse/GH-1363 Caused by firecloud returning "Launching" as a workflow status - we were testing that the status was in #{"Queued" "Submitted"}. Fix util/poll so logical FALSE can be returned from the action. * Bump ws from 6.2.1 to 6.2.2 in /ui (#433) Bumps [ws](https://github.com/websockets/ws) from 6.2.1 to 6.2.2. - [Release notes](https://github.com/websockets/ws/releases) - [Commits](https://github.com/websockets/ws/commits) --- updated-dependencies: - dependency-name: ws dependency-type: indirect ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * update changelog for v0.7.0 Co-authored-by: Rhian Anthony <rhian.anthony@gmail.com> Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Olivia Kotsopoulos <okotsopo@broadinstitute.org> Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Rex <rexwangcc@gmail.com>
rhiananthony
added a commit
that referenced
this pull request
Sep 21, 2021
* Snapshot creation uses datetime rather than date (#355) * Snapshot creation uses datetime rather than date * Making the name of the column used for datetime interval variable Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * bump develop to 0.7.0 (#360) * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md in develop (#362) cherry-pick from main * [GH-1284] Fix docker image generation (#363) The `Dockerfile` is being invoked in the root directory - we don't need to add the parent directory, just the current one. * [GH-1277] Prevent update loop from terminating (#364) Catch all throwables to make sure a workload failure doesn't bring down workflow-launcher. * util/do-or-nil-silently should move to build.clj (#367) * util/do-or-nil-silently should move to build.clj * Fixed indentation to match standard * GH-1226 Support entity set creation when creating submissions. 
(#353) * First pass: create-submissions generates one submission per entity. * Cleaned up TODOs for first pass attempt, reformatted tests to pass lint step in build. * Second pass: support entity set creation when specifying >1 entity for a submission. Refactored bigquery table dump to pull out now-common code. * Address PR feedback To simplify, will generate an entity set even for the singleton input. Removed firecloud/consolidate-entities-to-set as a result. Need to pass in 'expression' to submission creation payload when specifying an entity set. Supplemented integration and unit tests. * Merge v0.6.1 into develop (#366) * Create an interface all COVID work can build on top of. (#369) * Create an interface all hornets work on top of. * Apply suggestions from code review * Update readme.md Build Boad. * GH-1272: Lists are not imported correctly into a Terra Workspace Entity (#368) * Boad -=> Board * Still doesn't round-trip on doubles and values like "19A". * Unit test wfl.tsv. * Ensure TSV mappability. * Ensure TSV is tabulatable. * Fix bug in assert-mapulatable!. * Add wfl.service/rawls.clj for interacting directly with Rawls API (#370) * Rename rawls/create-snapshot -> create-snapshot-reference (#373) We create the snapshot in the TDR, and link to it in the workspace via Rawls. * [GH-1215] Add snapshot_reference_id to Sarscov2IlluminaFull table schema (#375) * Add snapshot_reference_id to Sarscov2IlluminaFull table schema Because schema update hasn't run, altering existing table definition in place. 
* Documentation changes from local Postgres debugging, onboarding - Clarified order of operations when installing, using local Postgres - Fixed broken intra-doc links - Added docs readme with instruction for launching local documentation site * Add instruction for recreating wfl DB to development docs * Remove references to undefined documentation files * [GH-1293] fix test-create-submissions-for-entity-set (#378) Simply getting the submission again is not sufficient to guarantee that the workflow has been queued for execution. Poll instead. * Update documentation and infrastructure for release candidates (#377) * Update documentation and infrastructure for release candidates - creating release candiates - bashing release candidates Restore version override in cli.py. Re-tag latest docker images in cli.py so that version can be overwritten. Update IMAGES target in Makefile to add the :latest tag. Clean up :latest images on distclean target. * fix whitespace * Bump ssri from 6.0.1 to 6.0.2 in /ui (#382) Bumps [ssri](https://github.com/npm/ssri) from 6.0.1 to 6.0.2. - [Release notes](https://github.com/npm/ssri/releases) - [Changelog](https://github.com/npm/ssri/blob/v6.0.2/CHANGELOG.md) - [Commits](npm/ssri@v6.0.1...v6.0.2) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add interactive script ahead of covid-19 demo (#380) * add interactive script ahead of covid-19 demo tomorrow * fix rawls test failure * reformat test * share demo remources with cdc-covid-surveillance firecloud group * corrections from review * format for consistency * clone a dev workspace instead of the prod one * fix test-import-snapshot - workspace changed * fix arrays test failures * [GH-1213] Update DB schema for COVID workload creation (#381) * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table. 
* Initial creat covid workload commit * Verifies dataset as well * Update DB schema. * Address comments. * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Resolve merge conflicts. * Fix issues generated from rebasing. * Fix the failing integration tests. Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Add multimethods for source, executor and sink operations (#383) * Add multimethods for source, executor and sink operations Add skeleton implementations for tdr-source * make use of `!` more consistent? 
* add create operations for source, executor and sink * [GH-1215] Import snapshots within COVID workload (#376) - Data model change: a snapshot_reference_id is linked to a workspace and should be associated with the executor, not the source (TDR in our use case) - Added covid/get-imported-snapshot-reference - nil or snapshot reference from Rawls for snapshot_reference_id in executor details instance (to be created from TerraExecutorDetails) - Added covid/import-snapshot! - import snapshot to workspace, writing to DB if successful - Added integration tests (incomplete coverage) * [GH-1295] Use Rawls to Import Workflow Outputs (#384) RR: https://broadinstitute.atlassian.net/browse/GH-1295 firecloud's flexibleImportEntities has a size limit on the TSV file you POST. Unfortunately, one sarscov2_illumina_full workflow's outputs exceeds this. We can work around this issue by using Rawl's batchUpsert. This is slightly different as it takes a list of operations on how to construct the entity rather than the entity serialised to TSV. In this PR, I've demonstrated how we can use Rawls to import the workflow's outputs into the workspace. I've also updated the demo to do this for a workflow that has already passed. * GH-1285: [COVID] Launch submissions through Rawls, and keep track of them (#372) * Simplify day intervals. * Document wfl.tools.snapshots. * Map name over keyword arguments. * Force the production Rawls. * Checkpoint groups getters. * Patch rebase conflict. * Implement start-covid-workload! kinda. * Checkpoint start-covid-workload! test. * Add covid-workload-request to support unit. * clojure -M:format * Move Rich comment into a unit test. * Respond to comments and tidy up a little * Remove unit test now covered by integration. * Document clojure.test/test-vars to remind me. * Bump url-parse from 1.4.7 to 1.5.1 in /ui (#388) Bumps [url-parse](https://github.com/unshiftio/url-parse) from 1.4.7 to 1.5.1. 
- [Release notes](https://github.com/unshiftio/url-parse/releases) - [Commits](unshiftio/url-parse@1.4.7...1.5.1) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update db schema again based on lasted discussions. (#387) * Update db schemas. * Make the names more clear. * Address comments. * Switch to SQL syntax from XML. * Address more comments. * one more update. * Bump y18n from 4.0.0 to 4.0.1 in /ui (#352) Bumps [y18n](https://github.com/yargs/y18n) from 4.0.0 to 4.0.1. - [Release notes](https://github.com/yargs/y18n/releases) - [Changelog](https://github.com/yargs/y18n/blob/master/CHANGELOG.md) - [Commits](https://github.com/yargs/y18n/commits) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1301] Notify Workload Watchers when "User Visible" Exceptions Occur (#386) RR: https://broadinstitute.atlassian.net/browse/GH-1301 Introduce wfl.util.UserVisibleException - a new type of exception that we should handle to throw and handle errors that users are meant to see. Re-wire the background loop to handle UserVisibleException and also be more exception safe. Start to plumb in email notifications. * Bump hosted-git-info from 2.8.8 to 2.8.9 in /ui (#392) Bumps [hosted-git-info](https://github.com/npm/hosted-git-info) from 2.8.8 to 2.8.9. - [Release notes](https://github.com/npm/hosted-git-info/releases) - [Changelog](https://github.com/npm/hosted-git-info/blob/v2.8.9/CHANGELOG.md) - [Commits](npm/hosted-git-info@v2.8.8...v2.8.9) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump lodash from 4.17.19 to 4.17.21 in /ui (#390) Bumps [lodash](https://github.com/lodash/lodash) from 4.17.19 to 4.17.21. 
- [Release notes](https://github.com/lodash/lodash/releases) - [Commits](lodash/lodash@4.17.19...4.17.21) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1311]: Create/Load COVID workloads (#389) RR: https://broadinstitute.atlassian.net/browse/GH-1311 Add support for loading and creating COVID workloads Add PR test hook for database/ directory Tweak database schema to make it a bit easier to create workloads * [GH-1313] TerraWorkspaceSink/update-sink! (#394) * Add implementation for TerraWorkspaceSink/update-sink! Rework temporary-postresql-database fixture to override wfl-db-config * integration test passes * add make-queue-from-list to be clearer * fix test * use max(id) + 1 for new output row instead of table length + 1 * [GH-1215] COVID TerraExecutor/update-executor! Added update-terra-executor and required helpers: - coerce an available snapshot from source queue to a snapshot reference - add entry point for submission creation - write new workflows to executor details instance Refined queue peek and pop for source and executor. Notably, queue peeks return a meaningful object (snapshot for source, workflow for executor) rather than a database record. Removed now-deprecated code from earlier update loop design. Added integration test. * [GH-1306] Add `workflows` multimethod for workloads (#396) RR: https://broadinstitute.atlassian.net/browse/GH-1306 Add workflows multimethod Change all locations where workflows are accessed via the :workflows keyword to use this function. Implement workflows for COVID I've added a simple test to make sure the SQL is correct Awaiting executor implementation to test with more than 0 workflows. * [GH-1222] Check for new work in TDR (#371) * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! 
implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Release v0.6.1 Patch (#365) * [GH-1284] Fix docker image generation (#363) The `Dockerfile` is being invoked in the root directory - we don't need to add the parent directory, just the current one. * [GH-1277] Prevent update loop from terminating (#364) Catch all throwables to make sure a workload failure doesn't bring down workflow-launcher. * update changelog + bump patch to v0.6.1 * Remove outdated test code. * Remove unused utilities code. * Checkpoint. * Bump timeout for system tests, as it takes longer to finish now. * Update docs to include how to override envs for local testing. * Commit what I had before surgery. * Wrap up. * Add a protocol extension for psql array. * Update DB schema to store row ids for TDR. * Also store TDR row-ids and query interval info in DB. * Move protocol test to integration. * Update utc-now util also randomize the snapshot name with request datetime. 
* Resolve merge conflicts. * Fix weirdness during rebasing... * Move the logic into the multimethod interface. * Take out the utc-now util function. * Move protocol test to jdbc_test and reformat code. * Remove duplicate code. * Remove comment blocks. * Fix broken MD during rebasing. * Crazy formats. * Fix a bug with the snapshot naming. * Add the snapshot_id back to DB schema. * Include job-status in Source Details table. * update TDR jobs that are still running as part of source update loop. * Fix timestamp formatting. * Exclude the problematic test. * Fix a hiccup. * Fix formatting and a small issue with job-result from TDR. * Minor fixes around covid module. * Add unit and integration tests. * Lint. * Let DB control the primary key auto increment. * Fix the issues identified by the tests. * Fix an issue with the jdbc protocol test. * Update unit tests. * Call the multi-method in covid test to make sure everything is plumbed correctly. * Log when TDR snapshot creation job failed. * Address comments. * Increase the interval of system test polling. Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * [GH-1314] expose workflows via `GET /api/v1/workload/{uuid}/workflows` (#397) RR: https://broadinstitute.atlassian.net/browse/GH-1314 Adding a separate workflows selector will allow us to decouple the workload representation from serialisation artefacts that aren’t useful for a user. In this PR, I'm changing the workload data model such that `GET /api/v1/workloads` to only return workload metadata (everything but the workflows) and adding `GET /api/v1/workloads/{uuid}/workflows` to fetch the workflows in the workload. * [GH-1317] start the workload source (#399) RR: https://broadinstitute.atlassian.net/browse/GH-1317 The source of the workload needs to be told that the workload has been started so it can start looking in he datarepo for new data. 
I've added a new multi-method for the source to start it, called on start-workload!. This writes last_checked to be the time when we should start listening for new data. * TerraExecutor/update-executor should update active, failed workflow statuses (#398) 1. Within a single tx, fetch executor details workflow records with active or failed status. 2. For each record, use Firecloud to fetch the up-to-date workflow object. Update the record with the corresponding new workflow status. 3. Within a single tx, write new record statuses to DB. Updated integration tests. * GH-1307 GH-1312 Update method configuration using snapshot reference (#402) In new method `update-method-configuration!`: 1. GET the method configuration from Firecloud 2. If the Firecloud-derived mc version doesn't match our DB record, throw 3. Update the method configuration with the snapshot reference name and POST to Firecloud 4. Increment our DB record's mc version Supplemented integration tests. * [GH-1322] TDR Snapshots Source (#403) RR: https://broadinstitute.atlassian.net/browse/GH-1322 Adding a new source to workflow-launcher that's a list of snapshots. * [GH-1318] Stop Workload Source (#400) RR: https://broadinstitute.atlassian.net/browse/GH-1318 Add `stop-source` multimethod and implement for the TerraDataRepoSource. When stopped, the source will stop looking for new rows to snapshot in the TDR. Also added logic for when the workload is finished. In this case, once started the workload is finished when it has been stopped and both source and executor queues are empty. To do this, I added another multimethod to return the length of the queues. For the executor, the length of the queue is the number of workflows that have not been consumed or aborted. For the source, the length of the queue is the number of unconsumed records whose snapshot job is not 'failed'. Note that this is slightly at odds with peek, where peek is meant to return objects that are ready to be consumed.
* GH-1216: [TDR New Work Detection] Extend System Tests for COVID processing (#395) * Update vault to work around NoClassDefFoundError. * Checkpoint failing with PSQLException: FATAL: database ... does not exist database "wfltest147f1cbf976046e68e6beb5eef728674" does not exist * Spec validation still losing sink and source keys. * Update most dependencies. * Restore batch or covid request. * Checkpoint "working" spec.clj. * Lift strip-internals out of endpoint API transaction. * New system test passing again. * Use Ed's strip-internals fix. * clojure -M:format * Respond to comments but now there is no :name in response. * clojure -M:format * Make ::name optional for now. * [GH-1320] add `snapshotReaders` to tdr source (#405) RR: https://broadinstitute.atlassian.net/browse/GH-1320 Adding a list of people who want to access to the snapshots we create to the source request. * Remove liquibase-core from build dependencies. (#407) * [GH-1213] Create covid workload verifications (#379) * Update the specs. * Commit pair programming results. * Commit updated implementation (WIP). * organizing the keyworrds in spec.clj * It seems spec/or requires ky-value pairs! Fix it. * Added start of workflow-request for create-covid-workload function * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table. * We cannot use table as col name in SQL. * Checkpoint. * More. * Initial creat covid workload commit * Verifies dataset as well * Verifications of the source, executor and sink * More verification of sink and executor * Verification of source, sink, executor pre review * Fixing broken test and docstrings instead of comments * Update api/src/wfl/service/rawls.clj * Fix and rewrite the test. * Fix the test part II. 
* Adding more tests for the create-covid-workload and adding messages for the cause of a thrown exception * Updated tests and create covid workload verifications * removing commente3d out code * removing get-workspace function from rawls code since we use firecloud to make this call * Adding lint changes * split up tests. Added verification of dataset table and columns * Lint changes * Fixing merge conflict and lint issue * Fixing import * Updated to base off of develop changes * Adding validation to create workflow logic * Updates to test, spec and create method * Working validation tests * linted. Also removed old Liquibase files no longer used * cleanup of testing namespace * removing try/catch since in verify source function * Moving validation to top of function * Moving docstrings for two functions * Fixing two broken integration tests for covid * Using variables bound in the TDR section for validation * Moving some code within file to group liike code together * Fixed all integration tests. Reorganized location of workload methods in covid module * Fixed all integration tests. Reorganized location of workload methods in covid module * Removing extraneous description for integration tests relating to create workload method * Removing extraneous description for integration tests relating to create workload method * linted * Converted verification methods to multimethods dispatched off name * Updated name of validate source, executor, sink functions * Fixed merge conflicts and linted * removing unnecessary changeset * reverted spec.clj to develop branch. Also cleaned up some cruft * reverted spec.clj to develop branch. Also cleaned up some cruft * Making get-workspace-json function private and adding a public function which checks the workload. Also cleaned up the create verification methods and removed some unnecessary variables. 
Also moved source/sink/executor verification functions to the source/sink/executor sections * Adding default implementations for the create validation functions * linted * fixing firecloud method and tests * Fixing merge conflicts with develop branch * Adding flags for skipping validation of source, executor and sink in requests (#404) * Adding flags for skipping validation to source, executor and sink * linted * Fixing merge conflict Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * linting fixes * linted and fixes some nits * Fixing nits * Fixing nits * lint fixes Co-authored-by: Rex <rexwangcc@gmail.com> Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * GH-1316 Create submission within TerraExecutor update (#406) GH-1316 Create submission within TerraExecutor update Fully implemented `create-submission!` stub, pulling in existing `update-method-configuration!`. Simplified code which previously passed around updated executor with incremented method configuration version. Docstrings converted to use backticked parameter names to preserve camel case. Expanded integration testing. * [GH-1323] to-edn (#409) RR: https://broadinstitute.atlassian.net/browse/GH-1323 https://broadinstitute.atlassian.net/browse/GH-1321 Added util/to-edn a general :type dispatch function to coerce an object into a user-friendly EDN view that gets returned to the user. Allow terra executor to update details table with workflow IDs between update calls Introduce `done?` to test when the workload should be marked `:finished` Fetch workflow inputs/outputs from firecloud and transform them into a more workable form Fix jdbc double exaludations * GH-1298: TerraWorkspaceSink - verify target entity columns exist (#410) * Worried that :missing will get swamped by :attributes though. 
* [GH-1327] Fix Start After Stop (#411) [GH-1327] Fix Start After Stop In this change I'm fixing the strange behaviour where we can start a covid workload after it was stopped. To do this, I've made it illegal to stop a workload before it's been started. I've done this because stop before start seems strange to me too. The purpose of stop was to stop getting new data from the source but allow all data in the workload pipeline to get flushed (i.e. consume all created snapshots and write all workflows back). Stop was not meant for aborting a workload and cancelling/aborting all active processing. If there is a use case for such a thing (and I think there is) - I think we should expose a new endpoint called `POST /api/v1/abort` or something. Of course this PR is pretty small so if the team feels strongly about stop before start I can find another solution. * [GH-1327] update system tests after /stop changes (#412) [GH-1327] update system tests after /stop changes removing verification that we can stop before starting and no workflows are run. * [GH-1333] Fix NullPointerException in TDR Source (#418) RR: https://broadinstitute.atlassian.net/browse/GH-1333 Caused by database changes not propagated to source code. Destructuring old keys for the source led to null values. * [GH-1336] Fix continual snapshot creation (#419) Our BigQuery query was wrong as we were simply calling `toString` on a java.sql.Timestamp. This led to intervals like `['2021-05-26 15:31:18.287641' '2021-05-26T20:00:13']` The space was causing the problem and BigQuery interpreted that as a date. The fix is to ensure the lower and upper bound of our query-between interval is of the same type by coercing java.sql.Timestamp `last_checked` into a java.time.OffsetDateTime. * Bump browserslist from 4.13.0 to 4.16.6 in /ui Bumps [browserslist](https://github.com/browserslist/browserslist) from 4.13.0 to 4.16.6.
- [Release notes](https://github.com/browserslist/browserslist/releases) - [Changelog](https://github.com/browserslist/browserslist/blob/main/CHANGELOG.md) - [Commits](browserslist/browserslist@4.13.0...4.16.6) Signed-off-by: dependabot[bot] <support@github.com> * [GH-1337] fix create snapshot job always failing RR: https://broadinstitute.atlassian.net/browse/GH-1337 Caused by destructuring the tdr job response with the wrong key. * [GH-1338] Fix wfl not discovering new dataset rows RR: https://broadinstitute.atlassian.net/browse/GH-1338 This isn't a complete fix sadly but it at least solves the problem for low ingest rate projects like COVID. For now, we only update `last_checked` when we successfully find new rows in the dataset. * [GH-1339] Prevent Snapshot Job IDs from being clobbered with nils RR: https://broadinstitute.atlassian.net/browse/GH-1339 When we update the status of snapshot creation jobs, we only get the job metadata when the job is complete. As a consequence, `nil`s are propagated up the call stack and the database record is clobbered with nils, including the `snapshot_creation_job_id` column. To fix this, always return the metadata and log the failure should one occur. * bump version in develop to 0.8.0 (#413) * [GH-1331] document `/stop` and `/workflows` (#417) [GH-1331] Document `/stop` and `/workflows` endpoints RR: https://broadinstitute.atlassian.net/browse/GH-1331 Updating documentation for changes in v0.7.0: - workload-response no longer returning workflows - `GET /api/v1/workload/{uuid}/workflows` - `POST /api/v1/stop` Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * Add test-covid-workload to parallel test group (#424) … and fix method configuration namespace.
* [GH-1341] Fix accepting malformed labels in workload request RR: https://broadinstitute.atlassian.net/browse/GH-1341 Enforce labels are of the form "name:value" where - name starts with a letter followed by any combination of letters, numbers, underscores and dashes - value is any non-blank string not containing `:` I had to include labels in the batch workload request as the coercion layer got confused when you gave it bad labels. * [GH-1344] Enforce valid email addresses in workload `watchers` * [GH-1348] Remove :pipeline from COVID workload request (#426) RR: https://broadinstitute.atlassian.net/browse/GH-1348 The :pipeline attribute of the workload request is meaningless for COVID-type workloads. Note that these workloads aren't even specific to COVID-19 processing, rather a generalisation. In this PR, I've removed the :pipeline attribute. To do this, all workload operations on maps with a nil :pipeline will be implemented by the functions in the covid namespace. * [GH-1350] fix Terra Workspace sink validation (#427) RR: https://broadinstitute.atlassian.net/browse/GH-1350 The "Terra Workspace" sink validation did not validate that the attributes listed in fromOutputs exist in the specified entity type in the workspace. This PR fixes this. * Bump dns-packet from 1.3.1 to 1.3.4 in /ui Bumps [dns-packet](https://github.com/mafintosh/dns-packet) from 1.3.1 to 1.3.4. - [Release notes](https://github.com/mafintosh/dns-packet/releases) - [Changelog](https://github.com/mafintosh/dns-packet/blob/master/CHANGELOG.md) - [Commits](mafintosh/dns-packet@v1.3.1...v1.3.4) Signed-off-by: dependabot[bot] <support@github.com> * [GH-1324] Get or import snapshot (#422) Within rawls namespace: - get-snapshot-references: lazily paginate through all snapshot references in a workspace. - create-or-get-snapshot-reference: if importing the snapshot fails with a 409, find the first existing snapshot reference matching the snapshot id. 
* [GH-1354] Enforce `snapshotReaders` are email addresses (#428) RR: https://broadinstitute.atlassian.net/browse/GH-1354 This guards against erroring while creating snapshots if a user makes a typo. * [GH-1356] Fix Coercion Failure for TDR Snapshots Source (#429) RR: https://broadinstitute.atlassian.net/browse/GH-1356 Caused by metosin/reitit#494 Fix as per https://broadinstitute.atlassian.net/browse/GH-1348 Included integration test for reitit coercions to catch this sooner. * [GH-1357] Fix failure to sink workflow outputs (#431) RR: https://broadinstitute.atlassian.net/browse/GH-1357 I assumed (wrongly) the way Rawls represented workflows in a submission was the same as how cromwell represented workflows. I also mocked these representations incorrectly in tests. To fix this, only use the submission to fetch workflow IDs, then use firecloud's workflow and workflow/outputs endpoints so we don't have to adapt/mock multiple data models. * [GH-1359] Overwrite Workspace Entity On Sink (#432) RR: https://broadinstitute.atlassian.net/browse/GH-1359 When sinking a resubmitted workflow, the TerraWorkspaceSink would append to the existing entity in the workspace. This leaves the entity in an even more wrong state than the reason the sample was reanalysed. Our fix (as agreed with cloreth) is to clobber the entity in the workspace by deleting the entity if it exists then upserting the new one. * Fix broken requirement install command in docs readme (#434) * Move docstring and document when true. (#436) The PR check fails in the test that Ed excludes on another branch. * [GH-1363] Fix Sporadic Firecloud Test Failures (#437) RR: https://broadinstitute.atlassian.net/browse/GH-1363 Caused by firecloud returning "Launching" as a workflow status - we were testing that the status was in #{"Queued" "Submitted"}. Fix util/poll so logical FALSE can be returned from the action. * Bump ws from 6.2.1 to 6.2.2 in /ui (#433) Bumps [ws](https://github.com/websockets/ws) from 6.2.1 to 6.2.2. 
- [Release notes](https://github.com/websockets/ws/releases) - [Commits](https://github.com/websockets/ws/commits) --- updated-dependencies: - dependency-name: ws dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * GH-1375: Fix AoU nightly test: GET /api/v1/workload fails with spec error from invalid creator (#442) * Don't spec the batch creator as an email address. * update changelog for v0.7.0 (#441) * GH-1332 wfl.tools.workloads/when-done should not exit prematurely (#438) If checking that all workloads are finished, also check that the workload list is nonempty. Fixed incorrect reference to static workload object from workload creation: we instead want to examine the latest version of the workload. * update changelog for v0.7.1 (#445) * Removing the arrays module and the corresponding integration test (#446) * Removing the arrays module and the corresponding integration test * Removed references to the deleted arrays module. Also ran all of the integration tests and verified that they pass * Updating the doc for the aou arrays module to have the appropriate name Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * [GH-1376] Moved generic pipeline processing interfaces from covid -> stage ns (#444) * [GH-1388] Remove `pipeline-versions` from `GET /version` RR: https://broadinstitute.atlassian.net/browse/GH-1388 Inline the `version` key and add a `spec` for the response from the endpoint. Added system tests. * [GH-1389] Delete `^:excluded` arrays system tests RR: https://broadinstitute.atlassian.net/browse/GH-1389 These tests were missed in #446 * Resolving NPM security vulnerabilities (#448) * Fixing dependabot reported security issues as well as build reported npm audit security issues * Fixing lib for is-glob. Part of UI npm packages * Some updates. Some left to do * adding package.lock * Fix by using npm update.
Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Rex <rexwangcc@gmail.com> * GH-1371: Lint more. (#439) * Reduce noise while kibit'zing. * Add the Eastwood cLinter. * cLint Eastwood api/src and dampen noise. * Lint tests too. * Squash bugs and squelch noise. * Fix tests now that they "work". * nil-if-empty -=> not-empty * Disable some eastwood warnings. * :staus -=> :status * Suppress more warnings. * Expand namespace aliases in keywords. * Update Kondo. * LINT -=> FORMAT * Continue on LINT errors. * Checkpoint some kondo advice. * dotimes -=> repeatedly * Revert "Expand namespace aliases in keywords." * Build UserException to support Eastwood. * repeatedly -=> dotimes * Restore code deleted for debugging. * Work around: unused-fn-args: Function arg p2__11676# never used * Maybe explain foldl better. * [GH-1294] Migrate to use datasets in prod TDR (#453) * Update default values. * Use a dataset in production for integration tests. * Lint. * Remove unused ENV VAR WFL_TDR_SA. * Address comments. * Add sleep to avoid the transient 404s from TDR. * GH-1346 Document workload Executor and Terra implementation (#451) * [GH-1345] Document the workload `Source` and Implementations (#450) RR: https://broadinstitute.atlassian.net/browse/GH-1345 Implementations include: - `Terra DataRepo` Source - `TDR Snapshots` Source * Clean up my testing code... (#456) * Phoned a friend @tbl3rd to fix return type hint in executor docs (#457) * [GH-1347] Document the workload `Sink` and `Terra Workspace` sink implementation (#449) [GH-1347] Document the workload `Sink` and `Terra Workspace` sink implementation RR: https://broadinstitute.atlassian.net/browse/GH-1347 * Move `source` and `sink` Interfaces and Implementations into new namespace (#454) * [GH-1334] Move `source` Interface and Implementation into new namespace RR: https://broadinstitute.atlassian.net/browse/GH-1334 Start to move tests too. 
* [GH-1335] Move `Executor` Code to new Namespace RR: https://broadinstitute.atlassian.net/browse/GH-1335 Doc-strings from #451 * [GH-1408] Don't modify source code on build Generate UserException on prebuild Make linters depend on prebuild. They're run as part of the PR action and can be run in parallel locally. Remove formatting target - use deps for that. * [GH-1393]: Add `/retry` Endpoint (#452) * [GH-1393]: Add `/retry` Endpoint (#452) RR: https://broadinstitute.atlassian.net/browse/GH-1393 Add `POST /api/v1/workload/{uuid}/retry` route, taking the status as the request json body. Add `retry` multimethod to the workload interface. This operation takes the workload as well as a list of workflows to retry. Add `workflows-by-status` multimethod to take a status to filter by. No workloads support `retry` as yet so all implementations throw a 501. Added system and integration tests to lock down this behaviour. * Add test for [GH-1385] RR: https://broadinstitute.atlassian.net/browse/GH-1385 * fix build failures in develop (#458) * Documenting a Workload (#430) * rebasing off of newest develop. Adding initial documentation for documentation of a workload.
* Adding covid workload module * adding -m flag to python pip command * Added covid-module to the mkdocs.yaml Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Olivia Kotsopoulos <okotsopo@broadinstitute.org> * Changing order of executor block in docs (#459) Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * GH-1326 Move Sink code to new Namespace (#463) * Split specs into their specific modules (#460) * GH-1289 split out covid specs * GH-1289 moved more of the module specific specs to the modules * couple of fixes * move tdr source to source.clj * [GH-1398] Fix `test-start-aou-workload` from polling forever (#462) RR: https://broadinstitute.atlassian.net/browse/GH-1398 Split `workloads/when-done` into two functions: - one that polls until the workload is :finished - another that polls all the workflows Add an assertion to the automation-test * [GH-1401] Add `retry` Attribute to TerraExecutorDetails Type and Instances Thereof (#461) * add lint to the check target * add liquibase changelog to add `retry` to terraexecutordetails * don't return retried workflows from the executor * add a long comment about returning workflows * Fix some namespace issues (#464) * [GH-1409] Add Status Query String to workflows GET request (#466) * GH-1409 Add Status Query String to workflows GET request * Bring cromwell statuses up to date * add test for workflows by status endpoint * [GH-1409] fix some linting errors * Add On Hold back in, trim slashes for endpoints * change to not= * split statuses into rawls workspace * linters hate spaces * Minor changes * Remove trim-slashes and fix endpoints in wfl.tools namespace (#469) * [GH-1411] Describe Workflow Outputs (#471) RR: https://broadinstitute.atlassian.net/browse/GH-1411 Inform the Sink downstream of the executor of the workflow type. When sinking outputs to the TDR, you need to ingest the file outputs followed by the metadata with the files replaced by their fileref URIs.
Thus, we need a way of determining if a particular output is a File that needs ingesting or just a metadata String that happens to be a gs url. I've had some luck in the past using womtool to describe the WDL and then walk the description of the outputs, dispatching to the relevant handlers (see workflows.clj in the test tools). For lack of a better alternative, I'm going to pursue using womtool to handle outputs in a generic way. Implemented by changing the source and executors to be queues of `[description value]` pairs. In the case of the TDR sources, the description is a qualified keyword describing what the object is (ie. a snapshot). For the terra executor, the description is the workflow description as reported by womtool and the value is the workflow. * Logging added to source namespace (#475) And downgraded level for several noisy logs. * GH-1414 Executor falls back to fetching existing snapshot reference (#477) * [GH-1139] Added new log namespace (#470) * Added new log namespace * Some updates to log namespace * Replace references to clojure.tools.logging with new log namespace * Remove trace logs * add logger protocol to support disabling, set up unit tests * Remove last remaining references to clojure.tools.logging and update documentation * fix linter error * Remove log4j and slf4j dependencies * Keep JSON logs out of EDN description files. * fix for logs getting into workflow edn files * Not sure how this ever worked though. * Fuss. * Simplify. * Restore docstrings. * Force build to suckseed. * Simplify. * Ensure do-or-nil returns nil. * Break up long lines. * change jsonpayload to message and remove obsolete logging tests * reverted some changes back * update docstrings and convert some str's to joins * added notice to documentation * added some documentation for looking up logs locally Co-authored-by: Tom Lyons <tbl3rd@gmail.com> * Remove error txt file (#479) * GH-1417: Remove the "skipped" nil UUID Cromwell status hack. 
(#393) * Remove the "skipped" nil UUID Cromwell status hack. * Patch up the rebase. * Fix rebase conflict. * Patch up rebase. * The status-counts fn is an old Zeroism. * Update docstrings. * Remove duplicate batch/workload-request spec. * Use the batch-namespaced spec. * Glean some lint that collected. * [GH-1419] Terra Executor queue length should consider workflows with null status (#481) Added new integration test. Also in existing integration tests, corrected faulty assumption in mocking: Firecloud returns differently formatted workflows depending on whether they are fetched as part of a submission fetch, or fetched individually. * GH-1397: Sweep and snapshot dataset row IDs that were missed when polling a dataset (#468) * Document Swagger validation of API specs. * Mock snapshots missing TDR row IDs. * Mock datarepo/query-table-between instead of source/find-new-rows. * Verify that rows are shared across updates. * Work around deprecation warning. * Do not call start-source! twice. * Goodby Fibonacci. * Simplify. * Add earliest and latest and clean up. * Make find-new-rows work with combine-tdr-source-details. * Run `clojure -M:format` in this directory * Glean more lint. * Should probably count rows in running snapshots too. * Respond to more Ed comments. * [GH-1139] Logging level and configuration (#480) * Logging level and configuration * unit testing for logging levels * few fixes/pr changes * use prepared statement * GH-1395: Document the new WFL retry capability for users. (#483) * The /retry endpoint is not implemented. * Address some comments from Ed. * Add a note from OK. * [GH-1394] TerraExecutor retry implementation and Sarscov2IlluminaFull integration * [GH-1395] GitHub Pages updates: COVID, staged workflows, retry functionality (#489) * Bump path-parse from 1.0.6 to 1.0.7 in /ui (#491) * Mkdocs updates that improve the doc nav and security holes. 
(#492) * [GH-1402] Write Workflow Outputs to the Terra Data Repository (#474) * [GH-1405] Add Schema for TerraDataRepoSink (#473) [GH-1405] Add Schema for TerraDataRepoSink RR: https://broadinstitute.atlassian.net/browse/GH-1405 Add schemas for the following types - TDRJobStatus - TDRJobType - TerraDataRepoSinkDetails and the TerraDataRepoSink table. * [GH-1314] Start Implementing DataRepo Sink (#476) RR: https://broadinstitute.atlassian.net/browse/GH-1413 Add load/create functions for the data repo sink - leaving update as "unimplemented". * [GH-1415] Implement `update-sink` for TerraDataRepoSink (#478) RR: https://broadinstitute.atlassian.net/browse/GH-1415 This change contains a rough implementation of the TerraDataRepo sink update functionality. After discovering certain hurdles in the original design, I've made some tweaks and incorporated TDR's work https://broadworkbench.atlassian.net/browse/DR-1960 so that we only need one ingest request instead of separately loading the output files and output metadata. At the time of writing, DR-1960 is work in progress so our ingests won't work just yet. Some things to call out in this change that I'm deferring to a follow up change into the feature branch: I need to upload the json file for TDR to ingest. I'm currently using our test outputs bucket as a temporary folder for this which is obviously not a production-ready solution. Possible fixes include creating a scratch bucket for workflow-launcher, creating a bucket per workload, using the executor's execution bucket... etc. Suggestions welcome. Since the ingests always fail, I'm leaving some of the work post-ingest as a TODO. * [GH-1425] Expose `TerraDataRepoSink` Via the HTTP API (#484) RR: https://broadinstitute.atlassian.net/browse/GH-1425 Add specs and register in wfl.api/spec. Add end-to-end system test for reading/writing workflow inputs/outputs to TDR (used the illumina_genotyping_array pipeline as it's smaller than sarscov2_illumina_full). 
Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * [GH-1416] Document DataRepo Sink (#485) RR: https://broadinstitute.atlassian.net/browse/GH-1416 Add section to `sink.md` describing how to configure workflow-launcher to write outputs back to a terra datarepo dataset. * [GH-1430] Remove `stage/validate-or-throw` and inline their implementations into `create-X`. (#490) RR: https://broadinstitute.atlassian.net/browse/GH-1430 Remove stage/validate-or-throw and inline their implementations in their respective create-X functions. This is done because - we were losing context (ie. was this a source/sink/executor) and had a collision in the tdr sink and source impls - these multimethods didn't really make sense and require some redesign work * address @okotsopoulos's feedback Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * GH-1328 Migrate to new Rawls Snapshot V2 endpoints (#493) * ENG-1394 Update post-retry error maps per QA feedback (#494) * Fixing documentation merge conflict * GH-1454 Update covid and executor integration tests to reflect updated method configuration version (#504) Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> Co-authored-by: Olivia Kotsopoulos <okotsopo@broadinstitute.org> Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Rex <rexwangcc@gmail.com> Co-authored-by: Edmund Higham <edhigham@gmail.com> Co-authored-by: rfricke-asymmetrik <75337761+rfricke-asymmetrik@users.noreply.github.com>
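The `snapshotReaders` check that this PR (#428) describes can be illustrated with a minimal sketch. This is not the project's code: workflow-launcher is written in Clojure and validates requests with specs, whereas the Python below only shows the idea of rejecting a request early when a reader entry is not an email address. The function names, the loose regular expression, and the request shape are all hypothetical.

```python
import re

# Assumption: a deliberately loose pattern (one "@", no whitespace, a dot in
# the domain). The pattern used by the real spec is not shown in this log.
_EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def invalid_snapshot_readers(readers):
    """Return the entries of `readers` that do not look like email addresses."""
    return [r for r in readers
            if not isinstance(r, str) or not _EMAIL.match(r)]


def validate_source_request(request):
    """Raise ValueError before any snapshot work starts if a reader is malformed.

    `request` is a hypothetical dict with an optional "snapshotReaders" list.
    """
    bad = invalid_snapshot_readers(request.get("snapshotReaders", []))
    if bad:
        raise ValueError(f"snapshotReaders must be email addresses: {bad}")
    return request
```

Failing at request time is the point of the change: a typo is reported when the workload is created rather than surfacing later as an error during snapshot creation.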
rhiananthony
added a commit
that referenced
this pull request
Sep 22, 2021
* Snapshot creation uses datetime rather than date (#355) * Snapshot creation uses datetime rather than date * Making the name of the column used for datetime interval variable Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introduced in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * bump develop to 0.7.0 (#360) * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md in develop (#362) cherry-pick from main * [GH-1284] Fix docker image generation (#363) The `Dockerfile` is being invoked in the root directory - we don't need to add the parent directory, just the current one. * [GH-1277] Prevent update loop from terminating (#364) Catch all throwables to make sure a workload failure doesn't bring down workflow-launcher. * util/do-or-nil-silently should move to build.clj (#367) * util/do-or-nil-silently should move to build.clj * Fixed indentation to match standard * GH-1226 Support entity set creation when creating submissions.
(#353) * First pass: create-submissions generates one submission per entity. * Cleaned up TODOs for first pass attempt, reformatted tests to pass lint step in build. * Second pass: support entity set creation when specifying >1 entity for a submission. Refactored bigquery table dump to pull out now-common code. * Address PR feedback To simplify, will generate an entity set even for the singleton input. Removed firecloud/consolidate-entities-to-set as a result. Need to pass in 'expression' to submission creation payload when specifying an entity set. Supplemented integration and unit tests. * Merge v0.6.1 into develop (#366) * Create an interface all COVID work can build on top of. (#369) * Create an interface all hornets work on top of. * Apply suggestions from code review * Update readme.md Build Boad. * GH-1272: Lists are not imported correctly into a Terra Workspace Entity (#368) * Boad -=> Board * Still doesn't round-trip on doubles and values like "19A". * Unit test wfl.tsv. * Ensure TSV mappability. * Ensure TSV is tabulatable. * Fix bug in assert-mapulatable!. * Add wfl.service/rawls.clj for interacting directly with Rawls API (#370) * Rename rawls/create-snapshot -> create-snapshot-reference (#373) We create the snapshot in the TDR, and link to it in the workspace via Rawls. * [GH-1215] Add snapshot_reference_id to Sarscov2IlluminaFull table schema (#375) * Add snapshot_reference_id to Sarscov2IlluminaFull table schema Because schema update hasn't run, altering existing table definition in place. 
* Documentation changes from local Postgres debugging, onboarding - Clarified order of operations when installing, using local Postgres - Fixed broken intra-doc links - Added docs readme with instruction for launching local documentation site * Add instruction for recreating wfl DB to development docs * Remove references to undefined documentation files * [GH-1293] fix test-create-submissions-for-entity-set (#378) Simply getting the submission again is not sufficient to guarantee that the workflow has been queued for execution. Poll instead. * Update documentation and infrastructure for release candidates (#377) * Update documentation and infrastructure for release candidates - creating release candidates - bashing release candidates Restore version override in cli.py. Re-tag latest docker images in cli.py so that version can be overwritten. Update IMAGES target in Makefile to add the :latest tag. Clean up :latest images on distclean target. * fix whitespace * Bump ssri from 6.0.1 to 6.0.2 in /ui (#382) Bumps [ssri](https://github.com/npm/ssri) from 6.0.1 to 6.0.2. - [Release notes](https://github.com/npm/ssri/releases) - [Changelog](https://github.com/npm/ssri/blob/v6.0.2/CHANGELOG.md) - [Commits](npm/ssri@v6.0.1...v6.0.2) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add interactive script ahead of covid-19 demo (#380) * add interactive script ahead of covid-19 demo tomorrow * fix rawls test failure * reformat test * share demo resources with cdc-covid-surveillance firecloud group * corrections from review * format for consistency * clone a dev workspace instead of the prod one * fix test-import-snapshot - workspace changed * fix arrays test failures * [GH-1213] Update DB schema for COVID workload creation (#381) * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table.
* Initial create covid workload commit * Verifies dataset as well * Update DB schema. * Address comments. * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introduced in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Resolve merge conflicts. * Fix issues generated from rebasing. * Fix the failing integration tests. Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Add multimethods for source, executor and sink operations (#383) * Add multimethods for source, executor and sink operations Add skeleton implementations for tdr-source * make use of `!` more consistent?
* add create operations for source, executor and sink * [GH-1215] Import snapshots within COVID workload (#376) - Data model change: a snapshot_reference_id is linked to a workspace and should be associated with the executor, not the source (TDR in our use case) - Added covid/get-imported-snapshot-reference - nil or snapshot reference from Rawls for snapshot_reference_id in executor details instance (to be created from TerraExecutorDetails) - Added covid/import-snapshot! - import snapshot to workspace, writing to DB if successful - Added integration tests (incomplete coverage) * [GH-1295] Use Rawls to Import Workflow Outputs (#384) RR: https://broadinstitute.atlassian.net/browse/GH-1295 firecloud's flexibleImportEntities has a size limit on the TSV file you POST. Unfortunately, one sarscov2_illumina_full workflow's outputs exceeds this. We can work around this issue by using Rawls' batchUpsert. This is slightly different as it takes a list of operations on how to construct the entity rather than the entity serialised to TSV. In this PR, I've demonstrated how we can use Rawls to import the workflow's outputs into the workspace. I've also updated the demo to do this for a workflow that has already passed. * GH-1285: [COVID] Launch submissions through Rawls, and keep track of them (#372) * Simplify day intervals. * Document wfl.tools.snapshots. * Map name over keyword arguments. * Force the production Rawls. * Checkpoint groups getters. * Patch rebase conflict. * Implement start-covid-workload! kinda. * Checkpoint start-covid-workload! test. * Add covid-workload-request to support unit. * clojure -M:format * Move Rich comment into a unit test. * Respond to comments and tidy up a little * Remove unit test now covered by integration. * Document clojure.test/test-vars to remind me. * Bump url-parse from 1.4.7 to 1.5.1 in /ui (#388) Bumps [url-parse](https://github.com/unshiftio/url-parse) from 1.4.7 to 1.5.1.
- [Release notes](https://github.com/unshiftio/url-parse/releases) - [Commits](unshiftio/url-parse@1.4.7...1.5.1) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update db schema again based on latest discussions. (#387) * Update db schemas. * Make the names more clear. * Address comments. * Switch to SQL syntax from XML. * Address more comments. * one more update. * Bump y18n from 4.0.0 to 4.0.1 in /ui (#352) Bumps [y18n](https://github.com/yargs/y18n) from 4.0.0 to 4.0.1. - [Release notes](https://github.com/yargs/y18n/releases) - [Changelog](https://github.com/yargs/y18n/blob/master/CHANGELOG.md) - [Commits](https://github.com/yargs/y18n/commits) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1301] Notify Workload Watchers when "User Visible" Exceptions Occur (#386) RR: https://broadinstitute.atlassian.net/browse/GH-1301 Introduce wfl.util.UserVisibleException - a new type of exception that we should throw and handle for errors that users are meant to see. Re-wire the background loop to handle UserVisibleException and also be more exception safe. Start to plumb in email notifications. * Bump hosted-git-info from 2.8.8 to 2.8.9 in /ui (#392) Bumps [hosted-git-info](https://github.com/npm/hosted-git-info) from 2.8.8 to 2.8.9. - [Release notes](https://github.com/npm/hosted-git-info/releases) - [Changelog](https://github.com/npm/hosted-git-info/blob/v2.8.9/CHANGELOG.md) - [Commits](npm/hosted-git-info@v2.8.8...v2.8.9) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump lodash from 4.17.19 to 4.17.21 in /ui (#390) Bumps [lodash](https://github.com/lodash/lodash) from 4.17.19 to 4.17.21.
- [Release notes](https://github.com/lodash/lodash/releases) - [Commits](lodash/lodash@4.17.19...4.17.21) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1311]: Create/Load COVID workloads (#389) RR: https://broadinstitute.atlassian.net/browse/GH-1311 Add support for loading and creating COVID workloads Add PR test hook for database/ directory Tweak database schema to make it a bit easier to create workloads * [GH-1313] TerraWorkspaceSink/update-sink! (#394) * Add implementation for TerraWorkspaceSink/update-sink! Rework temporary-postresql-database fixture to override wfl-db-config * integration test passes * add make-queue-from-list to be clearer * fix test * use max(id) + 1 for new output row instead of table length + 1 * [GH-1215] COVID TerraExecutor/update-executor! Added update-terra-executor and required helpers: - coerce an available snapshot from source queue to a snapshot reference - add entry point for submission creation - write new workflows to executor details instance Refined queue peek and pop for source and executor. Notably, queue peeks return a meaningful object (snapshot for source, workflow for executor) rather than a database record. Removed now-deprecated code from earlier update loop design. Added integration test. * [GH-1306] Add `workflows` multimethod for workloads (#396) RR: https://broadinstitute.atlassian.net/browse/GH-1306 Add workflows multimethod Change all locations where workflows are accessed via the :workflows keyword to use this function. Implement workflows for COVID I've added a simple test to make sure the SQL is correct Awaiting executor implementation to test with more than 0 workflows. * [GH-1222] Check for new work in TDR (#371) * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! 
implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introduced in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Release v0.6.1 Patch (#365) * [GH-1284] Fix docker image generation (#363) The `Dockerfile` is being invoked in the root directory - we don't need to add the parent directory, just the current one. * [GH-1277] Prevent update loop from terminating (#364) Catch all throwables to make sure a workload failure doesn't bring down workflow-launcher. * update changelog + bump patch to v0.6.1 * Remove outdated test code. * Remove unused utilities code. * Checkpoint. * Bump timeout for system tests, as it takes longer to finish now. * Update docs to include how to override envs for local testing. * Commit what I had before surgery. * Wrap up. * Add a protocol extension for psql array. * Update DB schema to store row ids for TDR. * Also store TDR row-ids and query interval info in DB. * Move protocol test to integration. * Update utc-now util also randomize the snapshot name with request datetime.
* Resolve merge conflicts. * Fix weirdness during rebasing... * Move the logic into the multimethod interface. * Take out the utc-now util function. * Move protocol test to jdbc_test and reformat code. * Remove duplicate code. * Remove comment blocks. * Fix broken MD during rebasing. * Crazy formats. * Fix a bug with the snapshot naming. * Add the snapshot_id back to DB schema. * Include job-status in Source Details table. * update TDR jobs that are still running as part of source update loop. * Fix timestamp formatting. * Exclude the problematic test. * Fix a hiccup. * Fix formatting and a small issue with job-result from TDR. * Minor fixes around covid module. * Add unit and integration tests. * Lint. * Let DB control the primary key auto increment. * Fix the issues identified by the tests. * Fix an issue with the jdbc protocol test. * Update unit tests. * Call the multi-method in covid test to make sure everything is plumbed correctly. * Log when TDR snapshot creation job failed. * Address comments. * Increase the interval of system test polling. Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * [GH-1314] expose workflows via `GET /api/v1/workload/{uuid}/workflows` (#397) RR: https://broadinstitute.atlassian.net/browse/GH-1314 Adding a separate workflows selector will allow us to decouple the workload representation from serialisation artefacts that aren’t useful for a user. In this PR, I'm changing the workload data model such that `GET /api/v1/workloads` returns only workload metadata (everything but the workflows) and adding `GET /api/v1/workloads/{uuid}/workflows` to fetch the workflows in the workload. * [GH-1317] start the workload source (#399) RR: https://broadinstitute.atlassian.net/browse/GH-1317 The source of the workload needs to be told that the workload has been started so it can start looking in the datarepo for new data.
I've added a new multi-method for the source to start it, called on start-workload!. This writes last_checked to be the time when we should start listening for new data. * TerraExecutor/update-executor should update active, failed workflow statuses (#398) 1. Within a single tx, fetch executor details workflow records with active or failed status. 2. For each record, use Firecloud to fetch up-to-date workflow object. Update the record with the corresponding new workflow status. 3. Within a single tx, write new record statuses to DB. Updated integration tests. * GH-1307 GH-1312 Update method configuration using snapshot reference (#402) In new method `update-method-configuration!`: 1. GET the method configuration from Firecloud 2. If the Firecloud-derived mc version doesn't match our DB record, throw 3. Update the method configuration with the snapshot reference name and POST to Firecloud 4. Increment our DB record's mc version Supplemented integration tests. * [GH-1322] TDR Snapshots Source (#403) RR: https://broadinstitute.atlassian.net/browse/GH-1322 Adding a new source to workflow-launcher that's a list of snapshots. * [GH-1318] Stop Workload Source (#400) RR: https://broadinstitute.atlassian.net/browse/GH-1318 Add `stop-source` multimethod and implement for the TerraDataRepoSource. When stopped, the source will stop looking for new rows to snapshot in the TDR. Also added logic for when the workload is finished. In this case, once started the workload is finished when it has been stopped and both source and executor queues are empty. To do this, I added another multimethod to return the length of the queues. For the executor, the length of the queue is the number of workflows that have not been consumed or aborted. For the source, the length of the queue is the number of unconsumed records whose snapshot job is not 'failed'. Note that this is slightly at odds with peek, where peek is meant to return objects that are ready to be consumed.
* GH-1216: [TDR New Work Detection] Extend System Tests for COVID processing (#395) * Update vault to work around NoClassDefFoundError. * Checkpoint failing with PSQLException: FATAL: database ... does not exist database "wfltest147f1cbf976046e68e6beb5eef728674" does not exist * Spec validation still losing sink and source keys. * Update most dependencies. * Restore batch or covid request. * Checkpoint "working" spec.clj. * Lift strip-internals out of endpoint API transaction. * New system test passing again. * Use Ed's strip-internals fix. * clojure -M:format * Respond to comments but now there is no :name in response. * clojure -M:format * Make ::name optional for now. * [GH-1320] add `snapshotReaders` to tdr source (#405) RR: https://broadinstitute.atlassian.net/browse/GH-1320 Adding a list of people who want access to the snapshots we create to the source request. * Remove liquibase-core from build dependencies. (#407) * [GH-1213] Create covid workload verifications (#379) * Update the specs. * Commit pair programming results. * Commit updated implementation (WIP). * organizing the keywords in spec.clj * It seems spec/or requires key-value pairs! Fix it. * Added start of workflow-request for create-covid-workload function * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table. * We cannot use table as col name in SQL. * Checkpoint. * More. * Initial create covid workload commit * Verifies dataset as well * Verifications of the source, executor and sink * More verification of sink and executor * Verification of source, sink, executor pre review * Fixing broken test and docstrings instead of comments * Update api/src/wfl/service/rawls.clj * Fix and rewrite the test. * Fix the test part II. 
* Adding more tests for the create-covid-workload and adding messages for the cause of a thrown exception * Updated tests and create covid workload verifications * removing commented out code * removing get-workspace function from rawls code since we use firecloud to make this call * Adding lint changes * split up tests. Added verification of dataset table and columns * Lint changes * Fixing merge conflict and lint issue * Fixing import * Updated to base off of develop changes * Adding validation to create workflow logic * Updates to test, spec and create method * Working validation tests * linted. Also removed old Liquibase files no longer used * cleanup of testing namespace * removing try/catch since in verify source function * Moving validation to top of function * Moving docstrings for two functions * Fixing two broken integration tests for covid * Using variables bound in the TDR section for validation * Moving some code within file to group like code together * Fixed all integration tests. Reorganized location of workload methods in covid module * Fixed all integration tests. Reorganized location of workload methods in covid module * Removing extraneous description for integration tests relating to create workload method * Removing extraneous description for integration tests relating to create workload method * linted * Converted verification methods to multimethods dispatched off name * Updated name of validate source, executor, sink functions * Fixed merge conflicts and linted * removing unnecessary changeset * reverted spec.clj to develop branch. Also cleaned up some cruft * reverted spec.clj to develop branch. Also cleaned up some cruft * Making get-workspace-json function private and adding a public function which checks the workload. Also cleaned up the create verification methods and removed some unnecessary variables. 
Also moved source/sink/executor verification functions to the source/sink/executor sections * Adding default implementations for the create validation functions * linted * fixing firecloud method and tests * Fixing merge conflicts with develop branch * Adding flags for skipping validation of source, executor and sink in requests (#404) * Adding flags for skipping validation to source, executor and sink * linted * Fixing merge conflict Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * linting fixes * linted and fixes some nits * Fixing nits * Fixing nits * lint fixes Co-authored-by: Rex <rexwangcc@gmail.com> Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * GH-1316 Create submission within TerraExecutor update (#406) GH-1316 Create submission within TerraExecutor update Fully implemented `create-submission!` stub, pulling in existing `update-method-configuration!`. Simplified code which previously passed around updated executor with incremented method configuration version. Docstrings converted to use backticked parameter names to preserve camel case. Expanded integration testing. * [GH-1323] to-edn (#409) RR: https://broadinstitute.atlassian.net/browse/GH-1323 https://broadinstitute.atlassian.net/browse/GH-1321 Added util/to-edn a general :type dispatch function to coerce an object into a user-friendly EDN view that gets returned to the user. Allow terra executor to update details table with workflow IDs between update calls Introduce `done?` to test when the workload should be marked `:finished` Fetch workflow inputs/outputs from firecloud and transform them into a more workable form Fix jdbc double exaludations * GH-1298: TerraWorkspaceSink - verify target entity columns exist (#410) * Worried that :missing will get swamped by :attributes though. 
* [GH-1327] Fix Start After Stop (#411) [GH-1327] Fix Start After Stop In this change I'm fixing the strange behaviour where we can start a covid workload after it was stopped. To do this, I've made it illegal to stop a workload before it's been started. I've done this because stop before start seems strange to me too. The purpose of stop was to stop getting new data from the source but allow all data in the workload pipeline to get flushed (ie. consume all created snapshots and write all workflows back). Stop was not meant for aborting a workload and cancelling/aborting all active processing. If there is a use case for such a thing (and I think there is) - I think we should expose a new endpoint called `POST /api/v1/abort` or something. Of course this PR is pretty small so if the team feels strongly about stop before start I can find another solution. * [GH-1327] update system tests after /stop changes (#412) [GH-1327] update system tests after /stop changes Removing verification that we can stop before starting and no workflows are run. * [GH-1333] Fix NullPointerException in TDR Source (#418) RR: https://broadinstitute.atlassian.net/browse/GH-1333 Caused by database changes not propagated to source code. Destructuring old keys for the source led to null values. * [GH-1336] Fix continual snapshot creation (#419) Our BigQuery query was wrong as we were simply calling `toString` on a java.sql.Timestamp. This led to intervals like `['2021-05-26 15:31:18.287641' '2021-05-26T20:00:13']` The space was causing the problem and BigQuery interpreted that as a date. The fix is to ensure the lower and upper bounds of our query-between interval are of the same type by coercing java.sql.Timestamp `last_checked` into a java.time.OffsetDateTime. * Bump browserslist from 4.13.0 to 4.16.6 in /ui Bumps [browserslist](https://github.com/browserslist/browserslist) from 4.13.0 to 4.16.6. 
- [Release notes](https://github.com/browserslist/browserslist/releases) - [Changelog](https://github.com/browserslist/browserslist/blob/main/CHANGELOG.md) - [Commits](browserslist/browserslist@4.13.0...4.16.6) Signed-off-by: dependabot[bot] <support@github.com> * [GH-1337] fix create snapshot job always failing RR: https://broadinstitute.atlassian.net/browse/GH-1337 Caused by destructuring the tdr job response with the wrong key. * [GH-1338] Fix wfl not discovering new dataset rows RR: https://broadinstitute.atlassian.net/browse/GH-1338 This isn't a complete fix sadly but it at least solves the problem for low ingest rate projects like COVID. For now, only updating `last_checked` when we successfully found new rows in the dataset. * [GH-1339] Prevent Snapshot Job IDs from being clobbered with nils RR: https://broadinstitute.atlassian.net/browse/GH-1339 When we update the status of snapshot creation jobs, we only get the job metadata when the job is complete. As a consequence, `nil`s are propagated up the call stack and the database record is clobbered with nils, including the `snapshot_creation_job_id` column. To fix this, always return the metadata and log the failure should one occur. * bump version in develop to 0.8.0 (#413) * [GH-1331] document `/stop` and `/workflows` (#417) [GH-1331] Document `/stop` and `/workflows` endpoints RR: https://broadinstitute.atlassian.net/browse/GH-1331 Updating documentation for changes in v0.7.0: - workload-response no longer returning workflows - `GET /api/v1/workload/{uuid}/workflows` - `POST /api/v1/stop` Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * Add test-covid-workload to parallel test group (#424) … and fix method configuration namespace. 
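The GH-1339 fix above (always return the job metadata, and log a failure rather than returning nothing) might look roughly like this. It is a Python sketch under stated assumptions, not the project's Clojure code: `check_snapshot_job`, the `job_status` and `id` keys, and the logger name are all hypothetical.

```python
import logging

logger = logging.getLogger("wfl-sketch")  # hypothetical logger name


def check_snapshot_job(job_metadata: dict) -> dict:
    """Return the job metadata unconditionally.

    Returning None for unfinished or failed jobs is what lets empty values
    flow back into the database record, so a failure is logged instead.
    The `job_status` and `id` keys are assumptions made for this sketch.
    """
    if job_metadata.get("job_status") == "failed":
        logger.error("Snapshot creation job %s failed", job_metadata.get("id"))
    return job_metadata
```

Because the caller always receives a dictionary, a subsequent record update never overwrites an existing job ID with an empty value.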
* [GH-1341] Fix accepting malformed labels in workload request RR: https://broadinstitute.atlassian.net/browse/GH-1341 Enforce labels are of the form "name:value" where - name starts with a letter followed by any combination of letters, numbers, underscores and dashes - value is any non-blank string not containing `:` I had to include labels in the batch workload request as the coercion layer got confused when you gave it bad labels. * [GH-1344] Enforce valid email addresses in workload `watchers` * [GH-1348] Remove :pipeline from COVID workload request (#426) RR: https://broadinstitute.atlassian.net/browse/GH-1348 The :pipeline attribute of the workload request is meaningless for COVID-type workloads. Note that these workloads aren't even specific to COVID-19 processing, rather a generalisation. In this PR, I've removed the :pipeline attribute. To do this, all workload operations on maps with a nil :pipeline will be implemented by the functions in the covid namespace. * [GH-1350] fix Terra Workspace sink validation (#427) RR: https://broadinstitute.atlassian.net/browse/GH-1350 The "Terra Workspace" sink validation did not validate that the attributes listed in fromOutputs exist in the specified entity type in the workspace. This PR fixes this. * Bump dns-packet from 1.3.1 to 1.3.4 in /ui Bumps [dns-packet](https://github.com/mafintosh/dns-packet) from 1.3.1 to 1.3.4. - [Release notes](https://github.com/mafintosh/dns-packet/releases) - [Changelog](https://github.com/mafintosh/dns-packet/blob/master/CHANGELOG.md) - [Commits](mafintosh/dns-packet@v1.3.1...v1.3.4) Signed-off-by: dependabot[bot] <support@github.com> * [GH-1324] Get or import snapshot (#422) Within rawls namespace: - get-snapshot-references: lazily paginate through all snapshot references in a workspace. - create-or-get-snapshot-reference: if importing the snapshot fails with a 409, find the first existing snapshot reference matching the snapshot id. 
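The label rule stated in GH-1341 above can be expressed as a single regular expression. This is a minimal Python sketch, not the spec used in wfl: `LABEL` and `is_valid_label` are hypothetical names, "letters" is assumed to mean ASCII letters, and "non-blank" is assumed to mean at least one non-whitespace character.

```python
import re

# name: a letter followed by letters, digits, underscores or dashes
# value: contains no ':' and has at least one non-whitespace character
LABEL = re.compile(r"[A-Za-z][A-Za-z0-9_-]*:[^:]*[^:\s][^:]*")


def is_valid_label(label: str) -> bool:
    """Check that the whole string has the form name:value described above."""
    return LABEL.fullmatch(label) is not None
```

For example, `is_valid_label("project:covid-19")` accepts the label, while `"1abc:x"`, `"name:"`, and `"name:a:b"` are all rejected.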
* [GH-1354] Enforce `snapshotReaders` are email addresses (#428) RR: https://broadinstitute.atlassian.net/browse/GH-1354 This guards against erroring while creating snapshots if a user makes a typo. * [GH-1356] Fix Coercion Failure for TDR Snapshots Source (#429) RR: https://broadinstitute.atlassian.net/browse/GH-1356 Caused by metosin/reitit#494 Fix as per https://broadinstitute.atlassian.net/browse/GH-1348 Included integration test for reitit coercions to catch this sooner. * [GH-1357] Fix failure to sink workflow outputs (#431) RR: https://broadinstitute.atlassian.net/browse/GH-1357 I assumed (wrongly) the way Rawls represented workflows in a submission was the same as how cromwell represented workflows. I also mocked these representations incorrectly in tests. To fix this, only use the submission to fetch workflow IDs, then use firecloud's workflow and workflow/outputs endpoints so we don't have to adapt/mock multiple data models. * [GH-1359] Overwrite Workspace Entity On Sink (#432) RR: https://broadinstitute.atlassian.net/browse/GH-1359 When sinking a resubmitted workflow, the TerraWorkspaceSink would append to the existing entity in the workspace. This leaves the entity in an even more wrong state than the reason the sample was reanalysed. Our fix (as agreed with cloreth) is to clobber the entity in the workspace by deleting the entity if it exists then upserting the new one. * Fix broken requirement install command in docs readme (#434) * Move docstring and document when true. (#436) The PR check fails in the test that Ed excludes on another branch. * [GH-1363] Fix Sporadic Firecloud Test Failures (#437) RR: https://broadinstitute.atlassian.net/browse/GH-1363 Caused by firecloud returning "Launching" as a workflow status - we were testing that the status was in #{"Queued" "Submitted"}. Fix util/poll so logical FALSE can be returned from the action. * Bump ws from 6.2.1 to 6.2.2 in /ui (#433) Bumps [ws](https://github.com/websockets/ws) from 6.2.1 to 6.2.2. 
- [Release notes](https://github.com/websockets/ws/releases) - [Commits](https://github.com/websockets/ws/commits) --- updated-dependencies: - dependency-name: ws dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * GH-1375: Fix AoU nightly test: GET /api/v1/workload fails with spec error from invalid creator (#442) * Don't spec the batch creator as an email address. * update changelog for v0.7.0 (#441) * GH-1332 wfl.tools.workloads/when-done should not exit prematurely (#438) If checking that all workloads are finished, also check that the workload list is nonempty. Fixed incorrect reference to static workload object from workload creation: we instead want to examine the latest version of the workload. * update changelog for v0.7.1 (#445) * Removing the arrays module and the corresponding integration test (#446) * Removing the arrays module and the corresponding integration test * Removed references to the deleted arrays module. Also ran all of the integration tests and verified that they pass * Updating the doc for the aou arrays module to have the appropriate name Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * [GH-1376] Moved generic pipeline processing interfaces from covid -> stage ns (#444) * [GH-1388] Remove `pipeline-versions` from `GET /version` RR: https://broadinstitute.atlassian.net/browse/GH-1388 Inline the `version` key and add a `spec` for the response from the endpoint. Added system tests. * [GH-1389] Delete `^:excluded` arrays system tests RR: https://broadinstitute.atlassian.net/browse/GH-1389 These tests were missed in #446 * Resolving NPM security vulnerabilities (#448) * Fixing dependabot reported security issues as well as build reported npm audit security issues * Fixing lib for is-glob. Part of UI npm packages * Some updates. Some left to do * adding package.lock * Fix by using npm update. 
Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Rex <rexwangcc@gmail.com> * GH-1371: Lint more. (#439) * Reduce noise while kibit'zing. * Add the Eastwood cLinter. * cLint Eastwood api/src and dampen noise. * Lint tests too. * Squash bugs and squelch noise. * Fix tests now that they "work". * nil-if-empty -=> not-empty * Disable some eastwood warnings. * :staus -=> :status * Suppress more warnings. * Expand namespace aliases in keywords. * Update Kondo. * LINT -=> FORMAT * Continue on LINT errors. * Checkpoint some kondo advice. * dotimes -=> repeatedly * Revert "Expand namespace aliases in keywords." * Build UserException to support Eastwood. * repeatedly -=> dotimes * Restore code deleted for debugging. * Work around: unused-fn-args: Function arg p2__11676# never used * Maybe explain foldl better. * [GH-1294] Migrate to use datasets in prod TDR (#453) * Update default values. * Use a dataset in production for integration tests. * Lint. * Remove unused ENV VAR WFL_TDR_SA. * Address comments. * Add sleep to avoid the transient 404s from TDR. * GH-1346 Document workload Executor and Terra implementation (#451) * [GH-1345] Document the workload `Source` and Implementations (#450) RR: https://broadinstitute.atlassian.net/browse/GH-1345 Implementations include: - `Terra DataRepo` Source - `TDR Snapshots` Source * Clean up my testing code... (#456) * Phoned a friend @tbl3rd to fix return type hint in executor docs (#457) * [GH-1347] Document the workload `Sink` and `Terra Workspace` sink implementation (#449) [GH-1347] Document the workload `Sink` and `Terra Workspace` sink implementation RR: https://broadinstitute.atlassian.net/browse/GH-1347 * Move `source` and `sink` Interfaces and Implementations into new namespace (#454) * [GH-1334] Move `source` Interface and Implementation into new namespace RR: https://broadinstitute.atlassian.net/browse/GH-1334 Start to move tests too. 
* [GH-1335] Move `Executor` Code to new Namespace RR: https://broadinstitute.atlassian.net/browse/GH-1335 Doc-strings from #451 * [GH-1408] Don't modify source code on build Generate UserException on prebuild Make linters depend on prebuild. They're run as part of the PR action and can be run in parallel locally. Remove formatting target - use deps for that. * [GH-1393]: Add `/retry` Endpoint (#452) * [GH-1393]: Add `/retry` Endpoint (#452) RR: https://broadinstitute.atlassian.net/browse/GH-1393 Add `POST /api/v1/workload/{uuid}/retry` route, taking the status as the request json body. Add `retry` multimethod to the workload interface. This operation takes the workload as well as as list of workflows to retry. Add `workflows-by-status` multimethod to take a status to filter by. No workloads support `retry` as yet so all implementations throw a 501. Added system and integration tests to lock down this behaviour. * Add test for [GH-1385] RR: https://broadinstitute.atlassian.net/browse/GH-1385 * fix build failures in develop (#458) * Documenting a Workload (#430) * rebasing off of newest develop. Adding initial documentation for documentation of a workload. 
* Adding covid workload module * adding -m flag to python pip command * Added covid-module to the mkdocs.yaml Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Olivia Kotsopoulos <okotsopo@broadinstitute.org> * Changing order of executor block in docs (#459) Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * GH-1326 Move Sink code to new Namespace (#463) * Split specs into their specific modules (#460) * GH-1289 split out covid specs * GH-1289 moved more of the module specific specs to the modules * couple of fixes * move tdr source to source.clj * [GH-1398] Fix `test-start-aou-workload` from polling forever (#462) RR: https://broadinstitute.atlassian.net/browse/GH-1398 Split `workloads/when-done` into two functions: - one that polls until the workload is :finished - another that polls all the workflows Add an assertion to the automation-test * [GH-1401] Add `retry` Attribute to TerraExecutorDetails Type and Instances Thereof (#461) * add lint to the check target * add liquibase changelog to add `retry` to terraexecutordetails * don't return retried workflows form the executor * add a long comment about returning workflows * Fix some namespace issues (#464) * [GH-1409] Add Status Query String to workflows GET request (#466) * GH-1409 Add Status Query String to workflows GET request * Bring cromwell statuses up to date * add test for workflows by status endpoint * [GH-1409] fix some linting errors * Add On Hold back in, trim slashes for endpoints * change to not= * split statuses into rawls workspace * linters hate spaces * Minor changes * Remove trim-slashes and fix endpoints in wfl.tools namespace (#469) * [GH-1411] Describe Workflow Outputs (#471) RR: https://broadinstitute.atlassian.net/browse/GH-1411 Inform the Sink downstream of the executor of the workflow type. When sinking outputs to the TDR, you need to ingest the file outputs followed by the metadata with the files replaced by their fileref URIs. 
Thus, we need a way of determining if a particular output is a File that needs ingesting or just a metadata String that happens to be a gs url. I've had some luck in the past using womtool to describe the WDL and then walk the description of the outputs, dispatching to the relevant handlers (see workflows.clj in the test tools). For lack of a better alternative, I'm going to pursue using womtool to handle outputs in a generic way. Implemented by changing the source and executors to be queues of `[description value]` pairs. In the case of the TDR sources, the description is a qualified keyword describing what the object is (ie. a snapshot). For the terra executor, the description is the workflow description as reported by womtool and the value is the workflow. * Logging added to source namespace (#475) And downgraded level for several noisy logs. * GH-1414 Executor falls back to fetching existing snapshot reference (#477) * [GH-1139] Added new log namespace (#470) * Added new log namespace * Some updates to log namespace * Replace references to clojure.tools.logging with new log namespace * Remove trace logs * add logger protocol to support disabling, set up unit tests * Remove last remaining references to clojure.tools.logging and update documentation * fix linter error * Remove log4j and slf4j dependencies * Keep JSON logs out of EDN description files. * fix for logs getting into workflow edn files * Not sure how this ever worked though. * Fuss. * Simplify. * Restore docstrings. * Force build to suckseed. * Simplify. * Ensure do-or-nil returns nil. * Break up long lines. * change jsonpayload to message and remove obsolete logging tests * reverted some changes back * update docstrings and convert some str's to joins * added notice to documentation * added some documentation for looking up logs locally Co-authored-by: Tom Lyons <tbl3rd@gmail.com> * Remove error txt file (#479) * GH-1417: Remove the "skipped" nil UUID Cromwell status hack. 
(#393) * Remove the "skipped" nil UUID Cromwell status hack. * Patch up the rebase. * Fix rebase conflict. * Patch up rebase. * The status-counts fn is an old Zeroism. * Update docstrings. * Remove duplicate batch/workload-request spec. * Use the batch-namespaced spec. * Glean some lint that collected. * [GH-1419] Terra Executor queue length should consider workflows with null status (#481) Added new integration test. Also in existing integration tests, corrected faulty assumption in mocking: Firecloud returns differently formatted workflows depending on whether they are fetched as part of a submission fetch, or fetched individually. * GH-1397: Sweep and snapshot dataset row IDs that were missed when polling a dataset (#468) * Document Swagger validation of API specs. * Mock snapshots missing TDR row IDs. * Mock datarepo/query-table-between instead of source/find-new-rows. * Verify that rows are shared across updates. * Work around deprecation warning. * Do not call start-source! twice. * Goodby Fibonacci. * Simplify. * Add earliest and latest and clean up. * Make find-new-rows work with combine-tdr-source-details. * Run `clojure -M:format` in this directory * Glean more lint. * Should probably count rows in running snapshots too. * Respond to more Ed comments. * [GH-1139] Logging level and configuration (#480) * Logging level and configuration * unit testing for logging levels * few fixes/pr changes * use prepared statement * GH-1395: Document the new WFL retry capability for users. (#483) * The /retry endpoint is not implemented. * Address some comments from Ed. * Add a note from OK. * [GH-1394] TerraExecutor retry implementation and Sarscov2IlluminaFull integration * [GH-1395] GitHub Pages updates: COVID, staged workflows, retry functionality (#489) * Bump path-parse from 1.0.6 to 1.0.7 in /ui (#491) * Mkdocs updates that improve the doc nav and security holes. 
(#492) * [GH-1402] Write Workflow Outputs to the Terra Data Repository (#474) * [GH-1405] Add Schema for TerraDataRepoSink (#473) [GH-1405] Add Schema for TerraDataRepoSink RR: https://broadinstitute.atlassian.net/browse/GH-1405 Add schemas for the following types - TDRJobStatus - TDRJobType - TerraDataRepoSinkDetails and the TerraDataRepoSink table. * [GH-1314] Start Implementing DataRepo Sink (#476) RR: https://broadinstitute.atlassian.net/browse/GH-1413 Add load/create functions for the data repo sink - leaving update as "unimplemented". * [GH-1415] Implement `update-sink` for TerraDataRepoSink (#478) RR: https://broadinstitute.atlassian.net/browse/GH-1415 This change contains a rough implementation of the TerraDataRepo sink update functionality. After discovering certain hurdles in the original design, I've made some tweaks and incorporated TDR's work https://broadworkbench.atlassian.net/browse/DR-1960 so that we only need one ingest request instead of separately loading the output files and output metadata. At the time of writing, DR-1960 is work in progress so our ingests won't work just yet. Some things to call out in this change that I'm deferring to a follow up change into the feature branch: I need to upload the json file for TDR to ingest. I'm currently using our test outputs bucket as a temporary folder for this which is obviously not a production-ready solution. Possible fixes include creating a scratch bucket for workflow-launcher, creating a bucket per workload, using the executor's execution bucket... etc. Suggestions welcome. Since the ingests always fail, I'm leaving some of the work post-ingest as a TODO. * [GH-1425] Expose `TerraDataRepoSink` Via the HTTP API (#484) RR: https://broadinstitute.atlassian.net/browse/GH-1425 Add specs and register in wfl.api/spec. Add end-to-end system test for reading/writing workflow inputs/outputs to TDR (used the illumina_genotyping_array pipeline as it's smaller than sarscov2_illumina_full). 
Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * [GH-1416] Document DataRepo Sink (#485) RR: https://broadinstitute.atlassian.net/browse/GH-1416 Add section to `sink.md` describing how to configure workflow-launcher to write outputs back to a terra datarepo dataset. * [GH-1430] Remove `stage/validate-or-throw` and inline their implementations into `create-X`. (#490) RR: https://broadinstitute.atlassian.net/browse/GH-1430 Remove stage/validate-or-throw and inline their implementations in their respective create-X functions. This is done because - we were losing context (ie. was this a source/sink/executor) and had a collision in the tdr sink and source impls - these multimethods didn't really make sense and require some redesign work * address @okotsopoulos's feedback Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * GH-1328 Migrate to new Rawls Snapshot V2 endpoints (#493) * ENG-1394 Update post-retry error maps per QA feedback (#494) * Fixing documentation merge conflict * GH-1454 Update covid and executor integration tests to reflect updated method configuration version (#504) * Updated the Changelog for the release * Update of the changelog and version file * Adding changes from last release to changelog.md Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> Co-authored-by: Olivia Kotsopoulos <okotsopo@broadinstitute.org> Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Rex <rexwangcc@gmail.com> Co-authored-by: Edmund Higham <edhigham@gmail.com> Co-authored-by: rfricke-asymmetrik <75337761+rfricke-asymmetrik@users.noreply.github.com>
rhiananthony added a commit that referenced this pull request Oct 8, 2021
* Snapshot creation uses datetime rather than date (#355) * Snapshot creation uses datetime rather than date * Making the name of the column used for datetime interval variable Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introduced in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * bump develop to 0.7.0 (#360) * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md in develop (#362) cherry-pick from main * [GH-1284] Fix docker image generation (#363) The `Dockerfile` is being invoked in the root directory - we don't need to add the parent directory, just the current one. * [GH-1277] Prevent update loop from terminating (#364) Catch all throwables to make sure a workload failure doesn't bring down workflow-launcher. * util/do-or-nil-silently should move to build.clj (#367) * util/do-or-nil-silently should move to build.clj * Fixed indentation to match standard * GH-1226 Support entity set creation when creating submissions. 
(#353) * First pass: create-submissions generates one submission per entity. * Cleaned up TODOs for first pass attempt, reformatted tests to pass lint step in build. * Second pass: support entity set creation when specifying >1 entity for a submission. Refactored bigquery table dump to pull out now-common code. * Address PR feedback To simplify, will generate an entity set even for the singleton input. Removed firecloud/consolidate-entities-to-set as a result. Need to pass in 'expression' to submission creation payload when specifying an entity set. Supplemented integration and unit tests. * Merge v0.6.1 into develop (#366) * Create an interface all COVID work can build on top of. (#369) * Create an interface all hornets work on top of. * Apply suggestions from code review * Update readme.md Build Boad. * GH-1272: Lists are not imported correctly into a Terra Workspace Entity (#368) * Boad -=> Board * Still doesn't round-trip on doubles and values like "19A". * Unit test wfl.tsv. * Ensure TSV mappability. * Ensure TSV is tabulatable. * Fix bug in assert-mapulatable!. * Add wfl.service/rawls.clj for interacting directly with Rawls API (#370) * Rename rawls/create-snapshot -> create-snapshot-reference (#373) We create the snapshot in the TDR, and link to it in the workspace via Rawls. * [GH-1215] Add snapshot_reference_id to Sarscov2IlluminaFull table schema (#375) * Add snapshot_reference_id to Sarscov2IlluminaFull table schema Because schema update hasn't run, altering existing table definition in place. 
* Documentation changes from local Postgres debugging, onboarding - Clarified order of operations when installing, using local Postgres - Fixed broken intra-doc links - Added docs readme with instruction for launching local documentation site * Add instruction for recreating wfl DB to development docs * Remove references to undefined documentation files * [GH-1293] fix test-create-submissions-for-entity-set (#378) Simply getting the submission again is not sufficient to guarantee that the workflow has been queued for execution. Poll instead. * Update documentation and infrastructure for release candidates (#377) * Update documentation and infrastructure for release candidates - creating release candidates - bashing release candidates Restore version override in cli.py. Re-tag latest docker images in cli.py so that version can be overwritten. Update IMAGES target in Makefile to add the :latest tag. Clean up :latest images on distclean target. * fix whitespace * Bump ssri from 6.0.1 to 6.0.2 in /ui (#382) Bumps [ssri](https://github.com/npm/ssri) from 6.0.1 to 6.0.2. - [Release notes](https://github.com/npm/ssri/releases) - [Changelog](https://github.com/npm/ssri/blob/v6.0.2/CHANGELOG.md) - [Commits](https://github.com/npm/ssri/compare/v6.0.1...v6.0.2) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add interactive script ahead of covid-19 demo (#380) * add interactive script ahead of covid-19 demo tomorrow * fix rawls test failure * reformat test * share demo resources with cdc-covid-surveillance firecloud group * corrections from review * format for consistency * clone a dev workspace instead of the prod one * fix test-import-snapshot - workspace changed * fix arrays test failures * [GH-1213] Update DB schema for COVID workload creation (#381) * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table. 
* Initial creat covid workload commit * Verifies dataset as well * Update DB schema. * Address comments. * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Resolve merge conflicts. * Fix issues generated from rebasing. * Fix the failing integration tests. Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Add multimethods for source, executor and sink operations (#383) * Add multimethods for source, executor and sink operations Add skeleton implementations for tdr-source * make use of `!` more consistent? 
* add create operations for source, executor and sink * [GH-1215] Import snapshots within COVID workload (#376) - Data model change: a snapshot_reference_id is linked to a workspace and should be associated with the executor, not the source (TDR in our use case) - Added covid/get-imported-snapshot-reference - nil or snapshot reference from Rawls for snapshot_reference_id in executor details instance (to be created from TerraExecutorDetails) - Added covid/import-snapshot! - import snapshot to workspace, writing to DB if successful - Added integration tests (incomplete coverage) * [GH-1295] Use Rawls to Import Workflow Outputs (#384) RR: https://broadinstitute.atlassian.net/browse/GH-1295 firecloud's flexibleImportEntities has a size limit on the TSV file you POST. Unfortunately, one sarscov2_illumina_full workflow's outputs exceeds this. We can work around this issue by using Rawl's batchUpsert. This is slightly different as it takes a list of operations on how to construct the entity rather than the entity serialised to TSV. In this PR, I've demonstrated how we can use Rawls to import the workflow's outputs into the workspace. I've also updated the demo to do this for a workflow that has already passed. * GH-1285: [COVID] Launch submissions through Rawls, and keep track of them (#372) * Simplify day intervals. * Document wfl.tools.snapshots. * Map name over keyword arguments. * Force the production Rawls. * Checkpoint groups getters. * Patch rebase conflict. * Implement start-covid-workload! kinda. * Checkpoint start-covid-workload! test. * Add covid-workload-request to support unit. * clojure -M:format * Move Rich comment into a unit test. * Respond to comments and tidy up a little * Remove unit test now covered by integration. * Document clojure.test/test-vars to remind me. * Bump url-parse from 1.4.7 to 1.5.1 in /ui (#388) Bumps [url-parse](https://github.com/unshiftio/url-parse) from 1.4.7 to 1.5.1. 
- [Release notes](https://github.com/unshiftio/url-parse/releases) - [Commits](https://github.com/unshiftio/url-parse/compare/1.4.7...1.5.1) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update db schema again based on lasted discussions. (#387) * Update db schemas. * Make the names more clear. * Address comments. * Switch to SQL syntax from XML. * Address more comments. * one more update. * Bump y18n from 4.0.0 to 4.0.1 in /ui (#352) Bumps [y18n](https://github.com/yargs/y18n) from 4.0.0 to 4.0.1. - [Release notes](https://github.com/yargs/y18n/releases) - [Changelog](https://github.com/yargs/y18n/blob/master/CHANGELOG.md) - [Commits](https://github.com/yargs/y18n/commits) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1301] Notify Workload Watchers when "User Visible" Exceptions Occur (#386) RR: https://broadinstitute.atlassian.net/browse/GH-1301 Introduce wfl.util.UserVisibleException - a new type of exception that we should handle to throw and handle errors that users are meant to see. Re-wire the background loop to handle UserVisibleException and also be more exception safe. Start to plumb in email notifications. * Bump hosted-git-info from 2.8.8 to 2.8.9 in /ui (#392) Bumps [hosted-git-info](https://github.com/npm/hosted-git-info) from 2.8.8 to 2.8.9. - [Release notes](https://github.com/npm/hosted-git-info/releases) - [Changelog](https://github.com/npm/hosted-git-info/blob/v2.8.9/CHANGELOG.md) - [Commits](https://github.com/npm/hosted-git-info/compare/v2.8.8...v2.8.9) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump lodash from 4.17.19 to 4.17.21 in /ui (#390) Bumps [lodash](https://github.com/lodash/lodash) from 4.17.19 to 4.17.21. 
- [Release notes](https://github.com/lodash/lodash/releases) - [Commits](https://github.com/lodash/lodash/compare/4.17.19...4.17.21) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1311]: Create/Load COVID workloads (#389) RR: https://broadinstitute.atlassian.net/browse/GH-1311 Add support for loading and creating COVID workloads Add PR test hook for database/ directory Tweak database schema to make it a bit easier to create workloads * [GH-1313] TerraWorkspaceSink/update-sink! (#394) * Add implementation for TerraWorkspaceSink/update-sink! Rework temporary-postresql-database fixture to override wfl-db-config * integration test passes * add make-queue-from-list to be clearer * fix test * use max(id) + 1 for new output row instead of table length + 1 * [GH-1215] COVID TerraExecutor/update-executor! Added update-terra-executor and required helpers: - coerce an available snapshot from source queue to a snapshot reference - add entry point for submission creation - write new workflows to executor details instance Refined queue peek and pop for source and executor. Notably, queue peeks return a meaningful object (snapshot for source, workflow for executor) rather than a database record. Removed now-deprecated code from earlier update loop design. Added integration test. * [GH-1306] Add `workflows` multimethod for workloads (#396) RR: https://broadinstitute.atlassian.net/browse/GH-1306 Add workflows multimethod Change all locations where workflows are accessed via the :workflows keyword to use this function. Implement workflows for COVID I've added a simple test to make sure the SQL is correct Awaiting executor implementation to test with more than 0 workflows. * [GH-1222] Check for new work in TDR (#371) * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! 
implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Release v0.6.1 Patch (#365) * [GH-1284] Fix docker image generation (#363) The `Dockerfile` is being invoked in the root directory - we don't need to add the parent directory, just the current one. * [GH-1277] Prevent update loop from terminating (#364) Catch all throwables to make sure a workload failure doesn't bring down workflow-launcher. * update changelog + bump patch to v0.6.1 * Remove outdated test code. * Remove unused utilities code. * Checkpoint. * Bump timeout for system tests, as it takes longer to finish now. * Update docs to include how to override envs for local testing. * Commit what I had before surgery. * Wrap up. * Add a protocol extension for psql array. * Update DB schema to store row ids for TDR. * Also store TDR row-ids and query interval info in DB. * Move protocol test to integration. * Update utc-now util also randomize the snapshot name with request datetime. 
* Resolve merge conflicts. * Fix weirdness during rebasing... * Move the logic into the multimethod interface. * Take out the utc-now util function. * Move protocol test to jdbc_test and reformat code. * Remove duplicate code. * Remove comment blocks. * Fix broken MD during rebasing. * Crazy formats. * Fix a bug with the snapshot naming. * Add the snapshot_id back to DB schema. * Include job-status in Source Details table. * update TDR jobs that are still running as part of source update loop. * Fix timestamp formatting. * Exclude the problematic test. * Fix a hiccup. * Fix formatting and a small issue with job-result from TDR. * Minor fixes around covid module. * Add unit and integration tests. * Lint. * Let DB control the primary key auto increment. * Fix the issues identified by the tests. * Fix an issue with the jdbc protocol test. * Update unit tests. * Call the multi-method in covid test to make sure everything is plumbed correctly. * Log when TDR snapshot creation job failed. * Address comments. * Increase the interval of system test polling. Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * [GH-1314] expose workflows via `GET /api/v1/workload/{uuid}/workflows` (#397) RR: https://broadinstitute.atlassian.net/browse/GH-1314 Adding a separate workflows selector will allow us to decouple the workload representation from serialisation artefacts that aren’t useful for a user. In this PR, I'm changing the workload data model such that `GET /api/v1/workloads` to only return workload metadata (everything but the workflows) and adding `GET /api/v1/workloads/{uuid}/workflows` to fetch the workflows in the workload. * [GH-1317] start the workload source (#399) RR: https://broadinstitute.atlassian.net/browse/GH-1317 The source of the workload needs to be told that the workload has been started so it can start looking in he datarepo for new data. 
I've added a new multi-method for the source to start it, called on start-workload!. This writes last_checked to be the time when we should start listening for new data.
* TerraExecutor/update-executor should update active, failed workflow statuses (#398)
  1. Within a single tx, fetch executor details workflow records with active or failed status.
  2. For each record, use Firecloud to fetch the up-to-date workflow object. Update the record with the corresponding new workflow status.
  3. Within a single tx, write new record statuses to DB.
  Updated integration tests.
* GH-1307 GH-1312 Update method configuration using snapshot reference (#402)
  In new method `update-method-configuration!`:
  1. GET the method configuration from Firecloud
  2. If the Firecloud-derived mc version doesn't match our DB record, throw
  3. Update the method configuration with the snapshot reference name and POST to Firecloud
  4. Increment our DB record's mc version
  Supplemented integration tests.
* [GH-1322] TDR Snapshots Source (#403)
  RR: https://broadinstitute.atlassian.net/browse/GH-1322
  Adding a new source to workflow-launcher that's a list of snapshots.
* [GH-1318] Stop Workload Source (#400)
  RR: https://broadinstitute.atlassian.net/browse/GH-1318
  Add `stop-source` multimethod and implement for the TerraDataRepoSource. When stopped, the source will stop looking for new rows to snapshot in the TDR. Also added logic for when the workload is finished. In this case, once started, the workload is finished when it has been stopped and both source and executor queues are empty. To do this, I added another multimethod to return the length of the queues. For the executor, the length of the queue is the number of workflows that have not been consumed or aborted. For the source, the length of the queue is the number of unconsumed records whose snapshot job is not 'failed'. Note that this is slightly at odds with peek, where peek is meant to return objects that are ready to be consumed.
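The completion rule described for GH-1318 can be sketched as below. This is only an illustration in Python, not WFL's Clojure implementation: the record types, field names, and status strings (`SourceRecord`, `WorkflowRecord`, `"failed"`, `"Aborted"`) are hypothetical stand-ins for the database rows the commit message refers to.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SourceRecord:
    """Hypothetical row tracking one snapshot creation job in the source."""
    snapshot_job_status: str  # e.g. "running", "succeeded", "failed"
    consumed: bool = False


@dataclass
class WorkflowRecord:
    """Hypothetical row tracking one workflow in the executor."""
    status: str  # e.g. "Running", "Succeeded", "Aborted"
    consumed: bool = False


@dataclass
class Workload:
    started: bool = False
    stopped: bool = False
    source: List[SourceRecord] = field(default_factory=list)
    executor: List[WorkflowRecord] = field(default_factory=list)


def source_queue_length(workload: Workload) -> int:
    """Count unconsumed source records whose snapshot job has not failed."""
    return sum(
        1
        for record in workload.source
        if not record.consumed and record.snapshot_job_status != "failed"
    )


def executor_queue_length(workload: Workload) -> int:
    """Count workflows that have been neither consumed nor aborted."""
    return sum(
        1
        for workflow in workload.executor
        if not workflow.consumed and workflow.status != "Aborted"
    )


def is_finished(workload: Workload) -> bool:
    """A started workload is finished once stopped with both queues empty."""
    return (
        workload.started
        and workload.stopped
        and source_queue_length(workload) == 0
        and executor_queue_length(workload) == 0
    )
```

Note that a source record with a still-running snapshot job counts towards the queue length in this sketch even though a peek would not return it yet, which is the mismatch the commit message points out.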
* GH-1216: [TDR New Work Detection] Extend System Tests for COVID processing (#395) * Update vault to work around NoClassDefFoundError. * Checkpoint failing with PSQLException: FATAL: database ... does not exist database "wfltest147f1cbf976046e68e6beb5eef728674" does not exist * Spec validation still losing sink and source keys. * Update most dependencies. * Restore batch or covid request. * Checkpoint "working" spec.clj. * Lift strip-internals out of endpoint API transaction. * New system test passing again. * Use Ed's strip-internals fix. * clojure -M:format * Respond to comments but now there is no :name in response. * clojure -M:format * Make ::name optional for now. * [GH-1320] add `snapshotReaders` to tdr source (#405) RR: https://broadinstitute.atlassian.net/browse/GH-1320 Adding a list of people who want to access to the snapshots we create to the source request. * Remove liquibase-core from build dependencies. (#407) * [GH-1213] Create covid workload verifications (#379) * Update the specs. * Commit pair programming results. * Commit updated implementation (WIP). * organizing the keyworrds in spec.clj * It seems spec/or requires ky-value pairs! Fix it. * Added start of workflow-request for create-covid-workload function * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table. * We cannot use table as col name in SQL. * Checkpoint. * More. * Initial creat covid workload commit * Verifies dataset as well * Verifications of the source, executor and sink * More verification of sink and executor * Verification of source, sink, executor pre review * Fixing broken test and docstrings instead of comments * Update api/src/wfl/service/rawls.clj * Fix and rewrite the test. * Fix the test part II. 
* Adding more tests for the create-covid-workload and adding messages for the cause of a thrown exception * Updated tests and create covid workload verifications * removing commente3d out code * removing get-workspace function from rawls code since we use firecloud to make this call * Adding lint changes * split up tests. Added verification of dataset table and columns * Lint changes * Fixing merge conflict and lint issue * Fixing import * Updated to base off of develop changes * Adding validation to create workflow logic * Updates to test, spec and create method * Working validation tests * linted. Also removed old Liquibase files no longer used * cleanup of testing namespace * removing try/catch since in verify source function * Moving validation to top of function * Moving docstrings for two functions * Fixing two broken integration tests for covid * Using variables bound in the TDR section for validation * Moving some code within file to group liike code together * Fixed all integration tests. Reorganized location of workload methods in covid module * Fixed all integration tests. Reorganized location of workload methods in covid module * Removing extraneous description for integration tests relating to create workload method * Removing extraneous description for integration tests relating to create workload method * linted * Converted verification methods to multimethods dispatched off name * Updated name of validate source, executor, sink functions * Fixed merge conflicts and linted * removing unnecessary changeset * reverted spec.clj to develop branch. Also cleaned up some cruft * reverted spec.clj to develop branch. Also cleaned up some cruft * Making get-workspace-json function private and adding a public function which checks the workload. Also cleaned up the create verification methods and removed some unnecessary variables. 
Also moved source/sink/executor verification functions to the source/sink/executor sections * Adding default implementations for the create validation functions * linted * fixing firecloud method and tests * Fixing merge conflicts with develop branch * Adding flags for skipping validation of source, executor and sink in requests (#404) * Adding flags for skipping validation to source, executor and sink * linted * Fixing merge conflict Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * linting fixes * linted and fixes some nits * Fixing nits * Fixing nits * lint fixes Co-authored-by: Rex <rexwangcc@gmail.com> Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * GH-1316 Create submission within TerraExecutor update (#406) GH-1316 Create submission within TerraExecutor update Fully implemented `create-submission!` stub, pulling in existing `update-method-configuration!`. Simplified code which previously passed around updated executor with incremented method configuration version. Docstrings converted to use backticked parameter names to preserve camel case. Expanded integration testing. * [GH-1323] to-edn (#409) RR: https://broadinstitute.atlassian.net/browse/GH-1323 https://broadinstitute.atlassian.net/browse/GH-1321 Added util/to-edn a general :type dispatch function to coerce an object into a user-friendly EDN view that gets returned to the user. Allow terra executor to update details table with workflow IDs between update calls Introduce `done?` to test when the workload should be marked `:finished` Fetch workflow inputs/outputs from firecloud and transform them into a more workable form Fix jdbc double exaludations * GH-1298: TerraWorkspaceSink - verify target entity columns exist (#410) * Worried that :missing will get swamped by :attributes though. 
* [GH-1327] Fix Start After Stop (#411)
  In this change I'm fixing the strange behaviour where we can start a covid workload after it was stopped. To do this, I've made it illegal to stop a workload before it's been started. I've done this because stop before start seems strange to me too. The purpose of stop was to stop getting new data from the source but allow all data in the workload pipeline to get flushed (i.e. consume all created snapshots and write all workflows back). Stop was not meant for aborting a workload and cancelling/aborting all active processing. If there is a use case for such a thing (and I think there is), I think we should expose a new endpoint called `POST /api/v1/abort` or something. Of course this PR is pretty small so if the team feels strongly about stop before start I can find another solution.
* [GH-1327] update system tests after /stop changes (#412)
  Removing verification that we can stop before starting and no workflows are run.
* [GH-1333] Fix NullPointerException in TDR Source (#418)
  RR: https://broadinstitute.atlassian.net/browse/GH-1333
  Caused by database changes not propagated to source code. Destructuring old keys for the source led to null values.
* [GH-1336] Fix continual snapshot creation (#419)
  Our BigQuery query was wrong as we were simply calling `toString` on a java.sql.Timestamp. This led to intervals like `['2021-05-26 15:31:18.287641' '2021-05-26T20:00:13']`. The space was causing the problem and BigQuery interpreted that as a date. The fix is to ensure the lower and upper bound of our query-between interval is of the same type by coercing java.sql.Timestamp `last_checked` into a java.time.OffsetDateTime.
* Bump browserslist from 4.13.0 to 4.16.6 in /ui
  Bumps [browserslist](https://github.com/browserslist/browserslist) from 4.13.0 to 4.16.6.
  - [Release notes](https://github.com/browserslist/browserslist/releases)
  - [Changelog](https://github.com/browserslist/browserslist/blob/main/CHANGELOG.md)
  - [Commits](https://github.com/browserslist/browserslist/compare/4.13.0...4.16.6)
  Signed-off-by: dependabot[bot] <support@github.com>
* [GH-1337] fix create snapshot job always failing
  RR: https://broadinstitute.atlassian.net/browse/GH-1337
  Caused by destructuring the tdr job response with the wrong key.
* [GH-1338] Fix wfl not discovering new dataset rows
  RR: https://broadinstitute.atlassian.net/browse/GH-1338
  This isn't a complete fix sadly but it at least solves the problem for low ingest rate projects like COVID. For now, only updating `last_checked` when we successfully found new rows in the dataset.
* [GH-1339] Prevent Snapshot Job IDs from being clobbered with nils
  RR: https://broadinstitute.atlassian.net/browse/GH-1339
  When we update the status of snapshot creation jobs, we only get the job metadata when the job is complete. As a consequence, `nil`s are propagated up the call stack and the database record is clobbered with nils, including the `snapshot_creation_job_id` column. To fix this, always return the metadata and log the failure should one occur.
* bump version in develop to 0.8.0 (#413)
* [GH-1331] document `/stop` and `/workflows` (#417)
  [GH-1331] Document `/stop` and `/workflows` endpoints
  RR: https://broadinstitute.atlassian.net/browse/GH-1331
  Updating documentation for changes in v0.7.0:
  - workload-response no longer returning workflows
  - `GET /api/v1/workload/{uuid}/workflows`
  - `POST /api/v1/stop`
  Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com>
* Add test-covid-workload to parallel test group (#424) … and fix method configuration namespace.
* [GH-1341] Fix accepting malformed labels in workload request RR: https://broadinstitute.atlassian.net/browse/GH-1341 Enforce labels are of the form "name:value" where - name starts with a letter followed by any combination of letters, numbers, underscores and dashes - value is any non-blank string not containing `:` I had to include labels in the batch workload request as the coercion layer got confused when you gave it bad labels. * [GH-1344] Enforce valid email addresses in workload `watchers` * [GH-1348] Remove :pipeline from COVID workload request (#426) RR: https://broadinstitute.atlassian.net/browse/GH-1348 The :pipeline attribute of the workload request is meaningless for COVID-type workloads. Note that these workloads aren't even specific to COVID-19 processing, rather a generalisation. In this PR, I've removed the :pipeline attribute. To do this, all workload operations on maps with a nil :pipeline will be implemented by the functions in the covid namespace. * [GH-1350] fix Terra Workspace sink validation (#427) RR: https://broadinstitute.atlassian.net/browse/GH-1350 The "Terra Workspace" sink validation did not validate that the attributes listed in fromOutputs exist in the specified entity type in the workspace. This PR fixes this. * Bump dns-packet from 1.3.1 to 1.3.4 in /ui Bumps [dns-packet](https://github.com/mafintosh/dns-packet) from 1.3.1 to 1.3.4. - [Release notes](https://github.com/mafintosh/dns-packet/releases) - [Changelog](https://github.com/mafintosh/dns-packet/blob/master/CHANGELOG.md) - [Commits](https://github.com/mafintosh/dns-packet/compare/v1.3.1...v1.3.4) Signed-off-by: dependabot[bot] <support@github.com> * [GH-1324] Get or import snapshot (#422) Within rawls namespace: - get-snapshot-references: lazily paginate through all snapshot references in a workspace. - create-or-get-snapshot-reference: if importing the snapshot fails with a 409, find the first existing snapshot reference matching the snapshot id. 
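The get-or-import behaviour described for GH-1324 can be sketched as follows. This is a Python illustration against an in-memory stand-in, not the project's Clojure code or the real Rawls API: `FakeWorkspace`, `Conflict`, the page size, and the `referenceId`/`snapshotId` keys are all assumptions made for the example.

```python
from typing import Dict, Iterator, List


class Conflict(Exception):
    """Hypothetical stand-in for an HTTP 409 response."""


class FakeWorkspace:
    """In-memory stand-in for a workspace that holds snapshot references."""

    def __init__(self) -> None:
        self.references: List[Dict[str, str]] = []

    def create_reference(self, snapshot_id: str, name: str) -> Dict[str, str]:
        # Reject a second reference with the same name, as a 409 would.
        if any(ref["name"] == name for ref in self.references):
            raise Conflict(name)
        ref = {
            "referenceId": f"ref-{len(self.references)}",
            "name": name,
            "snapshotId": snapshot_id,
        }
        self.references.append(ref)
        return ref

    def list_references(self, offset: int, limit: int) -> List[Dict[str, str]]:
        return self.references[offset:offset + limit]


def get_snapshot_references(
    workspace: FakeWorkspace, page_size: int = 2
) -> Iterator[Dict[str, str]]:
    """Lazily yield references, fetching the next page only when needed."""
    offset = 0
    while True:
        page = workspace.list_references(offset, page_size)
        if not page:
            return
        yield from page
        offset += len(page)


def create_or_get_snapshot_reference(
    workspace: FakeWorkspace, snapshot_id: str, name: str
) -> Dict[str, str]:
    """Import the snapshot; on conflict, return the first matching reference."""
    try:
        return workspace.create_reference(snapshot_id, name)
    except Conflict:
        for ref in get_snapshot_references(workspace):
            if ref["snapshotId"] == snapshot_id:
                return ref
        raise  # Conflict, but no reference to this snapshot exists.
```

Because the pagination is a generator, the search stops requesting pages as soon as a matching reference turns up.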
* [GH-1354] Enforce `snapshotReaders` are email addresses (#428) RR: https://broadinstitute.atlassian.net/browse/GH-1354 This guards against erroring while creating snapshots if a user makes a typo. * [GH-1356] Fix Coercion Failure for TDR Snapshots Source (#429) RR: https://broadinstitute.atlassian.net/browse/GH-1356 Caused by https://github.com/metosin/reitit/issues/494 Fix as per https://broadinstitute.atlassian.net/browse/GH-1348 Included integration test for reitit coercions to catch this sooner. * [GH-1357] Fix failure to sink workflow outputs (#431) RR: https://broadinstitute.atlassian.net/browse/GH-1357 I assumed (wrongly) the way Rawls represented workflows in a submission was the same as how cromwell represented workflows. I also mocked these representations incorrectly in tests. To fix this, only use the submission to fetch workflow IDs, then use firecloud's workflow and workflow/outputs endpoints so we don't have to adapt/mock multiple data models. * [GH-1359] Overwrite Workspace Entity On Sink (#432) RR: https://broadinstitute.atlassian.net/browse/GH-1359 When sinking a resubmitted workflow, the TerraWorkspaceSink would append to the existing entity in the workspace. This leaves the entity in an even more wrong state than the reason the sample was reanalysed. Our fix (as agreed with cloreth) is to clobber the entity in the workspace by deleting the entity if it exists then upserting the new one. * Fix broken requirement install command in docs readme (#434) * Move docstring and document when true. (#436) The PR check fails in the test that Ed excludes on another branch. * [GH-1363] Fix Sporadic Firecloud Test Failures (#437) RR: https://broadinstitute.atlassian.net/browse/GH-1363 Caused by firecloud returning "Launching" as a workflow status - we were testing that the status was in #{"Queued" "Submitted"}. Fix util/poll so logical FALSE can be returned from the action. 
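The `util/poll` fix mentioned for GH-1363 is not spelled out in the message. One plausible reading, sketched here in Python rather than the project's Clojure, is that polling should continue only while the action yields "nothing" (`None` here, `nil` in Clojure), so that a logical false counts as a real result. The function name, parameters, and timeout behaviour below are assumptions for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll(
    action: Callable[[], Optional[T]],
    interval_seconds: float = 1.0,
    max_attempts: int = 10,
) -> T:
    """Call `action` until it returns something other than None.

    Testing `is not None` instead of truthiness lets the action
    return False (or 0, or an empty list) as a legitimate answer.
    """
    for _ in range(max_attempts):
        result = action()
        if result is not None:
            return result
        time.sleep(interval_seconds)
    raise TimeoutError(f"no result after {max_attempts} attempts")
```

A truthiness check (`if result:`) would have kept polling on `False` until the attempts ran out, which is the kind of failure the commit title suggests.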
* Bump ws from 6.2.1 to 6.2.2 in /ui (#433) Bumps [ws](https://github.com/websockets/ws) from 6.2.1 to 6.2.2. - [Release notes](https://github.com/websockets/ws/releases) - [Commits](https://github.com/websockets/ws/commits) --- updated-dependencies: - dependency-name: ws dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * GH-1375: Fix AoU nightly test: GET /api/v1/workload fails with spec error from invalid creator (#442) * Don't spec the batch creator as an email address. * update changelog for v0.7.0 (#441) * GH-1332 wfl.tools.workloads/when-done should not exit prematurely (#438) If checking that all workloads are finished, also check that the workload list is nonempty. Fixed incorrect reference to static workload object from workload creation: we instead want to examine the latest version of the workload. * update changelog for v0.7.1 (#445) * Removing the arrays module and the corresponding integration test (#446) * Removing the arrays module and the corresponding integraiton test * Removed references to the deleted arrays module. Also ran all of the integration tests and verified that they pass * Updating the doc for the aou arrays module to have the appropriate name Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * [GH-1376] Moved generic pipeline processing interfaces from covid -> stage ns (#444) * [GH-1388] Remove `pipeline-versions` from `GET /version` RR: https://broadinstitute.atlassian.net/browse/GH-1388 Inline the `version` key and add a `spec` for the response from the endpoint. Added system tests. 
* [GH-1389] Delete `^:excluded` arrays system tests RR: https://broadinstitute.atlassian.net/browse/GH-1389 These tests were missed in https://github.com/broadinstitute/wfl/pull/446 * Resolving NPM security vulnerabilities (#448) * Fixing dependabot reported security issues as well as build reported npm audit security issues * Fixing lib for is-glob. Part of UI npm packages * Some updates. Some left to do * adding package.lock * Fix by using npm update. Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Rex <rexwangcc@gmail.com> * GH-1371: Lint more. (#439) * Reduce noise while kibit'zing. * Add the Eastwood cLinter. * cLint Eastwood api/src and dampen noise. * Lint tests too. * Squash bugs and squelch noise. * Fix tests now that they "work". * nil-if-empty -=> not-empty * Disable some eastwood warnings. * :staus -=> :status * Suppress more warnings. * Expand namespace aliases in keywords. * Update Kondo. * LINT -=> FORMAT * Continue on LINT errors. * Checkpoint some kondo advice. * dotimes -=> repeatedly * Revert "Expand namespace aliases in keywords." * Build UserException to support Eastwood. * repeatedly -=> dotimes * Restore code deleted for debugging. * Work around: unused-fn-args: Function arg p2__11676# never used * Maybe explain foldl better. * [GH-1294] Migrate to use datasets in prod TDR (#453) * Update default values. * Use a dataset in production for integration tests. * Lint. * Remove unused ENV VAR WFL_TDR_SA. * Address comments. * Add sleep to avoid the transient 404s from TDR. * GH-1346 Document workload Executor and Terra implementation (#451) * [GH-1345] Document the workload `Source` and Implementations (#450) RR: https://broadinstitute.atlassian.net/browse/GH-1345 Implementations include: - `Terra DataRepo` Source - `TDR Snapshots` Source * Clean up my testing code... 
(#456) * Phoned a friend @tbl3rd to fix return type hint in executor docs (#457) * [GH-1347] Document the workload `Sink` and `Terra Workspace` sink implementation (#449) [GH-1347] Document the workload `Sink` and `Terra Workspace` sink implementation RR: https://broadinstitute.atlassian.net/browse/GH-1347 * Move `source` and `sink` Interfaces and Implementations into new namespace (#454) * [GH-1334] Move `source` Interface and Implementation into new namespace RR: https://broadinstitute.atlassian.net/browse/GH-1334 Start to move tests too. * [GH-1335] Move `Executor` Code to new Namespace RR: https://broadinstitute.atlassian.net/browse/GH-1335 Doc-strings from https://github.com/broadinstitute/wfl/pull/451 * [GH-1408] Don't modify source code on build Generate UserException on prebuild Make linters depend on prebuild. They're run as part of the PR action and can be run in parallel locally. Remove formatting target - use deps for that. * [GH-1393]: Add `/retry` Endpoint (#452) * [GH-1393]: Add `/retry` Endpoint (#452) RR: https://broadinstitute.atlassian.net/browse/GH-1393 Add `POST /api/v1/workload/{uuid}/retry` route, taking the status as the request json body. Add `retry` multimethod to the workload interface. This operation takes the workload as well as as list of workflows to retry. Add `workflows-by-status` multimethod to take a status to filter by. No workloads support `retry` as yet so all implementations throw a 501. Added system and integration tests to lock down this behaviour. * Add test for [GH-1385] RR: https://broadinstitute.atlassian.net/browse/GH-1385 * fix build failures in develop (#458) * Documenting a Workload (#430) * rebasing off of newest develop. Adding initial documentation for documentation of a workload. 
* Adding covid workload module * adding -m flag to python pip command * Added covid-module to the mkdocs.yaml Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Olivia Kotsopoulos <okotsopo@broadinstitute.org> * Changing order of executor block in docs (#459) Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * GH-1326 Move Sink code to new Namespace (#463) * Split specs into their specific modules (#460) * GH-1289 split out covid specs * GH-1289 moved more of the module specific specs to the modules * couple of fixes * move tdr source to source.clj * [GH-1398] Fix `test-start-aou-workload` from polling forever (#462) RR: https://broadinstitute.atlassian.net/browse/GH-1398 Split `workloads/when-done` into two functions: - one that polls until the workload is :finished - another that polls all the workflows Add an assertion to the automation-test * [GH-1401] Add `retry` Attribute to TerraExecutorDetails Type and Instances Thereof (#461) * add lint to the check target * add liquibase changelog to add `retry` to terraexecutordetails * don't return retried workflows from the executor * add a long comment about returning workflows * Fix some namespace issues (#464) * [GH-1409] Add Status Query String to workflows GET request (#466) * GH-1409 Add Status Query String to workflows GET request * Bring cromwell statuses up to date * add test for workflows by status endpoint * [GH-1409] fix some linting errors * Add On Hold back in, trim slashes for endpoints * change to not= * split statuses into rawls workspace * linters hate spaces * Minor changes * Remove trim-slashes and fix endpoints in wfl.tools namespace (#469) * [GH-1411] Describe Workflow Outputs (#471) RR: https://broadinstitute.atlassian.net/browse/GH-1411 Inform the Sink downstream of the executor of the workflow type. When sinking outputs to the TDR, you need to ingest the file outputs followed by the metadata with the files replaced by their fileref URIs.
Thus, we need a way of determining if a particular output is a File that needs ingesting or just a metadata String that happens to be a gs url. I've had some luck in the past using womtool to describe the WDL and then walk the description of the outputs, dispatching to the relevant handlers (see workflows.clj in the test tools). For lack of a better alternative, I'm going to pursue using womtool to handle outputs in a generic way. Implemented by changing the source and executors to be queues of `[description value]` pairs. In the case of the TDR sources, the description is a qualified keyword describing what the object is (ie. a snapshot). For the terra executor, the description is the workflow description as reported by womtool and the value is the workflow. * Logging added to source namespace (#475) And downgraded level for several noisy logs. * GH-1414 Executor falls back to fetching existing snapshot reference (#477) * [GH-1139] Added new log namespace (#470) * Added new log namespace * Some updates to log namespace * Replace references to clojure.tools.logging with new log namespace * Remove trace logs * add logger protocol to support disabling, set up unit tests * Remove last remaining references to clojure.tools.logging and update documentation * fix linter error * Remove log4j and slf4j dependencies * Keep JSON logs out of EDN description files. * fix for logs getting into workflow edn files * Not sure how this ever worked though. * Fuss. * Simplify. * Restore docstrings. * Force build to suckseed. * Simplify. * Ensure do-or-nil returns nil. * Break up long lines. * change jsonpayload to message and remove obsolete logging tests * reverted some changes back * update docstrings and convert some str's to joins * added notice to documentation * added some documentation for looking up logs locally Co-authored-by: Tom Lyons <tbl3rd@gmail.com> * Remove error txt file (#479) * GH-1417: Remove the "skipped" nil UUID Cromwell status hack. 
(#393) * Remove the "skipped" nil UUID Cromwell status hack. * Patch up the rebase. * Fix rebase conflict. * Patch up rebase. * The status-counts fn is an old Zeroism. * Update docstrings. * Remove duplicate batch/workload-request spec. * Use the batch-namespaced spec. * Glean some lint that collected. * [GH-1419] Terra Executor queue length should consider workflows with null status (#481) Added new integration test. Also in existing integration tests, corrected faulty assumption in mocking: Firecloud returns differently formatted workflows depending on whether they are fetched as part of a submission fetch, or fetched individually. * GH-1397: Sweep and snapshot dataset row IDs that were missed when polling a dataset (#468) * Document Swagger validation of API specs. * Mock snapshots missing TDR row IDs. * Mock datarepo/query-table-between instead of source/find-new-rows. * Verify that rows are shared across updates. * Work around deprecation warning. * Do not call start-source! twice. * Goodby Fibonacci. * Simplify. * Add earliest and latest and clean up. * Make find-new-rows work with combine-tdr-source-details. * Run `clojure -M:format` in this directory * Glean more lint. * Should probably count rows in running snapshots too. * Respond to more Ed comments. * [GH-1139] Logging level and configuration (#480) * Logging level and configuration * unit testing for logging levels * few fixes/pr changes * use prepared statement * GH-1395: Document the new WFL retry capability for users. (#483) * The /retry endpoint is not implemented. * Address some comments from Ed. * Add a note from OK. * [GH-1394] TerraExecutor retry implementation and Sarscov2IlluminaFull integration * [GH-1395] GitHub Pages updates: COVID, staged workflows, retry functionality (#489) * Bump path-parse from 1.0.6 to 1.0.7 in /ui (#491) * Mkdocs updates that improve the doc nav and security holes. 
(#492) * [GH-1402] Write Workflow Outputs to the Terra Data Repository (#474) * [GH-1405] Add Schema for TerraDataRepoSink (#473) [GH-1405] Add Schema for TerraDataRepoSink RR: https://broadinstitute.atlassian.net/browse/GH-1405 Add schemas for the following types - TDRJobStatus - TDRJobType - TerraDataRepoSinkDetails and the TerraDataRepoSink table. * [GH-1314] Start Implementing DataRepo Sink (#476) RR: https://broadinstitute.atlassian.net/browse/GH-1413 Add load/create functions for the data repo sink - leaving update as "unimplemented". * [GH-1415] Implement `update-sink` for TerraDataRepoSink (#478) RR: https://broadinstitute.atlassian.net/browse/GH-1415 This change contains a rough implementation of the TerraDataRepo sink update functionality. After discovering certain hurdles in the original design, I've made some tweaks and incorporated TDR's work https://broadworkbench.atlassian.net/browse/DR-1960 so that we only need one ingest request instead of separately loading the output files and output metadata. At the time of writing, DR-1960 is work in progress so our ingests won't work just yet. Some things to call out in this change that I'm deferring to a follow up change into the feature branch: I need to upload the json file for TDR to ingest. I'm currently using our test outputs bucket as a temporary folder for this which is obviously not a production-ready solution. Possible fixes include creating a scratch bucket for workflow-launcher, creating a bucket per workload, using the executor's execution bucket... etc. Suggestions welcome. Since the ingests always fail, I'm leaving some of the work post-ingest as a TODO. * [GH-1425] Expose `TerraDataRepoSink` Via the HTTP API (#484) RR: https://broadinstitute.atlassian.net/browse/GH-1425 Add specs and register in wfl.api/spec. Add end-to-end system test for reading/writing workflow inputs/outputs to TDR (used the illumina_genotyping_array pipeline as it's smaller than sarscov2_illumina_full). 
Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * [GH-1416] Document DataRepo Sink (#485) RR: https://broadinstitute.atlassian.net/browse/GH-1416 Add section to `sink.md` describing how to configure workflow-launcher to write outputs back to a terra datarepo dataset. * [GH-1430] Remove `stage/validate-or-throw` and inline their implementations into `create-X`. (#490) RR: https://broadinstitute.atlassian.net/browse/GH-1430 Remove stage/validate-or-throw and inline their implementations in their respective create-X functions. This is done because - we were losing context (ie. was this a source/sink/executor) and had a collision in the tdr sink and source impls - these multimethods didn't really make sense and require some redesign work * address @okotsopoulos's feedback Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * GH-1328 Migrate to new Rawls Snapshot V2 endpoints (#493) * ENG-1394 Update post-retry error maps per QA feedback (#494) * [GH-1303] Google Cloud Logging Alerts (#486) * Documentation for monitoring alerts and error response * Fix lint error and add md file to yml * [GH-1418] sourceLocation null fix (#487) * initial sourcelocation log fix * Update jdbc macros to include line number * Github actions should build docs on develop. (#495) * [GH-1301] Add Slack watcher support for user exceptions. (#467) * Add a Slack module. * Thanks to the linter. * Notify Slack channels UserExceptions. * Commit the consensus summary from mobbing. * Agent, WIP. * Resolve merge conflict. * Checkpoint. * Update the add-notification api. * Use a persistentqueue for notification agent. * Lint. * Commit tests. * CLean up. * Lint. * Linnnnnnt. * DB changes. * Update existing docs. * Resolve comments. * Add watcher spec. * Add watchers on all workload creations. * Remove comment blocks and fix unit tests. * Feature complete. * Attempt to fix the test. * Tests work now! * Lint. 
* Move deserialization to the top level. * deref should have a boundary. * Further separate the logic. * Lint. * Resolve comments. * Lint. * Bump the time out for slack integration tests. * Make watchers [SlackChannel ChannelId] | [EmailAddress EmailAddress] | EmailAddress. * Add a debugging line for slack test. * Mark the test as pending. * In fact we don't need 2 dimentional arrays for watchers. * Redo serialization and de-serialization, so it's more clear and secure. * Lint. * Thanks @tbl3rd. Address comments. * DB migration. * Thanks to the team, finally have a rough consensus. * Lint. * Update docs. * Our test caught a real issue! * Add a nightly system test against dev WFL (#498) * Make system test more flexible when kick off through make. * Add a nightly system test. * Add a nightly target for make and run it in Github Actions. * Typo. * oops. * Add a badge for WFL nightly test. (#500) * Remove the UI component of WFL. (#499) * Remove the ui dir and make module references and ci refs. * bump version of reitit. * Add back the swagger page to API. * Add back ui dir but only keep the proxy image. * [GH-1353] Fix spec error when skipValidation sent (#497) * Fix spec error when skipValidation sent * fix lint errors * revert other fix, save source with a dataset edn if skipvalidation true * [GH-1441] Stop swallowing UserExceptions (#501) * userexception initial * actually return userexceptions and log them as warnings * small changes * GH-1439 Succeeded workflow status should be permissible to retry (#503) * GH-1454 Update covid and executor integration tests to reflect updated method configuration version (#504) * GH-1450 Add retry attribute to TerraDataRepoSource type and instances (#505) * GH-1446: fix system tests again (#502) * Port changes from old review and debug branch. * 1 file(s) formatted incorrectly * Examine wedged workloads. * I dropped the database. * Commit debug hacks so line numbers agree with logs. * Move trace to flag only updating workloads. 
* Fix when-all-workflows-finish. * Checkpoint ingest-illumina-genotyping-array-files. * Checkpoint BQ interface. * Still: Can not delete a dataset being used by snapshots * Handle another dataset-or-snapshot case. * Work around {:defaultSnapshotId nil}. * Fix dataset fixture. * Don't fail test-workflows-by-status when no workloads. * Try using wfl-dev instead of general-dev-billing-account. * Revert "Try using wfl-dev instead of general-dev-billing-account." * Keep non-null inputs and options from "Failed" workflows. (#508) * GH-1462: Make :watchers key always optional. (#515) * Make :watchers key always optional. * Mention that the watchers field is optional in docs. * Bump AoU Arrays.wdl version to 2.4.1 to drop default (#514) genotype_concordance_threshold to 0.95. * GH-1444 TerraDataRepoSource should not snapshot more frequently than every 20 min (#513) Also back to updating last_checked any time we poll TDR, even if it did not find new rows. * Updating the version for the develop branch to 0.9.0 (#517) Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * GH-1373 rename removeSucceeded in retry runbook (#519) Since we are really keeping the failed workflows. * GH-1476 Log failed snapshot creation job's metadata and result (#518) Gracefully handle when datarepo/job-result throws ... which happens if the snapshot creation job failed due to dataset locking, for ex. Add unit tests for new function source/result-or-catch. * Trial merge of main to develop. * Fix merge conflicts around logging in server.clj. * Generate new CHANGELOG.md for 0.9.0. 
* Merge fixes Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> Co-authored-by: Olivia Kotsopoulos <okotsopo@broadinstitute.org> Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Rex <rexwangcc@gmail.com> Co-authored-by: Edmund Higham <edhigham@gmail.com> Co-authored-by: rfricke-asymmetrik <75337761+rfricke-asymmetrik@users.noreply.github.com>
rfricke-asymmetrik
added a commit
that referenced
this pull request
Nov 5, 2021
* v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the Clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introduced in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Release v0.6.1 Patch (#365) * [GH-1284] Fix docker image generation (#363) The `Dockerfile` is being invoked in the root directory - we don't need to add the parent directory, just the current one. * [GH-1277] Prevent update loop from terminating (#364) Catch all throwables to make sure a workload failure doesn't bring down workflow-launcher.
* update changelog + bump patch to v0.6.1 * release 0.7.0 - SARSCov2 Illumina Full support In Terra (#440) * Snapshot creation uses datetime rather than date (#355) * Snapshot creation uses datetime rather than date * Making the name of the column used for datetime interval variable Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * bump develop to 0.7.0 (#360) * util/do-or-nil-silently should move to build.clj (#367) * util/do-or-nil-silently should move to build.clj * Fixed indentation to match standard * GH-1226 Support entity set creation when creating submissions. (#353) * First pass: create-submissions generates one submission per entity. * Cleaned up TODOs for first pass attempt, reformatted tests to pass lint step in build. * Second pass: support entity set creation when specifying >1 entity for a submission. Refactored bigquery table dump to pull out now-common code. * Address PR feedback To simplify, will generate an entity set even for the singleton input. Removed firecloud/consolidate-entities-to-set as a result. Need to pass in 'expression' to submission creation payload when specifying an entity set. Supplemented integration and unit tests. * Create an interface all COVID work can build on top of. (#369) * Create an interface all hornets work on top of. * Apply suggestions from code review * Update readme.md Build Boad. * GH-1272: Lists are not imported correctly into a Terra Workspace Entity (#368) * Boad -=> Board * Still doesn't round-trip on doubles and values like "19A". * Unit test wfl.tsv. * Ensure TSV mappability. * Ensure TSV is tabulatable. * Fix bug in assert-mapulatable!. * Add wfl.service/rawls.clj for interacting directly with Rawls API (#370) * Rename rawls/create-snapshot -> create-snapshot-reference (#373) We create the snapshot in the TDR, and link to it in the workspace via Rawls. 
* [GH-1215] Add snapshot_reference_id to Sarscov2IlluminaFull table schema (#375) * Add snapshot_reference_id to Sarscov2IlluminaFull table schema Because schema update hasn't run, altering existing table definition in place. * Documentation changes from local Postgres debugging, onboarding - Clarified order of operations when installing, using local Postgres - Fixed broken intra-doc links - Added docs readme with instruction for launching local documentation site * Add instruction for recreating wfl DB to development docs * Remove references to undefined documentation files * [GH-1293] fix test-create-submissions-for-entity-set (#378) Simply getting the submission again is not sufficient to guarantee that the workflow has been queued for execution. Poll instead. * Update documentation and infrastructure for release candidates (#377) * Update documentation and infrastructure for release candidates - creating release candidates - bashing release candidates Restore version override in cli.py. Re-tag latest docker images in cli.py so that version can be overwritten. Update IMAGES target in Makefile to add the :latest tag. Clean up :latest images on distclean target. * fix whitespace * Bump ssri from 6.0.1 to 6.0.2 in /ui (#382) Bumps [ssri](https://github.com/npm/ssri) from 6.0.1 to 6.0.2.
- [Release notes](https://github.com/npm/ssri/releases) - [Changelog](https://github.com/npm/ssri/blob/v6.0.2/CHANGELOG.md) - [Commits](https://github.com/npm/ssri/compare/v6.0.1...v6.0.2) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add interactive script ahead of covid-19 demo (#380) * add interactive script ahead of covid-19 demo tomorrow * fix rawls test failure * reformat test * share demo resources with cdc-covid-surveillance firecloud group * corrections from review * format for consistency * clone a dev workspace instead of the prod one * fix test-import-snapshot - workspace changed * fix arrays test failures * [GH-1213] Update DB schema for COVID workload creation (#381) * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table. * Initial create covid workload commit * Verifies dataset as well * Update DB schema. * Address comments. * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the Clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introduced in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main.
Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Resolve merge conflicts. * Fix issues generated from rebasing. * Fix the failing integration tests. Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Add multimethods for source, executor and sink operations (#383) * Add multimethods for source, executor and sink operations Add skeleton implementations for tdr-source * make use of `!` more consistent? * add create operations for source, executor and sink * [GH-1215] Import snapshots within COVID workload (#376) - Data model change: a snapshot_reference_id is linked to a workspace and should be associated with the executor, not the source (TDR in our use case) - Added covid/get-imported-snapshot-reference - nil or snapshot reference from Rawls for snapshot_reference_id in executor details instance (to be created from TerraExecutorDetails) - Added covid/import-snapshot! - import snapshot to workspace, writing to DB if successful - Added integration tests (incomplete coverage) * [GH-1295] Use Rawls to Import Workflow Outputs (#384) RR: https://broadinstitute.atlassian.net/browse/GH-1295 firecloud's flexibleImportEntities has a size limit on the TSV file you POST. Unfortunately, one sarscov2_illumina_full workflow's outputs exceeds this. We can work around this issue by using Rawl's batchUpsert. This is slightly different as it takes a list of operations on how to construct the entity rather than the entity serialised to TSV. In this PR, I've demonstrated how we can use Rawls to import the workflow's outputs into the workspace. 
I've also updated the demo to do this for a workflow that has already passed. * GH-1285: [COVID] Launch submissions through Rawls, and keep track of them (#372) * Simplify day intervals. * Document wfl.tools.snapshots. * Map name over keyword arguments. * Force the production Rawls. * Checkpoint groups getters. * Patch rebase conflict. * Implement start-covid-workload! kinda. * Checkpoint start-covid-workload! test. * Add covid-workload-request to support unit. * clojure -M:format * Move Rich comment into a unit test. * Respond to comments and tidy up a little * Remove unit test now covered by integration. * Document clojure.test/test-vars to remind me. * Bump url-parse from 1.4.7 to 1.5.1 in /ui (#388) Bumps [url-parse](https://github.com/unshiftio/url-parse) from 1.4.7 to 1.5.1. - [Release notes](https://github.com/unshiftio/url-parse/releases) - [Commits](https://github.com/unshiftio/url-parse/compare/1.4.7...1.5.1) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update db schema again based on lasted discussions. (#387) * Update db schemas. * Make the names more clear. * Address comments. * Switch to SQL syntax from XML. * Address more comments. * one more update. * Bump y18n from 4.0.0 to 4.0.1 in /ui (#352) Bumps [y18n](https://github.com/yargs/y18n) from 4.0.0 to 4.0.1. - [Release notes](https://github.com/yargs/y18n/releases) - [Changelog](https://github.com/yargs/y18n/blob/master/CHANGELOG.md) - [Commits](https://github.com/yargs/y18n/commits) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1301] Notify Workload Watchers when "User Visible" Exceptions Occur (#386) RR: https://broadinstitute.atlassian.net/browse/GH-1301 Introduce wfl.util.UserVisibleException - a new type of exception that we should handle to throw and handle errors that users are meant to see. 
Re-wire the background loop to handle UserVisibleException and also be more exception safe. Start to plumb in email notifications. * Bump hosted-git-info from 2.8.8 to 2.8.9 in /ui (#392) Bumps [hosted-git-info](https://github.com/npm/hosted-git-info) from 2.8.8 to 2.8.9. - [Release notes](https://github.com/npm/hosted-git-info/releases) - [Changelog](https://github.com/npm/hosted-git-info/blob/v2.8.9/CHANGELOG.md) - [Commits](https://github.com/npm/hosted-git-info/compare/v2.8.8...v2.8.9) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump lodash from 4.17.19 to 4.17.21 in /ui (#390) Bumps [lodash](https://github.com/lodash/lodash) from 4.17.19 to 4.17.21. - [Release notes](https://github.com/lodash/lodash/releases) - [Commits](https://github.com/lodash/lodash/compare/4.17.19...4.17.21) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1311]: Create/Load COVID workloads (#389) RR: https://broadinstitute.atlassian.net/browse/GH-1311 Add support for loading and creating COVID workloads Add PR test hook for database/ directory Tweak database schema to make it a bit easier to create workloads * [GH-1313] TerraWorkspaceSink/update-sink! (#394) * Add implementation for TerraWorkspaceSink/update-sink! Rework temporary-postresql-database fixture to override wfl-db-config * integration test passes * add make-queue-from-list to be clearer * fix test * use max(id) + 1 for new output row instead of table length + 1 * [GH-1215] COVID TerraExecutor/update-executor! Added update-terra-executor and required helpers: - coerce an available snapshot from source queue to a snapshot reference - add entry point for submission creation - write new workflows to executor details instance Refined queue peek and pop for source and executor. 
Notably, queue peeks return a meaningful object (snapshot for source, workflow for executor) rather than a database record. Removed now-deprecated code from earlier update loop design. Added integration test. * [GH-1306] Add `workflows` multimethod for workloads (#396) RR: https://broadinstitute.atlassian.net/browse/GH-1306 Add workflows multimethod. Change all locations where workflows are accessed via the :workflows keyword to use this function. Implement workflows for COVID. I've added a simple test to make sure the SQL is correct. Awaiting executor implementation to test with more than 0 workflows. * [GH-1222] Check for new work in TDR (#371) * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the Clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introduced in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML.
* Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Release v0.6.1 Patch (#365) * [GH-1284] Fix docker image generation (#363) The `Dockerfile` is being invoked in the root directory - we don't need to add the parent directory, just the current one. * [GH-1277] Prevent update loop from terminating (#364) Catch all throwables to make sure a workload failure doesn't bring down workflow-launcher. * update changelog + bump patch to v0.6.1 * Remove outdated test code. * Remove unused utilities code. * Checkpoint. * Bump timeout for system tests, as it takes longer to finish now. * Update docs to include how to override envs for local testing. * Commit what I had before surgery. * Wrap up. * Add a protocol extension for psql array. * Update DB schema to store row ids for TDR. * Also store TDR row-ids and query interval info in DB. * Move protocol test to integration. * Update utc-now util also randomize the snapshot name with request datetime. * Resolve merge conflicts. * Fix weirdness during rebasing... * Move the logic into the multimethod interface. * Take out the utc-now util function. * Move protocol test to jdbc_test and reformat code. * Remove duplicate code. * Remove comment blocks. * Fix broken MD during rebasing. * Crazy formats. * Fix a bug with the snapshot naming. * Add the snapshot_id back to DB schema. * Include job-status in Source Details table. * update TDR jobs that are still running as part of source update loop. * Fix timestamp formatting. * Exclude the problematic test. * Fix a hiccup. * Fix formatting and a small issue with job-result from TDR. * Minor fixes around covid module. * Add unit and integration tests. * Lint. * Let DB control the primary key auto increment. * Fix the issues identified by the tests. * Fix an issue with the jdbc protocol test. * Update unit tests. 
* Call the multi-method in covid test to make sure everything is plumbed correctly. * Log when TDR snapshot creation job failed. * Address comments. * Increase the interval of system test polling. Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * [GH-1314] expose workflows via `GET /api/v1/workload/{uuid}/workflows` (#397) RR: https://broadinstitute.atlassian.net/browse/GH-1314 Adding a separate workflows selector will allow us to decouple the workload representation from serialisation artefacts that aren’t useful for a user. In this PR, I'm changing the workload data model so that `GET /api/v1/workloads` returns only workload metadata (everything but the workflows) and adding `GET /api/v1/workloads/{uuid}/workflows` to fetch the workflows in the workload. * [GH-1317] start the workload source (#399) RR: https://broadinstitute.atlassian.net/browse/GH-1317 The source of the workload needs to be told that the workload has been started so it can start looking in the datarepo for new data. I've added a new multi-method for the source to start it, called on start-workload!. This sets last_checked to the time when we should start listening for new data. * TerraExecutor/update-executor should update active, failed workflow statuses (#398) 1. Within a single tx, fetch executor details workflow records with active or failed status. 2. For each record, use Firecloud to fetch the up-to-date workflow object. Update the record with the corresponding new workflow status. 3. Within a single tx, write new record statuses to DB. Updated integration tests. * GH-1307 GH-1312 Update method configuration using snapshot reference (#402) In new method `update-method-configuration!`: 1. GET the method configuration from Firecloud 2. If the Firecloud-derived mc version doesn't match our DB record, throw 3.
Update the method configuration with the snapshot reference name and POST to Firecloud 4. Increment our DB record's mc version Supplemented integration tests. * [GH-1322] TDR Snapshots Source (#403) RR: https://broadinstitute.atlassian.net/browse/GH-1322 Adding a new source to workflow-launcher that's a list of snapshots. * [GH-1318] Stop Workload Source (#400) RR: https://broadinstitute.atlassian.net/browse/GH-1318 Add `stop-source` multimethod and implement it for the TerraDataRepoSource. When stopped, the source will stop looking for new rows to snapshot in the TDR. Also added logic for when the workload is finished. In this case, once started, the workload is finished when it has been stopped and both source and executor queues are empty. To do this, I added another multimethod to return the length of the queues. For the executor, the length of the queue is the number of workflows that have not been consumed or aborted. For the source, the length of the queue is the number of unconsumed records whose snapshot job is not 'failed'. Note that this is slightly at odds with peek, where peek is meant to return objects that are ready to be consumed. * GH-1216: [TDR New Work Detection] Extend System Tests for COVID processing (#395) * Update vault to work around NoClassDefFoundError. * Checkpoint failing with PSQLException: FATAL: database ... does not exist database "wfltest147f1cbf976046e68e6beb5eef728674" does not exist * Spec validation still losing sink and source keys. * Update most dependencies. * Restore batch or covid request. * Checkpoint "working" spec.clj. * Lift strip-internals out of endpoint API transaction. * New system test passing again. * Use Ed's strip-internals fix. * clojure -M:format * Respond to comments but now there is no :name in response. * clojure -M:format * Make ::name optional for now. 
* [GH-1320] add `snapshotReaders` to tdr source (#405) RR: https://broadinstitute.atlassian.net/browse/GH-1320 Adding a list of people who want to access to the snapshots we create to the source request. * Remove liquibase-core from build dependencies. (#407) * [GH-1213] Create covid workload verifications (#379) * Update the specs. * Commit pair programming results. * Commit updated implementation (WIP). * organizing the keyworrds in spec.clj * It seems spec/or requires ky-value pairs! Fix it. * Added start of workflow-request for create-covid-workload function * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table. * We cannot use table as col name in SQL. * Checkpoint. * More. * Initial creat covid workload commit * Verifies dataset as well * Verifications of the source, executor and sink * More verification of sink and executor * Verification of source, sink, executor pre review * Fixing broken test and docstrings instead of comments * Update api/src/wfl/service/rawls.clj * Fix and rewrite the test. * Fix the test part II. * Adding more tests for the create-covid-workload and adding messages for the cause of a thrown exception * Updated tests and create covid workload verifications * removing commente3d out code * removing get-workspace function from rawls code since we use firecloud to make this call * Adding lint changes * split up tests. Added verification of dataset table and columns * Lint changes * Fixing merge conflict and lint issue * Fixing import * Updated to base off of develop changes * Adding validation to create workflow logic * Updates to test, spec and create method * Working validation tests * linted. 
Also removed old Liquibase files no longer used * cleanup of testing namespace * removing try/catch since in verify source function * Moving validation to top of function * Moving docstrings for two functions * Fixing two broken integration tests for covid * Using variables bound in the TDR section for validation * Moving some code within file to group liike code together * Fixed all integration tests. Reorganized location of workload methods in covid module * Fixed all integration tests. Reorganized location of workload methods in covid module * Removing extraneous description for integration tests relating to create workload method * Removing extraneous description for integration tests relating to create workload method * linted * Converted verification methods to multimethods dispatched off name * Updated name of validate source, executor, sink functions * Fixed merge conflicts and linted * removing unnecessary changeset * reverted spec.clj to develop branch. Also cleaned up some cruft * reverted spec.clj to develop branch. Also cleaned up some cruft * Making get-workspace-json function private and adding a public function which checks the workload. Also cleaned up the create verification methods and removed some unnecessary variables. 
Also moved source/sink/executor verification functions to the source/sink/executor sections * Adding default implementations for the create validation functions * linted * fixing firecloud method and tests * Fixing merge conflicts with develop branch * Adding flags for skipping validation of source, executor and sink in requests (#404) * Adding flags for skipping validation to source, executor and sink * linted * Fixing merge conflict Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * linting fixes * linted and fixes some nits * Fixing nits * Fixing nits * lint fixes Co-authored-by: Rex <rexwangcc@gmail.com> Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * GH-1316 Create submission within TerraExecutor update (#406) GH-1316 Create submission within TerraExecutor update Fully implemented `create-submission!` stub, pulling in existing `update-method-configuration!`. Simplified code which previously passed around updated executor with incremented method configuration version. Docstrings converted to use backticked parameter names to preserve camel case. Expanded integration testing. * [GH-1323] to-edn (#409) RR: https://broadinstitute.atlassian.net/browse/GH-1323 https://broadinstitute.atlassian.net/browse/GH-1321 Added util/to-edn, a general :type dispatch function to coerce an object into a user-friendly EDN view that gets returned to the user. Allow terra executor to update details table with workflow IDs between update calls Introduce `done?` to test when the workload should be marked `:finished` Fetch workflow inputs/outputs from firecloud and transform them into a more workable form Fix jdbc double evaluations * GH-1298: TerraWorkspaceSink - verify target entity columns exist (#410) * Worried that :missing will get swamped by :attributes though. 
* [GH-1327] Fix Start After Stop (#411) [GH-1327] Fix Start After Stop In this change I'm fixing the strange behaviour where we can start a covid workload after it was stopped. To do this, I've made it illegal to stop a workload before it's been started. I've done this because stop before start seems strange to me too. The purpose of stop was to stop getting new data from the source but allow all data in the workload pipeline to get flushed (i.e. consume all created snapshots and write all workflows back). Stop was not meant for aborting a workload and cancelling/aborting all active processing. If there is a use case for such a thing (and I think there is), I think we should expose a new endpoint called `POST /api/v1/abort` or something. Of course this PR is pretty small so if the team feels strongly about stop before start I can find another solution. * [GH-1327] update system tests after /stop changes (#412) [GH-1327] update system tests after /stop changes removing verification that we can stop before starting and no workflows are run. * [GH-1333] Fix NullPointerException in TDR Source (#418) RR: https://broadinstitute.atlassian.net/browse/GH-1333 Caused by database changes not propagated to source code. Destructuring old keys for the source led to null values. * [GH-1336] Fix continual snapshot creation (#419) Our BigQuery query was wrong as we were simply calling `toString` on a java.sql.Timestamp. This led to intervals like `['2021-05-26 15:31:18.287641' '2021-05-26T20:00:13']`. The space was causing the problem and BigQuery interpreted that as a date. The fix is to ensure the lower and upper bounds of our query-between interval are of the same type by coercing java.sql.Timestamp `last_checked` into a java.time.OffsetDateTime. * [GH-1337] fix create snapshot job always failing RR: https://broadinstitute.atlassian.net/browse/GH-1337 Caused by destructuring the tdr job response with the wrong key. 
* [GH-1338] Fix wfl not discovering new dataset rows RR: https://broadinstitute.atlassian.net/browse/GH-1338 This isn't a complete fix sadly but it at least solves the problem for low ingest rate projects like COVID. For now, only updating `last_checked` when we successfully found new rows in the dataset. * [GH-1339] Prevent Snapshot Job IDs from being clobbered with nils RR: https://broadinstitute.atlassian.net/browse/GH-1339 When we update the status of snapshot creation jobs, we only get the job metadata when the job is complete. As a consequence, `nil`s are propagated up the call stack and the database record is clobbered with nils, including the `snapshot_creation_job_id` column. To fix this, always return the metadata and log the failure should one occur. * [GH-1331] document `/stop` and `/workflows` (#417) [GH-1331] Document `/stop` and `/workflows` endpoints RR: https://broadinstitute.atlassian.net/browse/GH-1331 Updating documentation for changes in v0.7.0: - workload-response no longer returning workflows - `GET /api/v1/workload/{uuid}/workflows` - `POST /api/v1/stop` Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * Add test-covid-workload to parallel test group (#424) … and fix method configuration namespace. * [GH-1341] Fix accepting malformed labels in workload request RR: https://broadinstitute.atlassian.net/browse/GH-1341 Enforce labels are of the form "name:value" where - name starts with a letter followed by any combination of letters, numbers, underscores and dashes - value is any non-blank string not containing `:` I had to include labels in the batch workload request as the coercion layer got confused when you gave it bad labels. * [GH-1344] Enforce valid email addresses in workload `watchers` * [GH-1348] Remove :pipeline from COVID workload request (#426) RR: https://broadinstitute.atlassian.net/browse/GH-1348 The :pipeline attribute of the workload request is meaningless for COVID-type workloads. 
Note that these workloads aren't even specific to COVID-19 processing, rather a generalisation. In this PR, I've removed the :pipeline attribute. To do this, all workload operations on maps with a nil :pipeline will be implemented by the functions in the covid namespace. * [GH-1350] fix Terra Workspace sink validation (#427) RR: https://broadinstitute.atlassian.net/browse/GH-1350 The "Terra Workspace" sink validation did not validate that the attributes listed in fromOutputs exist in the specified entity type in the workspace. This PR fixes this. * Bump dns-packet from 1.3.1 to 1.3.4 in /ui Bumps [dns-packet](https://github.com/mafintosh/dns-packet) from 1.3.1 to 1.3.4. - [Release notes](https://github.com/mafintosh/dns-packet/releases) - [Changelog](https://github.com/mafintosh/dns-packet/blob/master/CHANGELOG.md) - [Commits](https://github.com/mafintosh/dns-packet/compare/v1.3.1...v1.3.4) Signed-off-by: dependabot[bot] <support@github.com> * [GH-1325] Get or import snapshot (#422) Within rawls namespace: - get-snapshot-references: lazily paginate through all snapshot references in a workspace. - create-or-get-snapshot-reference: if importing the snapshot fails with a 409, find the first existing snapshot reference matching the snapshot id. * [GH-1354] Enforce `snapshotReaders` are email addresses (#428) RR: https://broadinstitute.atlassian.net/browse/GH-1354 This guards against erroring while creating snapshots if a user makes a typo. * [GH-1356] Fix Coercion Failure for TDR Snapshots Source (#429) RR: https://broadinstitute.atlassian.net/browse/GH-1356 Caused by https://github.com/metosin/reitit/issues/494 Fix as per https://broadinstitute.atlassian.net/browse/GH-1348 Included integration test for reitit coercions to catch this sooner. 
* [GH-1357] Fix failure to sink workflow outputs (#431) RR: https://broadinstitute.atlassian.net/browse/GH-1357 I assumed (wrongly) that the way Rawls represented workflows in a submission was the same as how cromwell represented workflows. I also mocked these representations incorrectly in tests. To fix this, only use the submission to fetch workflow IDs, then use firecloud's workflow and workflow/outputs endpoints so we don't have to adapt/mock multiple data models. * [GH-1359] Overwrite Workspace Entity On Sink (#432) RR: https://broadinstitute.atlassian.net/browse/GH-1359 When sinking a resubmitted workflow, the TerraWorkspaceSink would append to the existing entity in the workspace. This leaves the entity in an even more wrong state than the reason the sample was reanalysed. Our fix (as agreed with cloreth) is to clobber the entity in the workspace by deleting the entity if it exists then upserting the new one. * Fix broken requirement install command in docs readme (#434) * Move docstring and document when true. (#436) The PR check fails in the test that Ed excludes on another branch. * [GH-1361] Disable Failing Tests for v0.7.0 Release (#435) RR: https://broadinstitute.atlassian.net/browse/GH-1361 This is a low risk change for v0.7.0 as this test does not exercise reachable product code. Disabling: - wfl.integration.modules.arrays-test/test-update-arrays-workload! * [GH-1363] Fix Sporadic Firecloud Test Failures (#437) RR: https://broadinstitute.atlassian.net/browse/GH-1363 Caused by firecloud returning "Launching" as a workflow status - we were testing that the status was in #{"Queued" "Submitted"}. Fix util/poll so logical FALSE can be returned from the action. * Bump ws from 6.2.1 to 6.2.2 in /ui (#433) Bumps [ws](https://github.com/websockets/ws) from 6.2.1 to 6.2.2. - [Release notes](https://github.com/websockets/ws/releases) - [Commits](https://github.com/websockets/ws/commits) --- updated-dependencies: - dependency-name: ws dependency-type: indirect ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * update changelog for v0.7.0 Co-authored-by: Rhian Anthony <rhian.anthony@gmail.com> Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Olivia Kotsopoulos <okotsopo@broadinstitute.org> Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Rex <rexwangcc@gmail.com> * release v0.7.1 - Bug Fixes (#443) * bump for v0.7.1 patch * GH-1375: Fix AoU nightly test: GET /api/v1/workload fails with spec error from invalid creator (#442) * Don't spec the batch creator as an email address. * update changelog for v0.7.1 Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Release/0.8.0 rc (#506) * Snapshot creation uses datetime rather than date (#355) * Snapshot creation uses datetime rather than date * Making the name of the column used for datetime interval variable Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. 
Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * bump develop to 0.7.0 (#360) * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md in develop (#362) cherry-pick from main * [GH-1284] Fix docker image generation (#363) The `Dockerfile` is being invoked in the root directory - we don't need to add the parent directory, just the current one. * [GH-1277] Prevent update loop from terminating (#364) Catch all throwables to make sure a workload failure doesn't bring down workflow-launcher. * util/do-or-nil-silently should move to build.clj (#367) * util/do-or-nil-silently should move to build.clj * Fixed indentation to match standard * GH-1226 Support entity set creation when creating submissions. (#353) * First pass: create-submissions generates one submission per entity. * Cleaned up TODOs for first pass attempt, reformatted tests to pass lint step in build. * Second pass: support entity set creation when specifying >1 entity for a submission. Refactored bigquery table dump to pull out now-common code. * Address PR feedback To simplify, will generate an entity set even for the singleton input. Removed firecloud/consolidate-entities-to-set as a result. Need to pass in 'expression' to submission creation payload when specifying an entity set. Supplemented integration and unit tests. * Merge v0.6.1 into develop (#366) * Create an interface all COVID work can build on top of. (#369) * Create an interface all hornets work on top of. * Apply suggestions from code review * Update readme.md Build Boad. * GH-1272: Lists are not imported correctly into a Terra Workspace Entity (#368) * Boad -=> Board * Still doesn't round-trip on doubles and values like "19A". * Unit test wfl.tsv. * Ensure TSV mappability. * Ensure TSV is tabulatable. 
* Fix bug in assert-mapulatable!. * Add wfl.service/rawls.clj for interacting directly with Rawls API (#370) * Rename rawls/create-snapshot -> create-snapshot-reference (#373) We create the snapshot in the TDR, and link to it in the workspace via Rawls. * [GH-1215] Add snapshot_reference_id to Sarscov2IlluminaFull table schema (#375) * Add snapshot_reference_id to Sarscov2IlluminaFull table schema Because schema update hasn't run, altering existing table definition in place. * Documentation changes from local Postgres debugging, onboarding - Clarified order of operations when installing, using local Postgres - Fixed broken intra-doc links - Added docs readme with instruction for launching local documentation site * Add instruction for recreating wfl DB to development docs * Remove references to undefined documentation files * [GH-1293] fix test-create-submissions-for-entity-set (#378) Simply getting the submission again is not sufficient to guarantee that the workflow has been queued for execution. Poll instead. * Update documentation and infrastructure for release candidates (#377) * Update documentation and infrastructure for release candidates - creating release candiates - bashing release candidates Restore version override in cli.py. Re-tag latest docker images in cli.py so that version can be overwritten. Update IMAGES target in Makefile to add the :latest tag. Clean up :latest images on distclean target. * fix whitespace * Bump ssri from 6.0.1 to 6.0.2 in /ui (#382) Bumps [ssri](https://github.com/npm/ssri) from 6.0.1 to 6.0.2. 
- [Release notes](https://github.com/npm/ssri/releases) - [Changelog](https://github.com/npm/ssri/blob/v6.0.2/CHANGELOG.md) - [Commits](https://github.com/npm/ssri/compare/v6.0.1...v6.0.2) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add interactive script ahead of covid-19 demo (#380) * add interactive script ahead of covid-19 demo tomorrow * fix rawls test failure * reformat test * share demo remources with cdc-covid-surveillance firecloud group * corrections from review * format for consistency * clone a dev workspace instead of the prod one * fix test-import-snapshot - workspace changed * fix arrays test failures * [GH-1213] Update DB schema for COVID workload creation (#381) * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table. * Initial creat covid workload commit * Verifies dataset as well * Update DB schema. * Address comments. * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. 
Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Resolve merge conflicts. * Fix issues generated from rebasing. * Fix the failing integration tests. Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Add multimethods for source, executor and sink operations (#383) * Add multimethods for source, executor and sink operations Add skeleton implementations for tdr-source * make use of `!` more consistent? * add create operations for source, executor and sink * [GH-1215] Import snapshots within COVID workload (#376) - Data model change: a snapshot_reference_id is linked to a workspace and should be associated with the executor, not the source (TDR in our use case) - Added covid/get-imported-snapshot-reference - nil or snapshot reference from Rawls for snapshot_reference_id in executor details instance (to be created from TerraExecutorDetails) - Added covid/import-snapshot! - import snapshot to workspace, writing to DB if successful - Added integration tests (incomplete coverage) * [GH-1295] Use Rawls to Import Workflow Outputs (#384) RR: https://broadinstitute.atlassian.net/browse/GH-1295 firecloud's flexibleImportEntities has a size limit on the TSV file you POST. Unfortunately, one sarscov2_illumina_full workflow's outputs exceeds this. We can work around this issue by using Rawl's batchUpsert. This is slightly different as it takes a list of operations on how to construct the entity rather than the entity serialised to TSV. In this PR, I've demonstrated how we can use Rawls to import the workflow's outputs into the workspace. 
I've also updated the demo to do this for a workflow that has already passed. * GH-1285: [COVID] Launch submissions through Rawls, and keep track of them (#372) * Simplify day intervals. * Document wfl.tools.snapshots. * Map name over keyword arguments. * Force the production Rawls. * Checkpoint groups getters. * Patch rebase conflict. * Implement start-covid-workload! kinda. * Checkpoint start-covid-workload! test. * Add covid-workload-request to support unit. * clojure -M:format * Move Rich comment into a unit test. * Respond to comments and tidy up a little * Remove unit test now covered by integration. * Document clojure.test/test-vars to remind me. * Bump url-parse from 1.4.7 to 1.5.1 in /ui (#388) Bumps [url-parse](https://github.com/unshiftio/url-parse) from 1.4.7 to 1.5.1. - [Release notes](https://github.com/unshiftio/url-parse/releases) - [Commits](https://github.com/unshiftio/url-parse/compare/1.4.7...1.5.1) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update db schema again based on lasted discussions. (#387) * Update db schemas. * Make the names more clear. * Address comments. * Switch to SQL syntax from XML. * Address more comments. * one more update. * Bump y18n from 4.0.0 to 4.0.1 in /ui (#352) Bumps [y18n](https://github.com/yargs/y18n) from 4.0.0 to 4.0.1. - [Release notes](https://github.com/yargs/y18n/releases) - [Changelog](https://github.com/yargs/y18n/blob/master/CHANGELOG.md) - [Commits](https://github.com/yargs/y18n/commits) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1301] Notify Workload Watchers when "User Visible" Exceptions Occur (#386) RR: https://broadinstitute.atlassian.net/browse/GH-1301 Introduce wfl.util.UserVisibleException - a new type of exception that we should handle to throw and handle errors that users are meant to see. 
Re-wire the background loop to handle UserVisibleException and also be more exception safe. Start to plumb in email notifications. * Bump hosted-git-info from 2.8.8 to 2.8.9 in /ui (#392) Bumps [hosted-git-info](https://github.com/npm/hosted-git-info) from 2.8.8 to 2.8.9. - [Release notes](https://github.com/npm/hosted-git-info/releases) - [Changelog](https://github.com/npm/hosted-git-info/blob/v2.8.9/CHANGELOG.md) - [Commits](https://github.com/npm/hosted-git-info/compare/v2.8.8...v2.8.9) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump lodash from 4.17.19 to 4.17.21 in /ui (#390) Bumps [lodash](https://github.com/lodash/lodash) from 4.17.19 to 4.17.21. - [Release notes](https://github.com/lodash/lodash/releases) - [Commits](https://github.com/lodash/lodash/compare/4.17.19...4.17.21) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1311]: Create/Load COVID workloads (#389) RR: https://broadinstitute.atlassian.net/browse/GH-1311 Add support for loading and creating COVID workloads Add PR test hook for database/ directory Tweak database schema to make it a bit easier to create workloads * [GH-1313] TerraWorkspaceSink/update-sink! (#394) * Add implementation for TerraWorkspaceSink/update-sink! Rework temporary-postresql-database fixture to override wfl-db-config * integration test passes * add make-queue-from-list to be clearer * fix test * use max(id) + 1 for new output row instead of table length + 1 * [GH-1215] COVID TerraExecutor/update-executor! Added update-terra-executor and required helpers: - coerce an available snapshot from source queue to a snapshot reference - add entry point for submission creation - write new workflows to executor details instance Refined queue peek and pop for source and executor. 
Notably, queue peeks return a meaningful object (snapshot for source, workflow for executor) rather than a database record. Removed now-deprecated code from earlier update loop design. Added integration test. * [GH-1306] Add `workflows` multimethod for workloads (#396) RR: https://broadinstitute.atlassian.net/browse/GH-1306 Add workflows multimethod Change all locations where workflows are accessed via the :workflows keyword to use this function. Implement workflows for COVID I've added a simple test to make sure the SQL is correct Awaiting executor implementation to test with more than 0 workflows. * [GH-1222] Check for new work in TDR (#371) * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. 
* Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Release v0.6.1 Patch (#365) * [GH-1284] Fix docker image generation (#363) The `Dockerfile` is being invoked in the root directory - we don't need to add the parent directory, just the current one. * [GH-1277] Prevent update loop from terminating (#364) Catch all throwables to make sure a workload failure doesn't bring down workflow-launcher. * update changelog + bump patch to v0.6.1 * Remove outdated test code. * Remove unused utilities code. * Checkpoint. * Bump timeout for system tests, as it takes longer to finish now. * Update docs to include how to override envs for local testing. * Commit what I had before surgery. * Wrap up. * Add a protocol extension for psql array. * Update DB schema to store row ids for TDR. * Also store TDR row-ids and query interval info in DB. * Move protocol test to integration. * Update utc-now util also randomize the snapshot name with request datetime. * Resolve merge conflicts. * Fix weirdness during rebasing... * Move the logic into the multimethod interface. * Take out the utc-now util function. * Move protocol test to jdbc_test and reformat code. * Remove duplicate code. * Remove comment blocks. * Fix broken MD during rebasing. * Crazy formats. * Fix a bug with the snapshot naming. * Add the snapshot_id back to DB schema. * Include job-status in Source Details table. * update TDR jobs that are still running as part of source update loop. * Fix timestamp formatting. * Exclude the problematic test. * Fix a hiccup. * Fix formatting and a small issue with job-result from TDR. * Minor fixes around covid module. * Add unit and integration tests. * Lint. * Let DB control the primary key auto increment. * Fix the issues identified by the tests. * Fix an issue with the jdbc protocol test. * Update unit tests. 
* Call the multi-method in covid test to make sure everything is plumbed correctly. * Log when TDR snapshot creation job failed. * Address comments. * Increase the interval of system test polling. Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * [GH-1314] expose workflows via `GET /api/v1/workload/{uuid}/workflows` (#397) RR: https://broadinstitute.atlassian.net/browse/GH-1314 Adding a separate workflows selector will allow us to decouple the workload representation from serialisation artefacts that aren’t useful for a user. In this PR, I'm changing the workload data model so that `GET /api/v1/workloads` returns only workload metadata (everything but the workflows) and adding `GET /api/v1/workloads/{uuid}/workflows` to fetch the workflows in the workload. * [GH-1317] start the workload source (#399) RR: https://broadinstitute.atlassian.net/browse/GH-1317 The source of the workload needs to be told that the workload has been started so it can start looking in the datarepo for new data. I've added a new multi-method for the source to start it, called on start-workload!. This writes last_checked to be the time when we should start listening for new data. * TerraExecutor/update-executor should update active, failed workflow statuses (#398) 1. Within a single tx, fetch executor details workflow records with active or failed status. 2. For each record, use Firecloud to fetch the up-to-date workflow object. Update the record with the corresponding new workflow status. 3. Within a single tx, write new record statuses to DB. Updated integration tests. * GH-1307 GH-1312 Update method configuration using snapshot reference (#402) In new method `update-method-configuration!`: 1. GET the method configuration from Firecloud 2. If the Firecloud-derived mc version doesn't match our DB record, throw 3. 
Update the method configuration with the snapshot reference name and POST to Firecloud 4. Increment our DB record's mc version Supplemented integration tests. * [GH-1322] TDR Snapshots Source (#403) RR: https://broadinstitute.atlassian.net/browse/GH-1322 Adding a new source to workflow-launcher that's a list of snapshots. * [GH-1318] Stop Workload Source (#400) RR: https://broadinstitute.atlassian.net/browse/GH-1318 Add `stop-source` multimethod and implement it for the TerraDataRepoSource. When stopped, the source will stop looking for new rows to snapshot in the TDR. Also added logic for when the workload is finished. Once started, the workload is finished when it has been stopped and both the source and executor queues are empty. To do this, I added another multimethod to return the length of the queues. For the executor, the length of the queue is the number of workflows that have not been consumed or aborted. For the source, the length of the queue is the number of unconsumed records whose snapshot job is not 'failed'. Note that this is slightly at odds with peek, where peek is meant to return objects that are ready to be consumed. * GH-1216: [TDR New Work Detection] Extend System Tests for COVID processing (#395) * Update vault to work around NoClassDefFoundError. * Checkpoint failing with PSQLException: FATAL: database ... does not exist database "wfltest147f1cbf976046e68e6beb5eef728674" does not exist * Spec validation still losing sink and source keys. * Update most dependencies. * Restore batch or covid request. * Checkpoint "working" spec.clj. * Lift strip-internals out of endpoint API transaction. * New system test passing again. * Use Ed's strip-internals fix. * clojure -M:format * Respond to comments but now there is no :name in response. * clojure -M:format * Make ::name optional for now.
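The [GH-1318] entry above defines when a started workload counts as finished. The rule can be sketched in Python as follows. This is an illustration only, not WFL's Clojure code: the record fields (`consumed`, `job_status`, `status`) and the `"Aborted"` status string are hypothetical names, and "not consumed or aborted" is read here as "neither consumed nor aborted".

```python
def source_queue_length(records):
    """Count unconsumed source records whose snapshot job has not failed."""
    return sum(1 for r in records
               if not r["consumed"] and r["job_status"] != "failed")


def executor_queue_length(workflows):
    """Count workflows that are neither consumed nor aborted (assumed reading)."""
    return sum(1 for w in workflows
               if not w["consumed"] and w["status"] != "Aborted")


def is_finished(started, stopped, records, workflows):
    """A started workload is finished once it is stopped and both queues are empty."""
    return (started and stopped
            and source_queue_length(records) == 0
            and executor_queue_length(workflows) == 0)
```

Under this reading, a source record whose snapshot job is still running counts toward the queue length even though it is not yet ready to be consumed, which is the tension with peek that the entry notes.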
* [GH-1320] add `snapshotReaders` to tdr source (#405) RR: https://broadinstitute.atlassian.net/browse/GH-1320 Adding to the source request a list of people who want access to the snapshots we create. * Remove liquibase-core from build dependencies. (#407) * [GH-1213] Create covid workload verifications (#379) * Update the specs. * Commit pair programming results. * Commit updated implementation (WIP). * organizing the keywords in spec.clj * It seems spec/or requires key-value pairs! Fix it. * Added start of workflow-request for create-covid-workload function * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table. * We cannot use table as col name in SQL. * Checkpoint. * More. * Initial create covid workload commit * Verifies dataset as well * Verifications of the source, executor and sink * More verification of sink and executor * Verification of source, sink, executor pre review * Fixing broken test and docstrings instead of comments * Update api/src/wfl/service/rawls.clj * Fix and rewrite the test. * Fix the test part II. * Adding more tests for the create-covid-workload and adding messages for the cause of a thrown exception * Updated tests and create covid workload verifications * removing commented out code * removing get-workspace function from rawls code since we use firecloud to make this call * Adding lint changes * split up tests. Added verification of dataset table and columns * Lint changes * Fixing merge conflict and lint issue * Fixing import * Updated to base off of develop changes * Adding validation to create workflow logic * Updates to test, spec and create method * Working validation tests * linted. 
Also removed old Liquibase files no longer used * cleanup of testing namespace * removing try/catch since in verify source function * Moving validation to top of function * Moving docstrings for two functions * Fixing two broken integration tests for covid * Using variables bound in the TDR section for validation * Moving some code within file to group like code together * Fixed all integration tests. Reorganized location of workload methods in covid module * Fixed all integration tests. Reorganized location of workload methods in covid module * Removing extraneous description for integration tests relating to create workload method * Removing extraneous description for integration tests relating to create workload method * linted * Converted verification methods to multimethods dispatched off name * Updated name of validate source, executor, sink functions * Fixed merge conflicts and linted * removing unnecessary changeset * reverted spec.clj to develop branch. Also cleaned up some cruft * reverted spec.clj to develop branch. Also cleaned up some cruft * Making get-workspace-json function private and adding a public function which checks the workload. Also cleaned up the create verification methods and removed some unnecessary variables. 
Also moved source/sink/executor verification functions to the source/sink/executor sections * Adding default implementations for the create validation functions * linted * fixing firecloud method and tests * Fixing merge conflicts with develop branch * Adding flags for skipping validation of source, executor and sink in requests (#404) * Adding flags for skipping validation to source, executor and sink * linted * Fixing merge conflict Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * linting fixes * linted and fixes some nits * Fixing nits * Fixing nits * lint fixes Co-authored-by: Rex <rexwangcc@gmail.com> Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * GH-1316 Create submission within TerraExecutor update (#406) GH-1316 Create submission within TerraExecutor update Fully implemented `create-submission!` stub, pulling in existing `update-method-configuration!`. Simplified code which previously passed around updated executor with incremented method configuration version. Docstrings converted to use backticked parameter names to preserve camel case. Expanded integration testing. * [GH-1323] to-edn (#409) RR: https://broadinstitute.atlassian.net/browse/GH-1323 https://broadinstitute.atlassian.net/browse/GH-1321 Added util/to-edn a general :type dispatch function to coerce an object into a user-friendly EDN view that gets returned to the user. Allow terra executor to update details table with workflow IDs between update calls Introduce `done?` to test when the workload should be marked `:finished` Fetch workflow inputs/outputs from firecloud and transform them into a more workable form Fix jdbc double exaludations * GH-1298: TerraWorkspaceSink - verify target entity columns exist (#410) * Worried that :missing will get swamped by :attributes though. 
* [GH-1327] Fix Start After Stop (#411) [GH-1327] Fix Start After Stop In this change I'm fixing the strange behaviour where we can start a covid workload after it was stopped. To do this, I've made it illegal to stop a workload before it's been started. I've done this because stop before start seems strange to me too. The purpose of stop was to stop getting new data from the source but allow all data in the workload pipeline to get flushed (i.e. consume all created snapshots and write all workflows back). Stop was not meant for aborting a workload and cancelling/aborting all active processing. If there is a use case for such a thing (and I think there is), I think we should expose a new endpoint called `POST /api/v1/abort` or something. Of course this PR is pretty small so if the team feels strongly about stop before start I can find another solution. * [GH-1327] update system tests after /stop changes (#412) [GH-1327] update system tests after /stop changes removing verification that we can stop before starting and no workflows are run. * [GH-1333] Fix NullPointerException in TDR Source (#418) RR: https://broadinstitute.atlassian.net/browse/GH-1333 Caused by database changes not propagated to source code. Destructuring old keys for the source led to null values. * [GH-1336] Fix continual snapshot creation (#419) Our BigQuery query was wrong as we were simply calling `toString` on a java.sql.Timestamp. This led to intervals like `['2021-05-26 15:31:18.287641' '2021-05-26T20:00:13']`. The space caused the problem: BigQuery interpreted that value as a date. The fix is to ensure the lower and upper bounds of our query-between interval are of the same type by coercing java.sql.Timestamp `last_checked` into a java.time.OffsetDateTime. * Bump browserslist from 4.13.0 to 4.16.6 in /ui Bumps [browserslist](https://github.com/browserslist/browserslist) from 4.13.0 to 4.16.6. 
- [Release notes](https://github.com/browserslist/browserslist/releases) - [Changelog](https://github.com/browserslist/browserslist/blob/main/CHANGELOG.md) - [Commits](https://github.com/browserslist/browserslist/compare/4.13.0...4.16.6) Signed-off-by: dependabot[bot] <support@github.com> * [GH-1337] fix create snapshot job always failing RR: https://broadinstitute.atlassian.net/browse/GH-1337 Caused by destructuring the tdr job response with the wrong key. * [GH-1338] Fix wfl not discovering new dataset rows RR: https://broadinstitute.atlassian.net/browse/GH-1338 This isn't a complete fix, sadly, but it at least solves the problem for low-ingest-rate projects like COVID. For now, we only update `last_checked` when we successfully find new rows in the dataset. * [GH-1339] Prevent Snapshot Job IDs from being clobbered with nils RR: https://broadinstitute.atlassian.net/browse/GH-1339 When we update the status of snapshot creation jobs, we only get the job metadata when the job is complete. As a consequence, `nil`s are propagated up the call stack and the database record is clobbered with nils, including the `snapshot_creation_job_id` column. To fix this, always return the metadata and log the failure should one occur. * bump version in develop to 0.8.0 (#413) * [GH-1331] document `/stop` and `/workflows` (#417) [GH-1331] Document `/stop` and `/workflows` endpoints RR: https://broadinstitute.atlassian.net/browse/GH-1331 Updating documentation for changes in v0.7.0: - workload-response no longer returning workflows - `GET /api/v1/workload/{uuid}/workflows` - `POST /api/v1/stop` Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * Add test-covid-workload to parallel test group (#424) … and fix method configuration namespace.
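The [GH-1336] entry above describes an interval whose bounds were rendered in two different formats. Python's `datetime` has an analogous pitfall, so the idea behind the fix can be sketched as below. This is an analogy rather than WFL's Clojure/Java code: `between_interval` and `to_utc` are hypothetical helpers, and treating naive timestamps as UTC is an assumption made for the example.

```python
from datetime import datetime, timezone


def to_utc(ts):
    """Coerce a timestamp to an offset-aware UTC datetime (naive input is assumed UTC)."""
    if ts.tzinfo is None:
        return ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc)


def between_interval(lower, upper):
    """Render both bounds through the same coercion and the same formatter."""
    def fmt(ts):
        return to_utc(ts).isoformat(timespec="seconds")
    return fmt(lower), fmt(upper)


last_checked = datetime(2021, 5, 26, 15, 31, 18, 287641)
now = datetime(2021, 5, 26, 20, 0, 13)

# str() separates date and time with a space, like the problematic lower bound.
print(str(last_checked))  # 2021-05-26 15:31:18.287641
# Formatting both bounds the same way avoids the mixed representation.
print(between_interval(last_checked, now))
```

The design point is the same as in the fix: convert both bounds to one type before formatting, instead of relying on each value's default string conversion.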
* [GH-1341] Fix accepting malformed labels in workload request RR: https://broadinstitute.atlassian.net/browse/GH-1341 Enforce that labels are of the form "name:value" where - name starts with a letter followed by any combination of letters, numbers, underscores and dashes - value is any non-blank string not containing `:` I had to include labels in the batch workload request as the coercion layer got confused when you gave it bad labels. * [GH-1344] Enforce valid email addresses in workload `watchers` * [GH-1348] Remove :pipeline from COVID workload request (#426) RR: https://broadinstitute.atlassian.net/browse/GH-1348 The :pipeline attribute of the workload request is meaningless for COVID-type workloads. Note that these workloads aren't even specific to COVID-19 processing; rather, they are a generalisation. In this PR, I've removed the :pipeline attribute. To do this, all workload operations on maps with a nil :pipeline will be implemented by the functions in the covid namespace. *…
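The label rule stated in the [GH-1341] entry above can be expressed as a single regular expression. The following Python sketch is one possible reading of that rule, not the spec WFL uses: `is_valid_label` is a hypothetical helper, "letters" is assumed to mean ASCII letters, and "non-blank" is assumed to mean "contains at least one non-whitespace character".

```python
import re

# name : a letter, then any mix of letters, digits, underscores and dashes
# value: no ':' anywhere, and at least one non-whitespace character
LABEL = re.compile(r"[A-Za-z][A-Za-z0-9_-]*:[^:]*[^:\s][^:]*")


def is_valid_label(label):
    """Return True when the whole string matches the "name:value" form."""
    return LABEL.fullmatch(label) is not None


for candidate in ["project:covid", "1abc:x", "name:", "name:a:b"]:
    print(candidate, is_valid_label(candidate))
```

Using `fullmatch` rather than `search` matters here: a partial match would accept strings such as `name:a:b`, whose prefix `name:a` is well-formed.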
okotsopoulos added a commit that referenced this pull request Nov 5, 2021
* v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introduced in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Release v0.6.1 Patch (#365) * [GH-1284] Fix docker image generation (#363) The `Dockerfile` is being invoked in the root directory - we don't need to add the parent directory, just the current one. * [GH-1277] Prevent update loop from terminating (#364) Catch all throwables to make sure a workload failure doesn't bring down workflow-launcher. 
* update changelog + bump patch to v0.6.1 * release 0.7.0 - SARSCov2 Illumina Full support In Terra (#440) * Snapshot creation uses datetime rather than date (#355) * Snapshot creation uses datetime rather than date * Making the name of the column used for datetime interval variable Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * bump develop to 0.7.0 (#360) * util/do-or-nil-silently should move to build.clj (#367) * util/do-or-nil-silently should move to build.clj * Fixed indentation to match standard * GH-1226 Support entity set creation when creating submissions. (#353) * First pass: create-submissions generates one submission per entity. * Cleaned up TODOs for first pass attempt, reformatted tests to pass lint step in build. * Second pass: support entity set creation when specifying >1 entity for a submission. Refactored bigquery table dump to pull out now-common code. * Address PR feedback To simplify, will generate an entity set even for the singleton input. Removed firecloud/consolidate-entities-to-set as a result. Need to pass in 'expression' to submission creation payload when specifying an entity set. Supplemented integration and unit tests. * Create an interface all COVID work can build on top of. (#369) * Create an interface all hornets work on top of. * Apply suggestions from code review * Update readme.md Build Boad. * GH-1272: Lists are not imported correctly into a Terra Workspace Entity (#368) * Boad -=> Board * Still doesn't round-trip on doubles and values like "19A". * Unit test wfl.tsv. * Ensure TSV mappability. * Ensure TSV is tabulatable. * Fix bug in assert-mapulatable!. * Add wfl.service/rawls.clj for interacting directly with Rawls API (#370) * Rename rawls/create-snapshot -> create-snapshot-reference (#373) We create the snapshot in the TDR, and link to it in the workspace via Rawls. 
* [GH-1215] Add snapshot_reference_id to Sarscov2IlluminaFull table schema (#375) * Add snapshot_reference_id to Sarscov2IlluminaFull table schema Because schema update hasn't run, altering existing table definition in place. * Documentation changes from local Postgres debugging, onboarding - Clarified order of operations when installing, using local Postgres - Fixed broken intra-doc links - Added docs readme with instruction for launching local documentation site * Add instruction for recreating wfl DB to development docs * Remove references to undefined documentation files * [GH-1293] fix test-create-submissions-for-entity-set (#378) Simply getting the submission again is not sufficient to guarantee that the workflow has been queued for execution. Poll instead. * Update documentation and infrastructure for release candidates (#377) * Update documentation and infrastructure for release candidates - creating release candiates - bashing release candidates Restore version override in cli.py. Re-tag latest docker images in cli.py so that version can be overwritten. Update IMAGES target in Makefile to add the :latest tag. Clean up :latest images on distclean target. * fix whitespace * Bump ssri from 6.0.1 to 6.0.2 in /ui (#382) Bumps [ssri](https://github.com/npm/ssri) from 6.0.1 to 6.0.2. 
- [Release notes](https://github.com/npm/ssri/releases) - [Changelog](https://github.com/npm/ssri/blob/v6.0.2/CHANGELOG.md) - [Commits](https://github.com/npm/ssri/compare/v6.0.1...v6.0.2) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add interactive script ahead of covid-19 demo (#380) * add interactive script ahead of covid-19 demo tomorrow * fix rawls test failure * reformat test * share demo remources with cdc-covid-surveillance firecloud group * corrections from review * format for consistency * clone a dev workspace instead of the prod one * fix test-import-snapshot - workspace changed * fix arrays test failures * [GH-1213] Update DB schema for COVID workload creation (#381) * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table. * Initial creat covid workload commit * Verifies dataset as well * Update DB schema. * Address comments. * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. 
Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Resolve merge conflicts. * Fix issues generated from rebasing. * Fix the failing integration tests. Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Add multimethods for source, executor and sink operations (#383) * Add multimethods for source, executor and sink operations Add skeleton implementations for tdr-source * make use of `!` more consistent? * add create operations for source, executor and sink * [GH-1215] Import snapshots within COVID workload (#376) - Data model change: a snapshot_reference_id is linked to a workspace and should be associated with the executor, not the source (TDR in our use case) - Added covid/get-imported-snapshot-reference - nil or snapshot reference from Rawls for snapshot_reference_id in executor details instance (to be created from TerraExecutorDetails) - Added covid/import-snapshot! - import snapshot to workspace, writing to DB if successful - Added integration tests (incomplete coverage) * [GH-1295] Use Rawls to Import Workflow Outputs (#384) RR: https://broadinstitute.atlassian.net/browse/GH-1295 firecloud's flexibleImportEntities has a size limit on the TSV file you POST. Unfortunately, one sarscov2_illumina_full workflow's outputs exceeds this. We can work around this issue by using Rawl's batchUpsert. This is slightly different as it takes a list of operations on how to construct the entity rather than the entity serialised to TSV. In this PR, I've demonstrated how we can use Rawls to import the workflow's outputs into the workspace. 
I've also updated the demo to do this for a workflow that has already passed. * GH-1285: [COVID] Launch submissions through Rawls, and keep track of them (#372) * Simplify day intervals. * Document wfl.tools.snapshots. * Map name over keyword arguments. * Force the production Rawls. * Checkpoint groups getters. * Patch rebase conflict. * Implement start-covid-workload! kinda. * Checkpoint start-covid-workload! test. * Add covid-workload-request to support unit. * clojure -M:format * Move Rich comment into a unit test. * Respond to comments and tidy up a little * Remove unit test now covered by integration. * Document clojure.test/test-vars to remind me. * Bump url-parse from 1.4.7 to 1.5.1 in /ui (#388) Bumps [url-parse](https://github.com/unshiftio/url-parse) from 1.4.7 to 1.5.1. - [Release notes](https://github.com/unshiftio/url-parse/releases) - [Commits](https://github.com/unshiftio/url-parse/compare/1.4.7...1.5.1) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update db schema again based on lasted discussions. (#387) * Update db schemas. * Make the names more clear. * Address comments. * Switch to SQL syntax from XML. * Address more comments. * one more update. * Bump y18n from 4.0.0 to 4.0.1 in /ui (#352) Bumps [y18n](https://github.com/yargs/y18n) from 4.0.0 to 4.0.1. - [Release notes](https://github.com/yargs/y18n/releases) - [Changelog](https://github.com/yargs/y18n/blob/master/CHANGELOG.md) - [Commits](https://github.com/yargs/y18n/commits) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1301] Notify Workload Watchers when "User Visible" Exceptions Occur (#386) RR: https://broadinstitute.atlassian.net/browse/GH-1301 Introduce wfl.util.UserVisibleException - a new type of exception that we should handle to throw and handle errors that users are meant to see. 
Re-wire the background loop to handle UserVisibleException and also be more exception safe. Start to plumb in email notifications. * Bump hosted-git-info from 2.8.8 to 2.8.9 in /ui (#392) Bumps [hosted-git-info](https://github.com/npm/hosted-git-info) from 2.8.8 to 2.8.9. - [Release notes](https://github.com/npm/hosted-git-info/releases) - [Changelog](https://github.com/npm/hosted-git-info/blob/v2.8.9/CHANGELOG.md) - [Commits](https://github.com/npm/hosted-git-info/compare/v2.8.8...v2.8.9) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump lodash from 4.17.19 to 4.17.21 in /ui (#390) Bumps [lodash](https://github.com/lodash/lodash) from 4.17.19 to 4.17.21. - [Release notes](https://github.com/lodash/lodash/releases) - [Commits](https://github.com/lodash/lodash/compare/4.17.19...4.17.21) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1311]: Create/Load COVID workloads (#389) RR: https://broadinstitute.atlassian.net/browse/GH-1311 Add support for loading and creating COVID workloads Add PR test hook for database/ directory Tweak database schema to make it a bit easier to create workloads * [GH-1313] TerraWorkspaceSink/update-sink! (#394) * Add implementation for TerraWorkspaceSink/update-sink! Rework temporary-postresql-database fixture to override wfl-db-config * integration test passes * add make-queue-from-list to be clearer * fix test * use max(id) + 1 for new output row instead of table length + 1 * [GH-1215] COVID TerraExecutor/update-executor! Added update-terra-executor and required helpers: - coerce an available snapshot from source queue to a snapshot reference - add entry point for submission creation - write new workflows to executor details instance Refined queue peek and pop for source and executor. 
Notably, queue peeks return a meaningful object (snapshot for source, workflow for executor) rather than a database record. Removed now-deprecated code from earlier update loop design. Added integration test. * [GH-1306] Add `workflows` multimethod for workloads (#396) RR: https://broadinstitute.atlassian.net/browse/GH-1306 Add workflows multimethod Change all locations where workflows are accessed via the :workflows keyword to use this function. Implement workflows for COVID I've added a simple test to make sure the SQL is correct Awaiting executor implementation to test with more than 0 workflows. * [GH-1222] Check for new work in TDR (#371) * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. 
* [GH-1327] Fix Start After Stop (#411) [GH-1327] Fix Start After Stop In this change I'm fixing the strange behaviour where we can start a covid workload after it was stopped. To do this, I've made it illegal to stop a workload before it's been started. I've done this because stop before start seems strange to me too. The purpose of stop was to stop getting new data from the source but allow all data in the workload pipeline to get flushed (i.e. consume all created snapshots and write all workflows back). Stop was not meant for aborting a workload and cancelling/aborting all active processing. If there is a use case for such a thing (and I think there is), I think we should expose a new endpoint called `POST /api/v1/abort` or something. Of course this PR is pretty small, so if the team feels strongly about stop before start I can find another solution. * [GH-1327] update system tests after /stop changes (#412) [GH-1327] update system tests after /stop changes Removing verification that we can stop before starting and no workflows are run. * [GH-1333] Fix NullPointerException in TDR Source (#418) RR: https://broadinstitute.atlassian.net/browse/GH-1333 Caused by database changes not propagated to source code. Destructuring old keys for the source led to null values. * [GH-1336] Fix continual snapshot creation (#419) Our BigQuery query was wrong as we were simply calling `toString` on a java.sql.Timestamp. This led to intervals like `['2021-05-26 15:31:18.287641' '2021-05-26T20:00:13']`. The space was causing the problem and BigQuery interpreted that as a date. The fix is to ensure the lower and upper bounds of our query-between interval are of the same type by coercing java.sql.Timestamp `last_checked` into a java.time.OffsetDateTime. * [GH-1337] fix create snapshot job always failing RR: https://broadinstitute.atlassian.net/browse/GH-1337 Caused by destructuring the tdr job response with the wrong key. 
* [GH-1338] Fix wfl not discovering new dataset rows RR: https://broadinstitute.atlassian.net/browse/GH-1338 Sadly, this isn't a complete fix, but it at least solves the problem for low ingest rate projects like COVID. For now, we only update `last_checked` when we successfully find new rows in the dataset. * [GH-1339] Prevent Snapshot Job IDs from being clobbered with nils RR: https://broadinstitute.atlassian.net/browse/GH-1339 When we update the status of snapshot creation jobs, we only get the job metadata when the job is complete. As a consequence, `nil`s are propagated up the call stack and the database record is clobbered with nils, including the `snapshot_creation_job_id` column. To fix this, always return the metadata and log the failure should one occur. * [GH-1331] document `/stop` and `/workflows` (#417) [GH-1331] Document `/stop` and `/workflows` endpoints RR: https://broadinstitute.atlassian.net/browse/GH-1331 Updating documentation for changes in v0.7.0: - workload-response no longer returning workflows - `GET /api/v1/workload/{uuid}/workflows` - `POST /api/v1/stop` Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * Add test-covid-workload to parallel test group (#424) … and fix method configuration namespace. * [GH-1341] Fix accepting malformed labels in workload request RR: https://broadinstitute.atlassian.net/browse/GH-1341 Enforce that labels are of the form "name:value" where - name starts with a letter followed by any combination of letters, numbers, underscores and dashes - value is any non-blank string not containing `:` I had to include labels in the batch workload request as the coercion layer got confused when you gave it bad labels. * [GH-1344] Enforce valid email addresses in workload `watchers` * [GH-1348] Remove :pipeline from COVID workload request (#426) RR: https://broadinstitute.atlassian.net/browse/GH-1348 The :pipeline attribute of the workload request is meaningless for COVID-type workloads. 
Note that these workloads aren't even specific to COVID-19 processing, but rather a generalisation. In this PR, I've removed the :pipeline attribute. To do this, all workload operations on maps with a nil :pipeline will be implemented by the functions in the covid namespace. * [GH-1350] fix Terra Workspace sink validation (#427) RR: https://broadinstitute.atlassian.net/browse/GH-1350 The "Terra Workspace" sink validation did not validate that the attributes listed in fromOutputs exist in the specified entity type in the workspace. This PR fixes this. * Bump dns-packet from 1.3.1 to 1.3.4 in /ui Bumps [dns-packet](https://github.com/mafintosh/dns-packet) from 1.3.1 to 1.3.4. - [Release notes](https://github.com/mafintosh/dns-packet/releases) - [Changelog](https://github.com/mafintosh/dns-packet/blob/master/CHANGELOG.md) - [Commits](https://github.com/mafintosh/dns-packet/compare/v1.3.1...v1.3.4) Signed-off-by: dependabot[bot] <support@github.com> * [GH-1325] Get or import snapshot (#422) Within rawls namespace: - get-snapshot-references: lazily paginate through all snapshot references in a workspace. - create-or-get-snapshot-reference: if importing the snapshot fails with a 409, find the first existing snapshot reference matching the snapshot id. * [GH-1354] Enforce `snapshotReaders` are email addresses (#428) RR: https://broadinstitute.atlassian.net/browse/GH-1354 This guards against erroring while creating snapshots if a user makes a typo. * [GH-1356] Fix Coercion Failure for TDR Snapshots Source (#429) RR: https://broadinstitute.atlassian.net/browse/GH-1356 Caused by https://github.com/metosin/reitit/issues/494 Fix as per https://broadinstitute.atlassian.net/browse/GH-1348 Included integration test for reitit coercions to catch this sooner. 
* [GH-1357] Fix failure to sink workflow outputs (#431) RR: https://broadinstitute.atlassian.net/browse/GH-1357 I assumed (wrongly) that the way Rawls represented workflows in a submission was the same as how Cromwell represented workflows. I also mocked these representations incorrectly in tests. To fix this, only use the submission to fetch workflow IDs, then use firecloud's workflow and workflow/outputs endpoints so we don't have to adapt/mock multiple data models. * [GH-1359] Overwrite Workspace Entity On Sink (#432) RR: https://broadinstitute.atlassian.net/browse/GH-1359 When sinking a resubmitted workflow, the TerraWorkspaceSink would append to the existing entity in the workspace. This leaves the entity in an even worse state than the one that prompted the reanalysis. Our fix (as agreed with cloreth) is to clobber the entity in the workspace by deleting the entity if it exists, then upserting the new one. * Fix broken requirement install command in docs readme (#434) * Move docstring and document when true. (#436) The PR check fails in the test that Ed excludes on another branch. * [GH-1361] Disable Failing Tests for v0.7.0 Release (#435) RR: https://broadinstitute.atlassian.net/browse/GH-1361 This is a low risk change for v0.7.0 as this test does not exercise reachable product code. Disabling: - wfl.integration.modules.arrays-test/test-update-arrays-workload! * [GH-1363] Fix Sporadic Firecloud Test Failures (#437) RR: https://broadinstitute.atlassian.net/browse/GH-1363 Caused by firecloud returning "Launching" as a workflow status - we were testing that the status was in #{"Queued" "Submitted"}. Fix util/poll so logical FALSE can be returned from the action. * Bump ws from 6.2.1 to 6.2.2 in /ui (#433) Bumps [ws](https://github.com/websockets/ws) from 6.2.1 to 6.2.2. - [Release notes](https://github.com/websockets/ws/releases) - [Commits](https://github.com/websockets/ws/commits) --- updated-dependencies: - dependency-name: ws dependency-type: indirect ... 
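The interval problem from the GH-1336 entry can be reproduced in miniature. The sketch below is a Python analogue, not the project's Clojure fix (which coerces a java.sql.Timestamp into a java.time.OffsetDateTime); the function names are hypothetical, and treating naive timestamps as UTC is an assumption made only for this example. The point is that both bounds pass through the same formatter, so they cannot disagree on the separator.

```python
from datetime import datetime, timezone
from typing import Tuple


def to_query_bound(moment: datetime) -> str:
    """Render a bound as ISO-8601 with a 'T' separator and an explicit offset."""
    if moment.tzinfo is None:
        # Assumption for this sketch: naive timestamps are in UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.isoformat(timespec="seconds")


def query_interval(last_checked: datetime, now: datetime) -> Tuple[str, str]:
    """Return (lower, upper) bounds, both formatted by the same function."""
    return to_query_bound(last_checked), to_query_bound(now)


last_checked = datetime(2021, 5, 26, 15, 31, 18, 287641)
now = datetime(2021, 5, 26, 20, 0, 13)

# str() on a datetime uses a space separator, much like the bad lower bound.
print(str(last_checked))
# Both bounds below share one representation.
print(query_interval(last_checked, now))
```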
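The get-or-import behaviour described in the GH-1325 entry can be sketched as below. This is an illustrative Python model, not the rawls namespace itself: `FakeWorkspace`, `Conflict`, the page size and the dictionary keys are all hypothetical, and the sketch assumes the 409 arises from a reference name that already exists.

```python
from typing import Dict, Iterator, List


class Conflict(Exception):
    """Stands in for an HTTP 409 response in this sketch."""


class FakeWorkspace:
    """In-memory stand-in for a workspace's snapshot references."""

    def __init__(self, page_size: int = 2):
        self.references: List[Dict[str, str]] = []
        self.page_size = page_size

    def list_page(self, offset: int) -> List[Dict[str, str]]:
        return self.references[offset:offset + self.page_size]

    def import_snapshot(self, snapshot_id: str, name: str) -> Dict[str, str]:
        # Assumed cause of the conflict: the reference name is already taken.
        if any(ref["name"] == name for ref in self.references):
            raise Conflict(name)
        reference = {"name": name, "snapshotId": snapshot_id}
        self.references.append(reference)
        return reference


def get_snapshot_references(workspace: FakeWorkspace) -> Iterator[Dict[str, str]]:
    """Lazily yield references, fetching one page at a time."""
    offset = 0
    while True:
        page = workspace.list_page(offset)
        if not page:
            return
        yield from page
        offset += len(page)


def create_or_get_snapshot_reference(workspace: FakeWorkspace,
                                     snapshot_id: str,
                                     name: str) -> Dict[str, str]:
    try:
        return workspace.import_snapshot(snapshot_id, name)
    except Conflict:
        # Fall back to the first existing reference for this snapshot id.
        for reference in get_snapshot_references(workspace):
            if reference["snapshotId"] == snapshot_id:
                return reference
        raise  # The conflict was not caused by this snapshot.
```

Because the listing is a generator, the fallback stops requesting pages as soon as it finds a match.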
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * update changelog for v0.7.0 Co-authored-by: Rhian Anthony <rhian.anthony@gmail.com> Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Olivia Kotsopoulos <okotsopo@broadinstitute.org> Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Rex <rexwangcc@gmail.com> * release v0.7.1 - Bug Fixes (#443) * bump for v0.7.1 patch * GH-1375: Fix AoU nightly test: GET /api/v1/workload fails with spec error from invalid creator (#442) * Don't spec the batch creator as an email address. * update changelog for v0.7.1 Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Release/0.8.0 rc (#506) * Snapshot creation uses datetime rather than date (#355) * Snapshot creation uses datetime rather than date * Making the name of the column used for datetime interval variable Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. 
Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * bump develop to 0.7.0 (#360) * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md in develop (#362) cherry-pick from main * [GH-1284] Fix docker image generation (#363) The `Dockerfile` is being invoked in the root directory - we don't need to add the parent directory, just the current one. * [GH-1277] Prevent update loop from terminating (#364) Catch all throwables to make sure a workload failure doesn't bring down workflow-launcher. * util/do-or-nil-silently should move to build.clj (#367) * util/do-or-nil-silently should move to build.clj * Fixed indentation to match standard * GH-1226 Support entity set creation when creating submissions. (#353) * First pass: create-submissions generates one submission per entity. * Cleaned up TODOs for first pass attempt, reformatted tests to pass lint step in build. * Second pass: support entity set creation when specifying >1 entity for a submission. Refactored bigquery table dump to pull out now-common code. * Address PR feedback To simplify, will generate an entity set even for the singleton input. Removed firecloud/consolidate-entities-to-set as a result. Need to pass in 'expression' to submission creation payload when specifying an entity set. Supplemented integration and unit tests. * Merge v0.6.1 into develop (#366) * Create an interface all COVID work can build on top of. (#369) * Create an interface all hornets work on top of. * Apply suggestions from code review * Update readme.md Build Boad. * GH-1272: Lists are not imported correctly into a Terra Workspace Entity (#368) * Boad -=> Board * Still doesn't round-trip on doubles and values like "19A". * Unit test wfl.tsv. * Ensure TSV mappability. * Ensure TSV is tabulatable. 
* Fix bug in assert-mapulatable!. * Add wfl.service/rawls.clj for interacting directly with Rawls API (#370) * Rename rawls/create-snapshot -> create-snapshot-reference (#373) We create the snapshot in the TDR, and link to it in the workspace via Rawls. * [GH-1215] Add snapshot_reference_id to Sarscov2IlluminaFull table schema (#375) * Add snapshot_reference_id to Sarscov2IlluminaFull table schema Because schema update hasn't run, altering existing table definition in place. * Documentation changes from local Postgres debugging, onboarding - Clarified order of operations when installing, using local Postgres - Fixed broken intra-doc links - Added docs readme with instruction for launching local documentation site * Add instruction for recreating wfl DB to development docs * Remove references to undefined documentation files * [GH-1293] fix test-create-submissions-for-entity-set (#378) Simply getting the submission again is not sufficient to guarantee that the workflow has been queued for execution. Poll instead. * Update documentation and infrastructure for release candidates (#377) * Update documentation and infrastructure for release candidates - creating release candiates - bashing release candidates Restore version override in cli.py. Re-tag latest docker images in cli.py so that version can be overwritten. Update IMAGES target in Makefile to add the :latest tag. Clean up :latest images on distclean target. * fix whitespace * Bump ssri from 6.0.1 to 6.0.2 in /ui (#382) Bumps [ssri](https://github.com/npm/ssri) from 6.0.1 to 6.0.2. 
- [Release notes](https://github.com/npm/ssri/releases) - [Changelog](https://github.com/npm/ssri/blob/v6.0.2/CHANGELOG.md) - [Commits](https://github.com/npm/ssri/compare/v6.0.1...v6.0.2) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add interactive script ahead of covid-19 demo (#380) * add interactive script ahead of covid-19 demo tomorrow * fix rawls test failure * reformat test * share demo remources with cdc-covid-surveillance firecloud group * corrections from review * format for consistency * clone a dev workspace instead of the prod one * fix test-import-snapshot - workspace changed * fix arrays test failures * [GH-1213] Update DB schema for COVID workload creation (#381) * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table. * Initial creat covid workload commit * Verifies dataset as well * Update DB schema. * Address comments. * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. 
Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Resolve merge conflicts. * Fix issues generated from rebasing. * Fix the failing integration tests. Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Add multimethods for source, executor and sink operations (#383) * Add multimethods for source, executor and sink operations Add skeleton implementations for tdr-source * make use of `!` more consistent? * add create operations for source, executor and sink * [GH-1215] Import snapshots within COVID workload (#376) - Data model change: a snapshot_reference_id is linked to a workspace and should be associated with the executor, not the source (TDR in our use case) - Added covid/get-imported-snapshot-reference - nil or snapshot reference from Rawls for snapshot_reference_id in executor details instance (to be created from TerraExecutorDetails) - Added covid/import-snapshot! - import snapshot to workspace, writing to DB if successful - Added integration tests (incomplete coverage) * [GH-1295] Use Rawls to Import Workflow Outputs (#384) RR: https://broadinstitute.atlassian.net/browse/GH-1295 firecloud's flexibleImportEntities has a size limit on the TSV file you POST. Unfortunately, one sarscov2_illumina_full workflow's outputs exceeds this. We can work around this issue by using Rawl's batchUpsert. This is slightly different as it takes a list of operations on how to construct the entity rather than the entity serialised to TSV. In this PR, I've demonstrated how we can use Rawls to import the workflow's outputs into the workspace. 
I've also updated the demo to do this for a workflow that has already passed. * GH-1285: [COVID] Launch submissions through Rawls, and keep track of them (#372) * Simplify day intervals. * Document wfl.tools.snapshots. * Map name over keyword arguments. * Force the production Rawls. * Checkpoint groups getters. * Patch rebase conflict. * Implement start-covid-workload! kinda. * Checkpoint start-covid-workload! test. * Add covid-workload-request to support unit. * clojure -M:format * Move Rich comment into a unit test. * Respond to comments and tidy up a little * Remove unit test now covered by integration. * Document clojure.test/test-vars to remind me. * Bump url-parse from 1.4.7 to 1.5.1 in /ui (#388) Bumps [url-parse](https://github.com/unshiftio/url-parse) from 1.4.7 to 1.5.1. - [Release notes](https://github.com/unshiftio/url-parse/releases) - [Commits](https://github.com/unshiftio/url-parse/compare/1.4.7...1.5.1) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update db schema again based on lasted discussions. (#387) * Update db schemas. * Make the names more clear. * Address comments. * Switch to SQL syntax from XML. * Address more comments. * one more update. * Bump y18n from 4.0.0 to 4.0.1 in /ui (#352) Bumps [y18n](https://github.com/yargs/y18n) from 4.0.0 to 4.0.1. - [Release notes](https://github.com/yargs/y18n/releases) - [Changelog](https://github.com/yargs/y18n/blob/master/CHANGELOG.md) - [Commits](https://github.com/yargs/y18n/commits) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1301] Notify Workload Watchers when "User Visible" Exceptions Occur (#386) RR: https://broadinstitute.atlassian.net/browse/GH-1301 Introduce wfl.util.UserVisibleException - a new type of exception that we should handle to throw and handle errors that users are meant to see. 
Re-wire the background loop to handle UserVisibleException and also be more exception safe. Start to plumb in email notifications. * Bump hosted-git-info from 2.8.8 to 2.8.9 in /ui (#392) Bumps [hosted-git-info](https://github.com/npm/hosted-git-info) from 2.8.8 to 2.8.9. - [Release notes](https://github.com/npm/hosted-git-info/releases) - [Changelog](https://github.com/npm/hosted-git-info/blob/v2.8.9/CHANGELOG.md) - [Commits](https://github.com/npm/hosted-git-info/compare/v2.8.8...v2.8.9) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump lodash from 4.17.19 to 4.17.21 in /ui (#390) Bumps [lodash](https://github.com/lodash/lodash) from 4.17.19 to 4.17.21. - [Release notes](https://github.com/lodash/lodash/releases) - [Commits](https://github.com/lodash/lodash/compare/4.17.19...4.17.21) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1311]: Create/Load COVID workloads (#389) RR: https://broadinstitute.atlassian.net/browse/GH-1311 Add support for loading and creating COVID workloads Add PR test hook for database/ directory Tweak database schema to make it a bit easier to create workloads * [GH-1313] TerraWorkspaceSink/update-sink! (#394) * Add implementation for TerraWorkspaceSink/update-sink! Rework temporary-postresql-database fixture to override wfl-db-config * integration test passes * add make-queue-from-list to be clearer * fix test * use max(id) + 1 for new output row instead of table length + 1 * [GH-1215] COVID TerraExecutor/update-executor! Added update-terra-executor and required helpers: - coerce an available snapshot from source queue to a snapshot reference - add entry point for submission creation - write new workflows to executor details instance Refined queue peek and pop for source and executor. 
Notably, queue peeks return a meaningful object (snapshot for source, workflow for executor) rather than a database record. Removed now-deprecated code from earlier update loop design. Added integration test. * [GH-1306] Add `workflows` multimethod for workloads (#396) RR: https://broadinstitute.atlassian.net/browse/GH-1306 Add workflows multimethod Change all locations where workflows are accessed via the :workflows keyword to use this function. Implement workflows for COVID I've added a simple test to make sure the SQL is correct Awaiting executor implementation to test with more than 0 workflows. * [GH-1222] Check for new work in TDR (#371) * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. 
* Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Release v0.6.1 Patch (#365) * [GH-1284] Fix docker image generation (#363) The `Dockerfile` is being invoked in the root directory - we don't need to add the parent directory, just the current one. * [GH-1277] Prevent update loop from terminating (#364) Catch all throwables to make sure a workload failure doesn't bring down workflow-launcher. * update changelog + bump patch to v0.6.1 * Remove outdated test code. * Remove unused utilities code. * Checkpoint. * Bump timeout for system tests, as it takes longer to finish now. * Update docs to include how to override envs for local testing. * Commit what I had before surgery. * Wrap up. * Add a protocol extension for psql array. * Update DB schema to store row ids for TDR. * Also store TDR row-ids and query interval info in DB. * Move protocol test to integration. * Update utc-now util also randomize the snapshot name with request datetime. * Resolve merge conflicts. * Fix weirdness during rebasing... * Move the logic into the multimethod interface. * Take out the utc-now util function. * Move protocol test to jdbc_test and reformat code. * Remove duplicate code. * Remove comment blocks. * Fix broken MD during rebasing. * Crazy formats. * Fix a bug with the snapshot naming. * Add the snapshot_id back to DB schema. * Include job-status in Source Details table. * update TDR jobs that are still running as part of source update loop. * Fix timestamp formatting. * Exclude the problematic test. * Fix a hiccup. * Fix formatting and a small issue with job-result from TDR. * Minor fixes around covid module. * Add unit and integration tests. * Lint. * Let DB control the primary key auto increment. * Fix the issues identified by the tests. * Fix an issue with the jdbc protocol test. * Update unit tests. 
* Call the multi-method in covid test to make sure everything is plumbed correctly. * Log when TDR snapshot creation job failed. * Address comments. * Increase the interval of system test polling. Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * [GH-1314] expose workflows via `GET /api/v1/workload/{uuid}/workflows` (#397) RR: https://broadinstitute.atlassian.net/browse/GH-1314 Adding a separate workflows selector will allow us to decouple the workload representation from serialisation artefacts that aren’t useful for a user. In this PR, I'm changing the workload data model such that `GET /api/v1/workloads` only returns workload metadata (everything but the workflows) and adding `GET /api/v1/workloads/{uuid}/workflows` to fetch the workflows in the workload. * [GH-1317] start the workload source (#399) RR: https://broadinstitute.atlassian.net/browse/GH-1317 The source of the workload needs to be told that the workload has been started so it can start looking in the datarepo for new data. I've added a new multi-method for the source to start it, called on start-workload!. This writes last_checked to be the time when we should start listening for new data. * TerraExecutor/update-executor should update active, failed workflow statuses (#398) 1. Within a single tx, fetch executor details workflow records with active or failed status. 2. For each record, use Firecloud to fetch up-to-date workflow object. Update the record with the corresponding new workflow status. 3. Within a single tx, write new record statuses to DB. Updated integration tests. * GH-1307 GH-1312 Update method configuration using snapshot reference (#402) In new method `update-method-configuration!`: 1. GET the method configuration from Firecloud 2. If the Firecloud-derived mc version doesn't match our DB record, throw 3. 
Update the method configuration with the snapshot reference name and POST to Firecloud 4. Increment our DB record's mc version. Supplemented integration tests. * [GH-1322] TDR Snapshots Source (#403) RR: https://broadinstitute.atlassian.net/browse/GH-1322 Adding a new source to workflow-launcher that's a list of snapshots. * [GH-1318] Stop Workload Source (#400) RR: https://broadinstitute.atlassian.net/browse/GH-1318 Add `stop-source` multimethod and implement it for the TerraDataRepoSource. When stopped, the source will stop looking for new rows to snapshot in the TDR. Also added logic for when the workload is finished. Once started, the workload is finished when it has been stopped and both source and executor queues are empty. To do this, I added another multimethod to return the length of the queues. For the executor, the length of the queue is the number of workflows that have not been consumed or aborted. For the source, the length of the queue is the number of unconsumed records whose snapshot job is not 'failed'. Note that this is slightly at odds with peek, where peek is meant to return objects that are ready to be consumed. * GH-1216: [TDR New Work Detection] Extend System Tests for COVID processing (#395) * Update vault to work around NoClassDefFoundError. * Checkpoint failing with PSQLException: FATAL: database ... does not exist database "wfltest147f1cbf976046e68e6beb5eef728674" does not exist * Spec validation still losing sink and source keys. * Update most dependencies. * Restore batch or covid request. * Checkpoint "working" spec.clj. * Lift strip-internals out of endpoint API transaction. * New system test passing again. * Use Ed's strip-internals fix. * clojure -M:format * Respond to comments but now there is no :name in response. * clojure -M:format * Make ::name optional for now. 
* [GH-1320] add `snapshotReaders` to tdr source (#405) RR: https://broadinstitute.atlassian.net/browse/GH-1320 Adding a list of people who want to access to the snapshots we create to the source request. * Remove liquibase-core from build dependencies. (#407) * [GH-1213] Create covid workload verifications (#379) * Update the specs. * Commit pair programming results. * Commit updated implementation (WIP). * organizing the keyworrds in spec.clj * It seems spec/or requires ky-value pairs! Fix it. * Added start of workflow-request for create-covid-workload function * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table. * We cannot use table as col name in SQL. * Checkpoint. * More. * Initial creat covid workload commit * Verifies dataset as well * Verifications of the source, executor and sink * More verification of sink and executor * Verification of source, sink, executor pre review * Fixing broken test and docstrings instead of comments * Update api/src/wfl/service/rawls.clj * Fix and rewrite the test. * Fix the test part II. * Adding more tests for the create-covid-workload and adding messages for the cause of a thrown exception * Updated tests and create covid workload verifications * removing commente3d out code * removing get-workspace function from rawls code since we use firecloud to make this call * Adding lint changes * split up tests. Added verification of dataset table and columns * Lint changes * Fixing merge conflict and lint issue * Fixing import * Updated to base off of develop changes * Adding validation to create workflow logic * Updates to test, spec and create method * Working validation tests * linted. 
Also removed old Liquibase files no longer used * cleanup of testing namespace * removing try/catch since in verify source function * Moving validation to top of function * Moving docstrings for two functions * Fixing two broken integration tests for covid * Using variables bound in the TDR section for validation * Moving some code within file to group liike code together * Fixed all integration tests. Reorganized location of workload methods in covid module * Fixed all integration tests. Reorganized location of workload methods in covid module * Removing extraneous description for integration tests relating to create workload method * Removing extraneous description for integration tests relating to create workload method * linted * Converted verification methods to multimethods dispatched off name * Updated name of validate source, executor, sink functions * Fixed merge conflicts and linted * removing unnecessary changeset * reverted spec.clj to develop branch. Also cleaned up some cruft * reverted spec.clj to develop branch. Also cleaned up some cruft * Making get-workspace-json function private and adding a public function which checks the workload. Also cleaned up the create verification methods and removed some unnecessary variables. 
Also moved source/sink/executor verification functions to the source/sink/executor sections * Adding default implementations for the create validation functions * linted * fixing firecloud method and tests * Fixing merge conflicts with develop branch * Adding flags for skipping validation of source, executor and sink in requests (#404) * Adding flags for skipping validation to source, executor and sink * linted * Fixing merge conflict Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * linting fixes * linted and fixes some nits * Fixing nits * Fixing nits * lint fixes Co-authored-by: Rex <rexwangcc@gmail.com> Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * GH-1316 Create submission within TerraExecutor update (#406) GH-1316 Create submission within TerraExecutor update Fully implemented `create-submission!` stub, pulling in existing `update-method-configuration!`. Simplified code which previously passed around updated executor with incremented method configuration version. Docstrings converted to use backticked parameter names to preserve camel case. Expanded integration testing. * [GH-1323] to-edn (#409) RR: https://broadinstitute.atlassian.net/browse/GH-1323 https://broadinstitute.atlassian.net/browse/GH-1321 Added util/to-edn a general :type dispatch function to coerce an object into a user-friendly EDN view that gets returned to the user. Allow terra executor to update details table with workflow IDs between update calls Introduce `done?` to test when the workload should be marked `:finished` Fetch workflow inputs/outputs from firecloud and transform them into a more workable form Fix jdbc double exaludations * GH-1298: TerraWorkspaceSink - verify target entity columns exist (#410) * Worried that :missing will get swamped by :attributes though. 
* [GH-1327] Fix Start After Stop (#411) [GH-1327] Fix Start After Stop In this change I'm fixing the strange behaviour where we can start a covid workload after it was stopped. To do this, I've made it illegal to stop a workload before it's been started. I've done this because stop before start seems strange to me too. The purpose of stop was to stop getting new data from the source but allow all data in the workload pipeline to get flushed (i.e. consume all created snapshots and write all workflows back). Stop was not meant for aborting a workload and cancelling/aborting all active processing. If there is a use case for such a thing (and I think there is), I think we should expose a new endpoint called `POST /api/v1/abort` or something. Of course this PR is pretty small so if the team feels strongly about stop before start I can find another solution. * [GH-1327] update system tests after /stop changes (#412) [GH-1327] update system tests after /stop changes removing verification that we can stop before starting and no workflows are run. * [GH-1333] Fix NullPointerException in TDR Source (#418) RR: https://broadinstitute.atlassian.net/browse/GH-1333 Caused by database changes not propagated to source code. Destructuring old keys for the source led to null values. * [GH-1336] Fix continual snapshot creation (#419) Our BigQuery query was wrong as we were simply calling `toString` on a java.sql.Timestamp. This led to intervals like `['2021-05-26 15:31:18.287641' '2021-05-26T20:00:13']` The space was causing the problem and BigQuery interpreted that as a date. The fix is to ensure the lower and upper bounds of our query-between interval are of the same type by coercing java.sql.Timestamp `last_checked` into a java.time.OffsetDateTime. * Bump browserslist from 4.13.0 to 4.16.6 in /ui Bumps [browserslist](https://github.com/browserslist/browserslist) from 4.13.0 to 4.16.6. 
- [Release notes](https://github.com/browserslist/browserslist/releases) - [Changelog](https://github.com/browserslist/browserslist/blob/main/CHANGELOG.md) - [Commits](https://github.com/browserslist/browserslist/compare/4.13.0...4.16.6) Signed-off-by: dependabot[bot] <support@github.com> * [GH-1337] fix create snapshot job always failing RR: https://broadinstitute.atlassian.net/browse/GH-1337 Caused by destructuring the tdr job response with the wrong key. * [GH-1338] Fix wfl not discovering new dataset rows RR: https://broadinstitute.atlassian.net/browse/GH-1338 This isn't a complete fix sadly but it at least solves the problem for low ingest rate projects like COVID. For now, we only update `last_checked` when we successfully find new rows in the dataset. * [GH-1339] Prevent Snapshot Job IDs from being clobbered with nils RR: https://broadinstitute.atlassian.net/browse/GH-1339 When we update the status of snapshot creation jobs, we only get the job metadata when the job is complete. As a consequence, `nil`s are propagated up the call stack and the database record is clobbered with nils, including the `snapshot_creation_job_id` column. To fix this, always return the metadata and log the failure should one occur. * bump version in develop to 0.8.0 (#413) * [GH-1331] document `/stop` and `/workflows` (#417) [GH-1331] Document `/stop` and `/workflows` endpoints RR: https://broadinstitute.atlassian.net/browse/GH-1331 Updating documentation for changes in v0.7.0: - workload-response no longer returning workflows - `GET /api/v1/workload/{uuid}/workflows` - `POST /api/v1/stop` Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * Add test-covid-workload to parallel test group (#424) … and fix method configuration namespace. 
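The GH-1339 fix above ("always return the metadata and log the failure should one occur") can be sketched as follows. This is a Python illustration under assumptions, not the project's Clojure code: `fetch_job_metadata` is a hypothetical stand-in for the TDR job lookup, and the keys `id` and `job_status` are invented for the example.

```python
import logging

logger = logging.getLogger(__name__)


def check_snapshot_job(fetch_job_metadata, job_id):
    """Return the job metadata whatever the job's state, logging failures."""
    metadata = fetch_job_metadata(job_id)  # hypothetical TDR call
    if metadata.get("job_status") == "failed":
        logger.warning("Snapshot creation job %s failed", job_id)
    # Returning the metadata unconditionally means callers never see None
    # for a job that is still running or has failed.
    return metadata


def update_record(record, metadata):
    """Write job fields back to a record; no None can leak in from a
    'job not complete yet' branch because metadata is always present."""
    return {
        **record,
        "snapshot_creation_job_id": metadata["id"],
        "snapshot_creation_job_status": metadata["job_status"],
    }
```

The design point is that the status check and the record update share one non-optional value, so an incomplete job changes the status column only and leaves the job ID intact.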
* [GH-1341] Fix accepting malformed labels in workload request RR: https://broadinstitute.atlassian.net/browse/GH-1341 Enforce that labels are of the form "name:value" where - name starts with a letter followed by any combination of letters, numbers, underscores and dashes - value is any non-blank string not containing `:` I had to include labels in the batch workload request as the coercion layer got confused when you gave it bad labels. * [GH-1344] Enforce valid email addresses in workload `watchers` * [GH-1348] Remove :pipeline from COVID workload request (#426) RR: https://broadinstitute.atlassian.net/browse/GH-1348 The :pipeline attribute of the workload request is meaningless for COVID-type workloads. Note that these workloads aren't even specific to COVID-19 processing, but rather a generalisation. In this PR, I've removed the :pipeline attribute. To do this, all workload operations on maps with a nil :pipeline will be implemented by the functions in the covid namespace. *…
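The "name:value" label rule from GH-1341 above maps naturally onto a regular expression. A minimal Python sketch of one reading of that rule; the pattern and the function name `is_valid_label` are illustrative and are not taken from the workflow-launcher source.

```python
import re

# Assumed reading of the rule:
#   name  = a letter, then any mix of letters, digits, underscores, dashes
#   value = a non-blank string (at least one non-whitespace character)
#           that contains no ':'
LABEL_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]*:[^:]*[^:\s][^:]*")


def is_valid_label(label):
    """True when the whole string matches the name:value shape."""
    return LABEL_PATTERN.fullmatch(label) is not None
```

Under this reading `project:covid` is accepted, while `9lives:x` (name starts with a digit), `name:` (blank value) and `a:b:c` (colon in the value) are rejected.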
tbl3rd
added a commit
that referenced
this pull request
Apr 7, 2022
* Snapshot creation uses datetime rather than date (#355) * Snapshot creation uses datetime rather than date * Making the name of the column used for datetime interval variable Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * bump develop to 0.7.0 (#360) * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md in develop (#362) cherry-pick from main * [GH-1284] Fix docker image generation (#363) The `Dockerfile` is being invoked in the root directory - we don't need to add the parent directory, just the current one. * [GH-1277] Prevent update loop from terminating (#364) Catch all throwables to make sure a workload failure doesn't bring down workflow-launcher. * util/do-or-nil-silently should move to build.clj (#367) * util/do-or-nil-silently should move to build.clj * Fixed indentation to match standard * GH-1226 Support entity set creation when creating submissions. 
(#353) * First pass: create-submissions generates one submission per entity. * Cleaned up TODOs for first pass attempt, reformatted tests to pass lint step in build. * Second pass: support entity set creation when specifying >1 entity for a submission. Refactored bigquery table dump to pull out now-common code. * Address PR feedback To simplify, will generate an entity set even for the singleton input. Removed firecloud/consolidate-entities-to-set as a result. Need to pass in 'expression' to submission creation payload when specifying an entity set. Supplemented integration and unit tests. * Merge v0.6.1 into develop (#366) * Create an interface all COVID work can build on top of. (#369) * Create an interface all hornets work on top of. * Apply suggestions from code review * Update readme.md Build Boad. * GH-1272: Lists are not imported correctly into a Terra Workspace Entity (#368) * Boad -=> Board * Still doesn't round-trip on doubles and values like "19A". * Unit test wfl.tsv. * Ensure TSV mappability. * Ensure TSV is tabulatable. * Fix bug in assert-mapulatable!. * Add wfl.service/rawls.clj for interacting directly with Rawls API (#370) * Rename rawls/create-snapshot -> create-snapshot-reference (#373) We create the snapshot in the TDR, and link to it in the workspace via Rawls. * [GH-1215] Add snapshot_reference_id to Sarscov2IlluminaFull table schema (#375) * Add snapshot_reference_id to Sarscov2IlluminaFull table schema Because schema update hasn't run, altering existing table definition in place. 
* Documentation changes from local Postgres debugging, onboarding - Clarified order of operations when installing, using local Postgres - Fixed broken intra-doc links - Added docs readme with instruction for launching local documentation site * Add instruction for recreating wfl DB to development docs * Remove references to undefined documentation files * [GH-1293] fix test-create-submissions-for-entity-set (#378) Simply getting the submission again is not sufficient to guarantee that the workflow has been queued for execution. Poll instead. * Update documentation and infrastructure for release candidates (#377) * Update documentation and infrastructure for release candidates - creating release candiates - bashing release candidates Restore version override in cli.py. Re-tag latest docker images in cli.py so that version can be overwritten. Update IMAGES target in Makefile to add the :latest tag. Clean up :latest images on distclean target. * fix whitespace * Bump ssri from 6.0.1 to 6.0.2 in /ui (#382) Bumps [ssri](https://github.com/npm/ssri) from 6.0.1 to 6.0.2. - [Release notes](https://github.com/npm/ssri/releases) - [Changelog](https://github.com/npm/ssri/blob/v6.0.2/CHANGELOG.md) - [Commits](https://github.com/npm/ssri/compare/v6.0.1...v6.0.2) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add interactive script ahead of covid-19 demo (#380) * add interactive script ahead of covid-19 demo tomorrow * fix rawls test failure * reformat test * share demo remources with cdc-covid-surveillance firecloud group * corrections from review * format for consistency * clone a dev workspace instead of the prod one * fix test-import-snapshot - workspace changed * fix arrays test failures * [GH-1213] Update DB schema for COVID workload creation (#381) * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table. 
* Initial creat covid workload commit * Verifies dataset as well * Update DB schema. * Address comments. * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Resolve merge conflicts. * Fix issues generated from rebasing. * Fix the failing integration tests. Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Add multimethods for source, executor and sink operations (#383) * Add multimethods for source, executor and sink operations Add skeleton implementations for tdr-source * make use of `!` more consistent? 
* add create operations for source, executor and sink * [GH-1215] Import snapshots within COVID workload (#376) - Data model change: a snapshot_reference_id is linked to a workspace and should be associated with the executor, not the source (TDR in our use case) - Added covid/get-imported-snapshot-reference - nil or snapshot reference from Rawls for snapshot_reference_id in executor details instance (to be created from TerraExecutorDetails) - Added covid/import-snapshot! - import snapshot to workspace, writing to DB if successful - Added integration tests (incomplete coverage) * [GH-1295] Use Rawls to Import Workflow Outputs (#384) RR: https://broadinstitute.atlassian.net/browse/GH-1295 firecloud's flexibleImportEntities has a size limit on the TSV file you POST. Unfortunately, one sarscov2_illumina_full workflow's outputs exceeds this. We can work around this issue by using Rawl's batchUpsert. This is slightly different as it takes a list of operations on how to construct the entity rather than the entity serialised to TSV. In this PR, I've demonstrated how we can use Rawls to import the workflow's outputs into the workspace. I've also updated the demo to do this for a workflow that has already passed. * GH-1285: [COVID] Launch submissions through Rawls, and keep track of them (#372) * Simplify day intervals. * Document wfl.tools.snapshots. * Map name over keyword arguments. * Force the production Rawls. * Checkpoint groups getters. * Patch rebase conflict. * Implement start-covid-workload! kinda. * Checkpoint start-covid-workload! test. * Add covid-workload-request to support unit. * clojure -M:format * Move Rich comment into a unit test. * Respond to comments and tidy up a little * Remove unit test now covered by integration. * Document clojure.test/test-vars to remind me. * Bump url-parse from 1.4.7 to 1.5.1 in /ui (#388) Bumps [url-parse](https://github.com/unshiftio/url-parse) from 1.4.7 to 1.5.1. 
- [Release notes](https://github.com/unshiftio/url-parse/releases) - [Commits](https://github.com/unshiftio/url-parse/compare/1.4.7...1.5.1) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update db schema again based on lasted discussions. (#387) * Update db schemas. * Make the names more clear. * Address comments. * Switch to SQL syntax from XML. * Address more comments. * one more update. * Bump y18n from 4.0.0 to 4.0.1 in /ui (#352) Bumps [y18n](https://github.com/yargs/y18n) from 4.0.0 to 4.0.1. - [Release notes](https://github.com/yargs/y18n/releases) - [Changelog](https://github.com/yargs/y18n/blob/master/CHANGELOG.md) - [Commits](https://github.com/yargs/y18n/commits) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1301] Notify Workload Watchers when "User Visible" Exceptions Occur (#386) RR: https://broadinstitute.atlassian.net/browse/GH-1301 Introduce wfl.util.UserVisibleException - a new type of exception that we should handle to throw and handle errors that users are meant to see. Re-wire the background loop to handle UserVisibleException and also be more exception safe. Start to plumb in email notifications. * Bump hosted-git-info from 2.8.8 to 2.8.9 in /ui (#392) Bumps [hosted-git-info](https://github.com/npm/hosted-git-info) from 2.8.8 to 2.8.9. - [Release notes](https://github.com/npm/hosted-git-info/releases) - [Changelog](https://github.com/npm/hosted-git-info/blob/v2.8.9/CHANGELOG.md) - [Commits](https://github.com/npm/hosted-git-info/compare/v2.8.8...v2.8.9) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump lodash from 4.17.19 to 4.17.21 in /ui (#390) Bumps [lodash](https://github.com/lodash/lodash) from 4.17.19 to 4.17.21. 
- [Release notes](https://github.com/lodash/lodash/releases) - [Commits](https://github.com/lodash/lodash/compare/4.17.19...4.17.21) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1311]: Create/Load COVID workloads (#389) RR: https://broadinstitute.atlassian.net/browse/GH-1311 Add support for loading and creating COVID workloads Add PR test hook for database/ directory Tweak database schema to make it a bit easier to create workloads * [GH-1313] TerraWorkspaceSink/update-sink! (#394) * Add implementation for TerraWorkspaceSink/update-sink! Rework temporary-postresql-database fixture to override wfl-db-config * integration test passes * add make-queue-from-list to be clearer * fix test * use max(id) + 1 for new output row instead of table length + 1 * [GH-1215] COVID TerraExecutor/update-executor! Added update-terra-executor and required helpers: - coerce an available snapshot from source queue to a snapshot reference - add entry point for submission creation - write new workflows to executor details instance Refined queue peek and pop for source and executor. Notably, queue peeks return a meaningful object (snapshot for source, workflow for executor) rather than a database record. Removed now-deprecated code from earlier update loop design. Added integration test. * [GH-1306] Add `workflows` multimethod for workloads (#396) RR: https://broadinstitute.atlassian.net/browse/GH-1306 Add workflows multimethod Change all locations where workflows are accessed via the :workflows keyword to use this function. Implement workflows for COVID I've added a simple test to make sure the SQL is correct Awaiting executor implementation to test with more than 0 workflows. * [GH-1222] Check for new work in TDR (#371) * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! 
implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Release v0.6.1 Patch (#365) * [GH-1284] Fix docker image generation (#363) The `Dockerfile` is being invoked in the root directory - we don't need to add the parent directory, just the current one. * [GH-1277] Prevent update loop from terminating (#364) Catch all throwables to make sure a workload failure doesn't bring down workflow-launcher. * update changelog + bump patch to v0.6.1 * Remove outdated test code. * Remove unused utilities code. * Checkpoint. * Bump timeout for system tests, as it takes longer to finish now. * Update docs to include how to override envs for local testing. * Commit what I had before surgery. * Wrap up. * Add a protocol extension for psql array. * Update DB schema to store row ids for TDR. * Also store TDR row-ids and query interval info in DB. * Move protocol test to integration. * Update utc-now util also randomize the snapshot name with request datetime. 
* Resolve merge conflicts. * Fix weirdness during rebasing... * Move the logic into the multimethod interface. * Take out the utc-now util function. * Move protocol test to jdbc_test and reformat code. * Remove duplicate code. * Remove comment blocks. * Fix broken MD during rebasing. * Crazy formats. * Fix a bug with the snapshot naming. * Add the snapshot_id back to DB schema. * Include job-status in Source Details table. * update TDR jobs that are still running as part of source update loop. * Fix timestamp formatting. * Exclude the problematic test. * Fix a hiccup. * Fix formatting and a small issue with job-result from TDR. * Minor fixes around covid module. * Add unit and integration tests. * Lint. * Let DB control the primary key auto increment. * Fix the issues identified by the tests. * Fix an issue with the jdbc protocol test. * Update unit tests. * Call the multi-method in covid test to make sure everything is plumbed correctly. * Log when TDR snapshot creation job failed. * Address comments. * Increase the interval of system test polling. Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * [GH-1314] expose workflows via `GET /api/v1/workload/{uuid}/workflows` (#397) RR: https://broadinstitute.atlassian.net/browse/GH-1314 Adding a separate workflows selector will allow us to decouple the workload representation from serialisation artefacts that aren’t useful for a user. In this PR, I'm changing the workload data model such that `GET /api/v1/workloads` to only return workload metadata (everything but the workflows) and adding `GET /api/v1/workloads/{uuid}/workflows` to fetch the workflows in the workload. * [GH-1317] start the workload source (#399) RR: https://broadinstitute.atlassian.net/browse/GH-1317 The source of the workload needs to be told that the workload has been started so it can start looking in he datarepo for new data. 
I've added a new multi-method for the source to start it, called on `start-workload!`. This writes last_checked to be the time when we should start listening for new data. * TerraExecutor/update-executor should update active, failed workflow statuses (#398) 1. Within a single tx, fetch executor details workflow records with active or failed status. 2. For each record, use Firecloud to fetch the up-to-date workflow object. Update the record with the corresponding new workflow status. 3. Within a single tx, write new record statuses to DB. Updated integration tests. * GH-1307 GH-1312 Update method configuration using snapshot reference (#402) In new method `update-method-configuration!`: 1. GET the method configuration from Firecloud 2. If the Firecloud-derived mc version doesn't match our DB record, throw 3. Update the method configuration with the snapshot reference name and POST to Firecloud 4. Increment our DB record's mc version. Supplemented integration tests. * [GH-1322] TDR Snapshots Source (#403) RR: https://broadinstitute.atlassian.net/browse/GH-1322 Adding a new source to workflow-launcher that's a list of snapshots. * [GH-1318] Stop Workload Source (#400) RR: https://broadinstitute.atlassian.net/browse/GH-1318 Add `stop-source` multimethod and implement for the TerraDataRepoSource. When stopped, the source will stop looking for new rows to snapshot in the TDR. Also added logic for when the workload is finished. In this case, once started the workload is finished when it has been stopped and both source and executor queues are empty. To do this, I added another multimethod to return the length of the queues. For the executor, the length of the queue is the number of workflows that have not been consumed or aborted. For the source, the length of the queue is the number of unconsumed records whose snapshot job is not 'failed'. Note that this is slightly at odds with peek, where peek is meant to return objects that are ready to be consumed. 
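The "finished" rule from GH-1318 above (started, stopped, and both queues empty) can be written as a small predicate. The sketch below is in Python for illustration only; the record fields (`consumed`, `snapshot_job_status`, `status`) are hypothetical, and "not been consumed or aborted" is read here as "neither consumed nor aborted".

```python
def source_queue_length(records):
    """Unconsumed source records whose snapshot job has not failed."""
    return sum(
        1
        for r in records
        if not r["consumed"] and r["snapshot_job_status"] != "failed"
    )


def executor_queue_length(workflows):
    """Workflows that are neither consumed nor aborted (assumed reading)."""
    return sum(
        1 for w in workflows if not w["consumed"] and w["status"] != "Aborted"
    )


def is_finished(started, stopped, records, workflows):
    """A workload is finished once it was started, then stopped, and
    nothing remains in either the source or the executor queue."""
    return (
        started
        and stopped
        and source_queue_length(records) == 0
        and executor_queue_length(workflows) == 0
    )
```

Note that a source record with a still-running snapshot job counts toward the queue length even though a peek would not return it yet, which is the tension the commit message points out.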
* GH-1216: [TDR New Work Detection] Extend System Tests for COVID processing (#395) * Update vault to work around NoClassDefFoundError. * Checkpoint failing with PSQLException: FATAL: database ... does not exist database "wfltest147f1cbf976046e68e6beb5eef728674" does not exist * Spec validation still losing sink and source keys. * Update most dependencies. * Restore batch or covid request. * Checkpoint "working" spec.clj. * Lift strip-internals out of endpoint API transaction. * New system test passing again. * Use Ed's strip-internals fix. * clojure -M:format * Respond to comments but now there is no :name in response. * clojure -M:format * Make ::name optional for now. * [GH-1320] add `snapshotReaders` to tdr source (#405) RR: https://broadinstitute.atlassian.net/browse/GH-1320 Adding a list of people who want to access to the snapshots we create to the source request. * Remove liquibase-core from build dependencies. (#407) * [GH-1213] Create covid workload verifications (#379) * Update the specs. * Commit pair programming results. * Commit updated implementation (WIP). * organizing the keyworrds in spec.clj * It seems spec/or requires ky-value pairs! Fix it. * Added start of workflow-request for create-covid-workload function * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table. * We cannot use table as col name in SQL. * Checkpoint. * More. * Initial creat covid workload commit * Verifies dataset as well * Verifications of the source, executor and sink * More verification of sink and executor * Verification of source, sink, executor pre review * Fixing broken test and docstrings instead of comments * Update api/src/wfl/service/rawls.clj * Fix and rewrite the test. * Fix the test part II. 
* Adding more tests for the create-covid-workload and adding messages for the cause of a thrown exception * Updated tests and create covid workload verifications * removing commente3d out code * removing get-workspace function from rawls code since we use firecloud to make this call * Adding lint changes * split up tests. Added verification of dataset table and columns * Lint changes * Fixing merge conflict and lint issue * Fixing import * Updated to base off of develop changes * Adding validation to create workflow logic * Updates to test, spec and create method * Working validation tests * linted. Also removed old Liquibase files no longer used * cleanup of testing namespace * removing try/catch since in verify source function * Moving validation to top of function * Moving docstrings for two functions * Fixing two broken integration tests for covid * Using variables bound in the TDR section for validation * Moving some code within file to group liike code together * Fixed all integration tests. Reorganized location of workload methods in covid module * Fixed all integration tests. Reorganized location of workload methods in covid module * Removing extraneous description for integration tests relating to create workload method * Removing extraneous description for integration tests relating to create workload method * linted * Converted verification methods to multimethods dispatched off name * Updated name of validate source, executor, sink functions * Fixed merge conflicts and linted * removing unnecessary changeset * reverted spec.clj to develop branch. Also cleaned up some cruft * reverted spec.clj to develop branch. Also cleaned up some cruft * Making get-workspace-json function private and adding a public function which checks the workload. Also cleaned up the create verification methods and removed some unnecessary variables. 
Also moved source/sink/executor verification functions to the source/sink/executor sections * Adding default implementations for the create validation functions * linted * fixing firecloud method and tests * Fixing merge conflicts with develop branch * Adding flags for skipping validation of source, executor and sink in requests (#404) * Adding flags for skipping validation to source, executor and sink * linted * Fixing merge conflict Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * linting fixes * linted and fixes some nits * Fixing nits * Fixing nits * lint fixes Co-authored-by: Rex <rexwangcc@gmail.com> Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * GH-1316 Create submission within TerraExecutor update (#406) GH-1316 Create submission within TerraExecutor update Fully implemented `create-submission!` stub, pulling in existing `update-method-configuration!`. Simplified code which previously passed around updated executor with incremented method configuration version. Docstrings converted to use backticked parameter names to preserve camel case. Expanded integration testing. * [GH-1323] to-edn (#409) RR: https://broadinstitute.atlassian.net/browse/GH-1323 https://broadinstitute.atlassian.net/browse/GH-1321 Added util/to-edn a general :type dispatch function to coerce an object into a user-friendly EDN view that gets returned to the user. Allow terra executor to update details table with workflow IDs between update calls Introduce `done?` to test when the workload should be marked `:finished` Fetch workflow inputs/outputs from firecloud and transform them into a more workable form Fix jdbc double exaludations * GH-1298: TerraWorkspaceSink - verify target entity columns exist (#410) * Worried that :missing will get swamped by :attributes though. 
* [GH-1327] Fix Start After Stop (#411)
  In this change I'm fixing the strange behaviour where we can start a covid workload after it was stopped. To do this, I've made it illegal to stop a workload before it's been started. I've done this because stop before start seems strange to me too. The purpose of stop was to stop getting new data from the source but allow all data in the workload pipeline to get flushed (i.e. consume all created snapshots and write all workflows back). Stop was not meant for aborting a workload and cancelling/aborting all active processing. If there is a use case for such a thing (and I think there is), I think we should expose a new endpoint called `POST /api/v1/abort` or something. Of course this PR is pretty small, so if the team feels strongly about stop before start I can find another solution.
* [GH-1327] Update system tests after /stop changes (#412)
  Removing verification that we can stop before starting and no workflows are run.
* [GH-1333] Fix NullPointerException in TDR Source (#418)
  RR: https://broadinstitute.atlassian.net/browse/GH-1333
  Caused by database changes not propagated to source code. Destructuring old keys for the source led to null values.
* [GH-1336] Fix continual snapshot creation (#419)
  Our BigQuery query was wrong as we were simply calling `toString` on a java.sql.Timestamp. This led to intervals like `['2021-05-26 15:31:18.287641' '2021-05-26T20:00:13']`. The space was causing the problem and BigQuery interpreted that as a date. The fix is to ensure the lower and upper bounds of our query-between interval are of the same type by coercing java.sql.Timestamp `last_checked` into a java.time.OffsetDateTime.
* Bump browserslist from 4.13.0 to 4.16.6 in /ui
  Bumps [browserslist](https://github.com/browserslist/browserslist) from 4.13.0 to 4.16.6.
- [Release notes](https://github.com/browserslist/browserslist/releases) - [Changelog](https://github.com/browserslist/browserslist/blob/main/CHANGELOG.md) - [Commits](https://github.com/browserslist/browserslist/compare/4.13.0...4.16.6) Signed-off-by: dependabot[bot] <support@github.com> * [GH-1337] fix create snapshot job always failing RR: https://broadinstitute.atlassian.net/browse/GH-1337 Caused by destructuring the tdr job response with the wrong key. * [GH-1338] Fix wfl not discovering new dataset rows RR: https://broadinstitute.atlassian.net/browse/GH-1338 This isn't a complete fix sadly but it at least solves the problem for low ingest rate projects like COVID. For now, only updating `last_checked` when we successfully found new rows in the dataset. * [GH-1339] Prevent Snapshot Job IDs from being clobbered with nils RR: https://broadinstitute.atlassian.net/browse/GH-1339 When we update the status of snapshot creation jobs, we only get the job metadata when the job is complete.As a consequence, `nil`s are propagated up the call stack and the database record is clobbered with nils, including the `snapshot_creation_job_id` column. To fix this, always return the metadata and log the failure should one occur. * bump version in develop to 0.8.0 (#413) * [GH-1331] document `/stop` and `/workflows` (#417) [GH-1331] Document `/stop` and `/workflows` endpoints RR: https://broadinstitute.atlassian.net/browse/GH-1331 Updating documentation for changes in v0.7.0: - workload-response no longer returning workflows - `GET /api/v1/workload/{uuid}/workflows` - `POST /api/v1/stop` Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * Add test-covid-workload to parallel test group (#424) … and fix method configuration namespace. 
* [GH-1341] Fix accepting malformed labels in workload request
  RR: https://broadinstitute.atlassian.net/browse/GH-1341
  Enforce labels are of the form "name:value" where
  - name starts with a letter followed by any combination of letters, numbers, underscores and dashes
  - value is any non-blank string not containing `:`
  I had to include labels in the batch workload request as the coercion layer got confused when you gave it bad labels.
* [GH-1344] Enforce valid email addresses in workload `watchers`
* [GH-1348] Remove :pipeline from COVID workload request (#426)
  RR: https://broadinstitute.atlassian.net/browse/GH-1348
  The :pipeline attribute of the workload request is meaningless for COVID-type workloads. Note that these workloads aren't even specific to COVID-19 processing, rather a generalisation. In this PR, I've removed the :pipeline attribute. To do this, all workload operations on maps with a nil :pipeline will be implemented by the functions in the covid namespace.
* [GH-1350] Fix Terra Workspace sink validation (#427)
  RR: https://broadinstitute.atlassian.net/browse/GH-1350
  The "Terra Workspace" sink validation did not validate that the attributes listed in fromOutputs exist in the specified entity type in the workspace. This PR fixes this.
* Bump dns-packet from 1.3.1 to 1.3.4 in /ui
  Bumps [dns-packet](https://github.com/mafintosh/dns-packet) from 1.3.1 to 1.3.4.
  - [Release notes](https://github.com/mafintosh/dns-packet/releases)
  - [Changelog](https://github.com/mafintosh/dns-packet/blob/master/CHANGELOG.md)
  - [Commits](https://github.com/mafintosh/dns-packet/compare/v1.3.1...v1.3.4)
  Signed-off-by: dependabot[bot] <support@github.com>
* [GH-1324] Get or import snapshot (#422)
  Within rawls namespace:
  - get-snapshot-references: lazily paginate through all snapshot references in a workspace.
  - create-or-get-snapshot-reference: if importing the snapshot fails with a 409, find the first existing snapshot reference matching the snapshot id.
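
The GH-1324 behaviour described above (lazy pagination plus a fallback on 409) can be illustrated with a minimal Python sketch. This is not the project's Clojure code: the `Conflict` exception, the `fetch_page(offset, limit)` and `import_snapshot(snapshot_id)` callables, the page size, and the `snapshotId` key are all hypothetical stand-ins chosen for illustration.

```python
from itertools import count
from typing import Callable, Iterator, List


class Conflict(Exception):
    """Hypothetical stand-in for an HTTP 409 response from the import call."""


def snapshot_references(
    fetch_page: Callable[[int, int], List[dict]], page_size: int = 10
) -> Iterator[dict]:
    """Lazily yield snapshot references, requesting one page at a time."""
    for offset in count(0, page_size):
        page = fetch_page(offset, page_size)
        yield from page
        if len(page) < page_size:  # a short page means there is nothing left
            return


def create_or_get_snapshot_reference(
    import_snapshot: Callable[[str], dict],
    fetch_page: Callable[[int, int], List[dict]],
    snapshot_id: str,
) -> dict:
    """Import the snapshot; on a conflict, return the first existing match."""
    try:
        return import_snapshot(snapshot_id)
    except Conflict:
        for reference in snapshot_references(fetch_page):
            if reference["snapshotId"] == snapshot_id:  # assumed key name
                return reference
        raise  # no existing reference matched, so surface the conflict
```

Because the references come from a generator, the search stops requesting pages as soon as a match is found, which is the point of paginating lazily.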
* [GH-1354] Enforce `snapshotReaders` are email addresses (#428)
  RR: https://broadinstitute.atlassian.net/browse/GH-1354
  This guards against erroring while creating snapshots if a user makes a typo.
* [GH-1356] Fix Coercion Failure for TDR Snapshots Source (#429)
  RR: https://broadinstitute.atlassian.net/browse/GH-1356
  Caused by https://github.com/metosin/reitit/issues/494
  Fix as per https://broadinstitute.atlassian.net/browse/GH-1348
  Included integration test for reitit coercions to catch this sooner.
* [GH-1357] Fix failure to sink workflow outputs (#431)
  RR: https://broadinstitute.atlassian.net/browse/GH-1357
  I assumed (wrongly) that the way Rawls represented workflows in a submission was the same as how cromwell represented workflows. I also mocked these representations incorrectly in tests. To fix this, only use the submission to fetch workflow IDs, then use firecloud's workflow and workflow/outputs endpoints so we don't have to adapt/mock multiple data models.
* [GH-1359] Overwrite Workspace Entity On Sink (#432)
  RR: https://broadinstitute.atlassian.net/browse/GH-1359
  When sinking a resubmitted workflow, the TerraWorkspaceSink would append to the existing entity in the workspace. This leaves the entity in an even worse state than the one that caused the sample to be reanalysed. Our fix (as agreed with cloreth) is to clobber the entity in the workspace by deleting the entity if it exists, then upserting the new one.
* Fix broken requirement install command in docs readme (#434)
* Move docstring and document when true. (#436)
  The PR check fails in the test that Ed excludes on another branch.
* [GH-1363] Fix Sporadic Firecloud Test Failures (#437)
  RR: https://broadinstitute.atlassian.net/browse/GH-1363
  Caused by firecloud returning "Launching" as a workflow status - we were testing that the status was in #{"Queued" "Submitted"}. Fix util/poll so logical FALSE can be returned from the action.
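
One reading of the util/poll fix in GH-1363 is that the poller must tell "no answer yet" apart from an answer that happens to be falsy. The Python sketch below illustrates that idea only; it is not the project's Clojure `util/poll`, and the function name, parameters, and use of `None` as the "keep waiting" signal are assumptions made for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll(
    action: Callable[[], Optional[T]],
    interval_seconds: float = 0.0,
    max_attempts: int = 3,
) -> T:
    """Call `action` until it returns something other than None.

    Only None means "not ready yet"; False, 0 and "" are treated as
    real answers and returned to the caller.
    """
    for attempt in range(max_attempts):
        result = action()
        if result is not None:
            return result
        if attempt + 1 < max_attempts:
            time.sleep(interval_seconds)
    raise TimeoutError(f"no result after {max_attempts} attempts")
```

A truthiness check (`if result:`) in place of `is not None` would keep polling on a `False` answer until the attempts ran out, which is the kind of behaviour the commit message says was fixed.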
* Bump ws from 6.2.1 to 6.2.2 in /ui (#433) Bumps [ws](https://github.com/websockets/ws) from 6.2.1 to 6.2.2. - [Release notes](https://github.com/websockets/ws/releases) - [Commits](https://github.com/websockets/ws/commits) --- updated-dependencies: - dependency-name: ws dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * GH-1375: Fix AoU nightly test: GET /api/v1/workload fails with spec error from invalid creator (#442) * Don't spec the batch creator as an email address. * update changelog for v0.7.0 (#441) * GH-1332 wfl.tools.workloads/when-done should not exit prematurely (#438) If checking that all workloads are finished, also check that the workload list is nonempty. Fixed incorrect reference to static workload object from workload creation: we instead want to examine the latest version of the workload. * update changelog for v0.7.1 (#445) * Removing the arrays module and the corresponding integration test (#446) * Removing the arrays module and the corresponding integraiton test * Removed references to the deleted arrays module. Also ran all of the integration tests and verified that they pass * Updating the doc for the aou arrays module to have the appropriate name Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * [GH-1376] Moved generic pipeline processing interfaces from covid -> stage ns (#444) * [GH-1388] Remove `pipeline-versions` from `GET /version` RR: https://broadinstitute.atlassian.net/browse/GH-1388 Inline the `version` key and add a `spec` for the response from the endpoint. Added system tests. 
* [GH-1389] Delete `^:excluded` arrays system tests RR: https://broadinstitute.atlassian.net/browse/GH-1389 These tests were missed in https://github.com/broadinstitute/wfl/pull/446 * Resolving NPM security vulnerabilities (#448) * Fixing dependabot reported security issues as well as build reported npm audit security issues * Fixing lib for is-glob. Part of UI npm packages * Some updates. Some left to do * adding package.lock * Fix by using npm update. Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Rex <rexwangcc@gmail.com> * GH-1371: Lint more. (#439) * Reduce noise while kibit'zing. * Add the Eastwood cLinter. * cLint Eastwood api/src and dampen noise. * Lint tests too. * Squash bugs and squelch noise. * Fix tests now that they "work". * nil-if-empty -=> not-empty * Disable some eastwood warnings. * :staus -=> :status * Suppress more warnings. * Expand namespace aliases in keywords. * Update Kondo. * LINT -=> FORMAT * Continue on LINT errors. * Checkpoint some kondo advice. * dotimes -=> repeatedly * Revert "Expand namespace aliases in keywords." * Build UserException to support Eastwood. * repeatedly -=> dotimes * Restore code deleted for debugging. * Work around: unused-fn-args: Function arg p2__11676# never used * Maybe explain foldl better. * [GH-1294] Migrate to use datasets in prod TDR (#453) * Update default values. * Use a dataset in production for integration tests. * Lint. * Remove unused ENV VAR WFL_TDR_SA. * Address comments. * Add sleep to avoid the transient 404s from TDR. * GH-1346 Document workload Executor and Terra implementation (#451) * [GH-1345] Document the workload `Source` and Implementations (#450) RR: https://broadinstitute.atlassian.net/browse/GH-1345 Implementations include: - `Terra DataRepo` Source - `TDR Snapshots` Source * Clean up my testing code... 
(#456) * Phoned a friend @tbl3rd to fix return type hint in executor docs (#457) * [GH-1347] Document the workload `Sink` and `Terra Workspace` sink implementation (#449) [GH-1347] Document the workload `Sink` and `Terra Workspace` sink implementation RR: https://broadinstitute.atlassian.net/browse/GH-1347 * Move `source` and `sink` Interfaces and Implementations into new namespace (#454) * [GH-1334] Move `source` Interface and Implementation into new namespace RR: https://broadinstitute.atlassian.net/browse/GH-1334 Start to move tests too. * [GH-1335] Move `Executor` Code to new Namespace RR: https://broadinstitute.atlassian.net/browse/GH-1335 Doc-strings from https://github.com/broadinstitute/wfl/pull/451 * [GH-1408] Don't modify source code on build Generate UserException on prebuild Make linters depend on prebuild. They're run as part of the PR action and can be run in parallel locally. Remove formatting target - use deps for that. * [GH-1393]: Add `/retry` Endpoint (#452) * [GH-1393]: Add `/retry` Endpoint (#452) RR: https://broadinstitute.atlassian.net/browse/GH-1393 Add `POST /api/v1/workload/{uuid}/retry` route, taking the status as the request json body. Add `retry` multimethod to the workload interface. This operation takes the workload as well as as list of workflows to retry. Add `workflows-by-status` multimethod to take a status to filter by. No workloads support `retry` as yet so all implementations throw a 501. Added system and integration tests to lock down this behaviour. * Add test for [GH-1385] RR: https://broadinstitute.atlassian.net/browse/GH-1385 * fix build failures in develop (#458) * Documenting a Workload (#430) * rebasing off of newest develop. Adding initial documentation for documentation of a workload. 
* Adding covid workload module * adding -m flag to python pip command * Added covid-module to the mkdocs.yaml Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Olivia Kotsopoulos <okotsopo@broadinstitute.org> * Changing order of executor block in docs (#459) Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * GH-1326 Move Sink code to new Namespace (#463) * Split specs into their specific modules (#460) * GH-1289 split out covid specs * GH-1289 moved more of the module specific specs to the modules * couple of fixes * move tdr source to source.clj * [GH-1398] Fix `test-start-aou-workload` from polling forever (#462) RR: https://broadinstitute.atlassian.net/browse/GH-1398 Split `workloads/when-done` into two functions: - one that polls until the workload is :finished - another that polls all the workflows Add an assertion to the automation-test * [GH-1401] Add `retry` Attribute to TerraExecutorDetails Type and Instances Thereof (#461) * add lint to the check target * add liquibase changelog to add `retry` to terraexecutordetails * don't return retried workflows form the executor * add a long comment about returning workflows * Fix some namespace issues (#464) * [GH-1409] Add Status Query String to workflows GET request (#466) * GH-1409 Add Status Query String to workflows GET request * Bring cromwell statuses up to date * add test for workflows by status endpoint * [GH-1409] fix some linting errors * Add On Hold back in, trim slashes for endpoints * change to not= * split statuses into rawls workspace * linters hate spaces * Minor changes * Remove trim-slashes and fix endpoints in wfl.tools namespace (#469) * [GH-1411] Describe Workflow Outputs (#471) RR: https://broadinstitute.atlassian.net/browse/GH-1411 Inform the Sink downstream of the executor of the workflow type. When sinking outputs to the TDR, you need to ingest the file outputs followed by the metadata with the files replaced by their fileref URIs. 
Thus, we need a way of determining if a particular output is a File that needs ingesting or just a metadata String that happens to be a gs url. I've had some luck in the past using womtool to describe the WDL and then walk the description of the outputs, dispatching to the relevant handlers (see workflows.clj in the test tools). For lack of a better alternative, I'm going to pursue using womtool to handle outputs in a generic way. Implemented by changing the source and executors to be queues of `[description value]` pairs. In the case of the TDR sources, the description is a qualified keyword describing what the object is (ie. a snapshot). For the terra executor, the description is the workflow description as reported by womtool and the value is the workflow. * Logging added to source namespace (#475) And downgraded level for several noisy logs. * GH-1414 Executor falls back to fetching existing snapshot reference (#477) * [GH-1139] Added new log namespace (#470) * Added new log namespace * Some updates to log namespace * Replace references to clojure.tools.logging with new log namespace * Remove trace logs * add logger protocol to support disabling, set up unit tests * Remove last remaining references to clojure.tools.logging and update documentation * fix linter error * Remove log4j and slf4j dependencies * Keep JSON logs out of EDN description files. * fix for logs getting into workflow edn files * Not sure how this ever worked though. * Fuss. * Simplify. * Restore docstrings. * Force build to suckseed. * Simplify. * Ensure do-or-nil returns nil. * Break up long lines. * change jsonpayload to message and remove obsolete logging tests * reverted some changes back * update docstrings and convert some str's to joins * added notice to documentation * added some documentation for looking up logs locally Co-authored-by: Tom Lyons <tbl3rd@gmail.com> * Remove error txt file (#479) * GH-1417: Remove the "skipped" nil UUID Cromwell status hack. 
(#393) * Remove the "skipped" nil UUID Cromwell status hack. * Patch up the rebase. * Fix rebase conflict. * Patch up rebase. * The status-counts fn is an old Zeroism. * Update docstrings. * Remove duplicate batch/workload-request spec. * Use the batch-namespaced spec. * Glean some lint that collected. * [GH-1419] Terra Executor queue length should consider workflows with null status (#481) Added new integration test. Also in existing integration tests, corrected faulty assumption in mocking: Firecloud returns differently formatted workflows depending on whether they are fetched as part of a submission fetch, or fetched individually. * GH-1397: Sweep and snapshot dataset row IDs that were missed when polling a dataset (#468) * Document Swagger validation of API specs. * Mock snapshots missing TDR row IDs. * Mock datarepo/query-table-between instead of source/find-new-rows. * Verify that rows are shared across updates. * Work around deprecation warning. * Do not call start-source! twice. * Goodby Fibonacci. * Simplify. * Add earliest and latest and clean up. * Make find-new-rows work with combine-tdr-source-details. * Run `clojure -M:format` in this directory * Glean more lint. * Should probably count rows in running snapshots too. * Respond to more Ed comments. * [GH-1139] Logging level and configuration (#480) * Logging level and configuration * unit testing for logging levels * few fixes/pr changes * use prepared statement * GH-1395: Document the new WFL retry capability for users. (#483) * The /retry endpoint is not implemented. * Address some comments from Ed. * Add a note from OK. * [GH-1394] TerraExecutor retry implementation and Sarscov2IlluminaFull integration * [GH-1395] GitHub Pages updates: COVID, staged workflows, retry functionality (#489) * Bump path-parse from 1.0.6 to 1.0.7 in /ui (#491) * Mkdocs updates that improve the doc nav and security holes. 
(#492) * [GH-1402] Write Workflow Outputs to the Terra Data Repository (#474) * [GH-1405] Add Schema for TerraDataRepoSink (#473) [GH-1405] Add Schema for TerraDataRepoSink RR: https://broadinstitute.atlassian.net/browse/GH-1405 Add schemas for the following types - TDRJobStatus - TDRJobType - TerraDataRepoSinkDetails and the TerraDataRepoSink table. * [GH-1314] Start Implementing DataRepo Sink (#476) RR: https://broadinstitute.atlassian.net/browse/GH-1413 Add load/create functions for the data repo sink - leaving update as "unimplemented". * [GH-1415] Implement `update-sink` for TerraDataRepoSink (#478) RR: https://broadinstitute.atlassian.net/browse/GH-1415 This change contains a rough implementation of the TerraDataRepo sink update functionality. After discovering certain hurdles in the original design, I've made some tweaks and incorporated TDR's work https://broadworkbench.atlassian.net/browse/DR-1960 so that we only need one ingest request instead of separately loading the output files and output metadata. At the time of writing, DR-1960 is work in progress so our ingests won't work just yet. Some things to call out in this change that I'm deferring to a follow up change into the feature branch: I need to upload the json file for TDR to ingest. I'm currently using our test outputs bucket as a temporary folder for this which is obviously not a production-ready solution. Possible fixes include creating a scratch bucket for workflow-launcher, creating a bucket per workload, using the executor's execution bucket... etc. Suggestions welcome. Since the ingests always fail, I'm leaving some of the work post-ingest as a TODO. * [GH-1425] Expose `TerraDataRepoSink` Via the HTTP API (#484) RR: https://broadinstitute.atlassian.net/browse/GH-1425 Add specs and register in wfl.api/spec. Add end-to-end system test for reading/writing workflow inputs/outputs to TDR (used the illumina_genotyping_array pipeline as it's smaller than sarscov2_illumina_full). 
Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * [GH-1416] Document DataRepo Sink (#485) RR: https://broadinstitute.atlassian.net/browse/GH-1416 Add section to `sink.md` describing how to configure workflow-launcher to write outputs back to a terra datarepo dataset. * [GH-1430] Remove `stage/validate-or-throw` and inline their implementations into `create-X`. (#490) RR: https://broadinstitute.atlassian.net/browse/GH-1430 Remove stage/validate-or-throw and inline their implementations in their respective create-X functions. This is done because - we were losing context (ie. was this a source/sink/executor) and had a collision in the tdr sink and source impls - these multimethods didn't really make sense and require some redesign work * address @okotsopoulos's feedback Co-authored-by: Chengchen(Rex) Wang <14366016+rexwangcc@users.noreply.github.com> * GH-1328 Migrate to new Rawls Snapshot V2 endpoints (#493) * ENG-1394 Update post-retry error maps per QA feedback (#494) * [GH-1303] Google Cloud Logging Alerts (#486) * Documentation for monitoring alerts and error response * Fix lint error and add md file to yml * [GH-1418] sourceLocation null fix (#487) * initial sourcelocation log fix * Update jdbc macros to include line number * Github actions should build docs on develop. (#495) * [GH-1301] Add Slack watcher support for user exceptions. (#467) * Add a Slack module. * Thanks to the linter. * Notify Slack channels UserExceptions. * Commit the consensus summary from mobbing. * Agent, WIP. * Resolve merge conflict. * Checkpoint. * Update the add-notification api. * Use a persistentqueue for notification agent. * Lint. * Commit tests. * CLean up. * Lint. * Linnnnnnt. * DB changes. * Update existing docs. * Resolve comments. * Add watcher spec. * Add watchers on all workload creations. * Remove comment blocks and fix unit tests. * Feature complete. * Attempt to fix the test. * Tests work now! * Lint. 
* Move deserialization to the top level. * deref should have a boundary. * Further separate the logic. * Lint. * Resolve comments. * Lint. * Bump the time out for slack integration tests. * Make watchers [SlackChannel ChannelId] | [EmailAddress EmailAddress] | EmailAddress. * Add a debugging line for slack test. * Mark the test as pending. * In fact we don't need 2 dimentional arrays for watchers. * Redo serialization and de-serialization, so it's more clear and secure. * Lint. * Thanks @tbl3rd. Address comments. * DB migration. * Thanks to the team, finally have a rough consensus. * Lint. * Update docs. * Our test caught a real issue! * Add a nightly system test against dev WFL (#498) * Make system test more flexible when kick off through make. * Add a nightly system test. * Add a nightly target for make and run it in Github Actions. * Typo. * oops. * Add a badge for WFL nightly test. (#500) * Remove the UI component of WFL. (#499) * Remove the ui dir and make module references and ci refs. * bump version of reitit. * Add back the swagger page to API. * Add back ui dir but only keep the proxy image. * [GH-1353] Fix spec error when skipValidation sent (#497) * Fix spec error when skipValidation sent * fix lint errors * revert other fix, save source with a dataset edn if skipvalidation true * [GH-1441] Stop swallowing UserExceptions (#501) * userexception initial * actually return userexceptions and log them as warnings * small changes * GH-1439 Succeeded workflow status should be permissible to retry (#503) * GH-1454 Update covid and executor integration tests to reflect updated method configuration version (#504) * GH-1450 Add retry attribute to TerraDataRepoSource type and instances (#505) * GH-1446: fix system tests again (#502) * Port changes from old review and debug branch. * 1 file(s) formatted incorrectly * Examine wedged workloads. * I dropped the database. * Commit debug hacks so line numbers agree with logs. * Move trace to flag only updating workloads. 
* Fix when-all-workflows-finish. * Checkpoint ingest-illumina-genotyping-array-files. * Checkpoint BQ interface. * Still: Can not delete a dataset being used by snapshots * Handle another dataset-or-snapshot case. * Work around {:defaultSnapshotId nil}. * Fix dataset fixture. * Don't fail test-workflows-by-status when no workloads. * Try using wfl-dev instead of general-dev-billing-account. * Revert "Try using wfl-dev instead of general-dev-billing-account." * Keep non-null inputs and options from "Failed" workflows. (#508) * GH-1462: Make :watchers key always optional. (#515) * Make :watchers key always optional. * Mention that the watchers field is optional in docs. * Bump AoU Arrays.wdl version to 2.4.1 to drop default (#514) genotype_concordance_threshold to 0.95. * GH-1444 TerraDataRepoSource should not snapshot more frequently than every 20 min (#513) Also back to updating last_checked any time we poll TDR, even if it did not find new rows. * Updating the version for the develop branch to 0.9.0 (#517) Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * GH-1373 rename removeSucceeded in retry runbook (#519) Since we are really keeping the failed workflows. * GH-1476 Log failed snapshot creation job's metadata and result (#518) Gracefully handle when datarepo/job-result throws ... which happens if the snapshot creation job failed due to dataset locking, for ex. Add unit tests for new function source/result-or-catch. * GH-1517 log error instead of throw on TerraExecutor method config version mismatch (#527) * [GH-1517] Add automated test for updating method configurations with mismatched versions (#528) * Merge main into develop (#533) * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. 
Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. * GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Release v0.6.1 Patch (#365) * [GH-1284] Fix docker image generation (#363) The `Dockerfile` is being invoked in the root directory - we don't need to add the parent directory, just the current one. * [GH-1277] Prevent update loop from terminating (#364) Catch all throwables to make sure a workload failure doesn't bring down workflow-launcher. * update changelog + bump patch to v0.6.1 * release 0.7.0 - SARSCov2 Illumina Full support In Terra (#440) * Snapshot creation uses datetime rather than date (#355) * Snapshot creation uses datetime rather than date * Making the name of the column used for datetime interval variable Co-authored-by: rhiananthony <ranthony@broadinstitute.org> * bump develop to 0.7.0 (#360) * util/do-or-nil-silently should move to build.clj (#367) * util/do-or-nil-silently should move to build.clj * Fixed indentation to match standard * GH-1226 Support entity set creation when creating submissions. (#353) * First pass: create-submissions generates one submission per entity. 
* Cleaned up TODOs for first pass attempt, reformatted tests to pass lint step in build. * Second pass: support entity set creation when specifying >1 entity for a submission. Refactored bigquery table dump to pull out now-common code. * Address PR feedback To simplify, will generate an entity set even for the singleton input. Removed firecloud/consolidate-entities-to-set as a result. Need to pass in 'expression' to submission creation payload when specifying an entity set. Supplemented integration and unit tests. * Create an interface all COVID work can build on top of. (#369) * Create an interface all hornets work on top of. * Apply suggestions from code review * Update readme.md Build Boad. * GH-1272: Lists are not imported correctly into a Terra Workspace Entity (#368) * Boad -=> Board * Still doesn't round-trip on doubles and values like "19A". * Unit test wfl.tsv. * Ensure TSV mappability. * Ensure TSV is tabulatable. * Fix bug in assert-mapulatable!. * Add wfl.service/rawls.clj for interacting directly with Rawls API (#370) * Rename rawls/create-snapshot -> create-snapshot-reference (#373) We create the snapshot in the TDR, and link to it in the workspace via Rawls. * [GH-1215] Add snapshot_reference_id to Sarscov2IlluminaFull table schema (#375) * Add snapshot_reference_id to Sarscov2IlluminaFull table schema Because schema update hasn't run, altering existing table definition in place. * Documentation changes from local Postgres debugging, onboarding - Clarified order of operations when installing, using local Postgres - Fixed broken intra-doc links - Added docs readme with instruction for launching local documentation site * Add instruction for recreating wfl DB to development docs * Remove references to undefined documentation files * [GH-1293] fix test-create-submissions-for-entity-set (#378) Simply getting the submission again is not sufficient to guarantee that the workflow has been queued for execution. Poll instead. 
* Update documentation and infrastructure for release candidates (#377) * Update documentation and infrastructure for release candidates - creating release candiates - bashing release candidates Restore version override in cli.py. Re-tag latest docker images in cli.py so that version can be overwritten. Update IMAGES target in Makefile to add the :latest tag. Clean up :latest images on distclean target. * fix whitespace * Bump ssri from 6.0.1 to 6.0.2 in /ui (#382) Bumps [ssri](https://github.com/npm/ssri) from 6.0.1 to 6.0.2. - [Release notes](https://github.com/npm/ssri/releases) - [Changelog](https://github.com/npm/ssri/blob/v6.0.2/CHANGELOG.md) - [Commits](https://github.com/npm/ssri/compare/v6.0.1...v6.0.2) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add interactive script ahead of covid-19 demo (#380) * add interactive script ahead of covid-19 demo tomorrow * fix rawls test failure * reformat test * share demo remources with cdc-covid-surveillance firecloud group * corrections from review * format for consistency * clone a dev workspace instead of the prod one * fix test-import-snapshot - workspace changed * fix arrays test failures * [GH-1213] Update DB schema for COVID workload creation (#381) * changelog changes * Link workload with source, sink and executor tables. * Update the p-keys to bigint. * SQL has reserved table. * Initial creat covid workload commit * Verifies dataset as well * Update DB schema. * Address comments. * v0.6.0 Release (#359) * [GH-1278] Fix SG update-workload! implementation from updating Clio multiple times (#358) [GH-1278] Fix SG update-workload! implementation from trying to update the clio BAM records for each Succeeded workflow repeatedly by restoring the :finished guard for register-workload-in-clio. Introducted in https://broadinstitute.atlassian.net/browse/GH-1277. 
* GH-1188: Bump GDCWholeGenomeSomaticSingleSample Version When Lantern Release (#351) * Restore inputs processing hacks. * GDCWholeGenomeSomaticSingleSample moved. * Run off of /develop/ branch instead. * clojure -M:format * Use the GDCWholeGenomeSomaticSingleSample_v1.1.0 release. * Remove the Rich Comment. * GitHub `develop` infrastructure + release changes (#356) Update test actions to only run on PRs. Update release action to only run on new commits into main. Update + format docs about branching off develop and releasing into main. Restore tag-and-push-images in cli.py * GH-1282: Document WFL's support for Somatic Genomes. (#361) * Draft doc module for SG. * Update navigation Camel^H^H^H^H^HYAML. * Update CHANGELOG.md for v0.6.0, including previous patches Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Resolve merge conflicts. * Fix issues generated from rebasing. * Fix the failing integration tests. Co-authored-by: rhiananthony <ranthony@broadinstitute.org> Co-authored-by: Edmund Higham <ehigham@users.noreply.github.com> Co-authored-by: Tom Lyons <tbl3rd@gmail.com> Co-authored-by: Tom Lyons <tbl@broadinstitute.org> * Add multimethods for source, executor and sink operations (#383) * Add multimethods for source, executor and sink operations Add skeleton implementations for tdr-source * make use of `!` more consistent? * add create operations for source, executor and sink * [GH-1215] Import snapshots within COVID workload (#376) - Data model change: a snapshot_reference_id is linked to a workspace and should be associated with the executor, not the source (TDR in our use case) - Added covid/get-imported-snapshot-reference - nil or snapshot reference from Rawls for snapshot_reference_id in executor details instance (to be created from TerraExecutorDetails) - Added covid/import-snapshot! 
- import snapshot to workspace, writing to DB if successful - Added integration tests (incomplete coverage) * [GH-1295] Use Rawls to Import Workflow Outputs (#384) RR: https://broadinstitute.atlassian.net/browse/GH-1295 firecloud's flexibleImportEntities has a size limit on the TSV file you POST. Unfortunately, one sarscov2_illumina_full workflow's outputs exceeds this. We can work around this issue by using Rawl's batchUpsert. This is slightly different as it takes a list of operations on how to construct the entity rather than the entity serialised to TSV. In this PR, I've demonstrated how we can use Rawls to import the workflow's outputs into the workspace. I've also updated the demo to do this for a workflow that has already passed. * GH-1285: [COVID] Launch submissions through Rawls, and keep track of them (#372) * Simplify day intervals. * Document wfl.tools.snapshots. * Map name over keyword arguments. * Force the production Rawls. * Checkpoint groups getters. * Patch rebase conflict. * Implement start-covid-workload! kinda. * Checkpoint start-covid-workload! test. * Add covid-workload-request to support unit. * clojure -M:format * Move Rich comment into a unit test. * Respond to comments and tidy up a little * Remove unit test now covered by integration. * Document clojure.test/test-vars to remind me. * Bump url-parse from 1.4.7 to 1.5.1 in /ui (#388) Bumps [url-parse](https://github.com/unshiftio/url-parse) from 1.4.7 to 1.5.1. - [Release notes](https://github.com/unshiftio/url-parse/releases) - [Commits](https://github.com/unshiftio/url-parse/compare/1.4.7...1.5.1) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update db schema again based on lasted discussions. (#387) * Update db schemas. * Make the names more clear. * Address comments. * Switch to SQL syntax from XML. * Address more comments. * one more update. 
* Bump y18n from 4.0.0 to 4.0.1 in /ui (#352) Bumps [y18n](https://github.com/yargs/y18n) from 4.0.0 to 4.0.1. - [Release notes](https://github.com/yargs/y18n/releases) - [Changelog](https://github.com/yargs/y18n/blob/master/CHANGELOG.md) - [Commits](https://github.com/yargs/y18n/commits) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1301] Notify Workload Watchers when "User Visible" Exceptions Occur (#386) RR: https://broadinstitute.atlassian.net/browse/GH-1301 Introduce wfl.util.UserVisibleException - a new type of exception that we should handle to throw and handle errors that users are meant to see. Re-wire the background loop to handle UserVisibleException and also be more exception safe. Start to plumb in email notifications. * Bump hosted-git-info from 2.8.8 to 2.8.9 in /ui (#392) Bumps [hosted-git-info](https://github.com/npm/hosted-git-info) from 2.8.8 to 2.8.9. - [Release notes](https://github.com/npm/hosted-git-info/releases) - [Changelog](https://github.com/npm/hosted-git-info/blob/v2.8.9/CHANGELOG.md) - [Commits](https://github.com/npm/hosted-git-info/compare/v2.8.8...v2.8.9) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump lodash from 4.17.19 to 4.17.21 in /ui (#390) Bumps [lodash](https://github.com/lodash/lodash) from 4.17.19 to 4.17.21. - [Release notes](https://github.com/lodash/lodash/releases) - [Commits](https://github.com/lodash/lodash/compare/4.17.19...4.17.21) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [GH-1311]: Create/Load COVID workloads (#389) RR: https://broadinstitute.atlassian.net/browse/GH-1311 Add suppor…
RR: https://broadinstitute.atlassian.net/browse/GH-1354
Validating snapshotReaders up front guards against errors during
snapshot creation when a user mistypes an email address.
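
As a rough illustration of the idea (not the code in this PR, which lives in WFL's own codebase), the check can be sketched in Python as below. The request shape, the function names, and the deliberately loose regular expression are all assumptions made for the example. The point is only that malformed entries are rejected before any snapshot is created.

```python
import re

# Deliberately loose pattern: exactly one "@", no whitespace, and a dot in
# the domain part. This is an assumption for illustration, not the check
# that WFL itself performs.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def invalid_snapshot_readers(readers):
    """Return the entries that do not look like email addresses."""
    return [
        reader
        for reader in readers
        if not (isinstance(reader, str) and EMAIL_PATTERN.fullmatch(reader))
    ]


def validate_request(request):
    """Reject a (hypothetical) request dict whose snapshotReaders look malformed.

    Failing here, at request time, surfaces a typo immediately rather than
    later when a snapshot is being created.
    """
    bad = invalid_snapshot_readers(request.get("snapshotReaders", []))
    if bad:
        raise ValueError(f"snapshotReaders must be email addresses: {bad}")
    return request
```

Calling `validate_request({"snapshotReaders": ["reader@example.com"]})` returns the request unchanged, while an entry such as `"reader.example.com"` raises a `ValueError` that names the offending value.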