
[GH-1623] workloads/update-workload! should define own transactions #597

Merged
merged 3 commits into develop from okotsopo/GH-1623-update-workload-defines-tx
Mar 30, 2022

Conversation

okotsopoulos
Contributor

@okotsopoulos okotsopoulos commented Mar 24, 2022

Purpose

As a prerequisite to factoring out external calls from DB transactions, this PR forces workloads/update-workload! implementations to define their own transactions.

With these changes, the decoupling is complete for staged workloads, so a separate PR can move staged workload updates into their own future.

Legacy workload update behavior otherwise remains the same.
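
To sketch the shape of this change (a minimal Clojure illustration, assuming clojure.java.jdbc; wfl-db-config, poll-external-services, and write-statuses! are hypothetical stand-ins, not WFL's actual API):

(require '[clojure.java.jdbc :as jdbc])

;; Before: the caller opens one transaction around the whole update, so
;; external calls (Cromwell, TDR, Slack, ...) run inside an open TX.
(defn update-workload-before! [tx workload]
  ;; hypothetical: polls external services while the caller's TX is open
  (write-statuses! tx workload (poll-external-services workload)))

;; After: each implementation makes external calls with no TX open, then
;; defines its own short-lived transaction for the DB writes.
(defn update-workload-after! [{:keys [id] :as workload}]
  (let [statuses (poll-external-services workload)] ; no TX held here
    (jdbc/with-db-transaction [tx (wfl-db-config)]  ; hypothetical db-spec fn
      (write-statuses! tx workload statuses)
      (workloads/load-workload-for-id tx id))))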

System Tests

Pass:

wm111-e35:wfl okotsopo$ make TARGET=system
export CPCACHE=/Users/okotsopo/wfl/api/.cpcache;            \
	export WFL_WFL_URL=http://localhost:3000; \
	clojure  -M:parallel-test wfl.system.v1-endpoint-test | \
	tee /Users/okotsopo/wfl/derived/api/system.log
WARNING: Use of :paths external to the project has been deprecated, please remove: ../derived/api/src
WARNING: Use of :paths external to the project has been deprecated, please remove: ../derived/api/resources

Ran 33 tests containing 374 assertions.
0 failures, 0 errors.

Review Instructions

I recommend hiding whitespace when reviewing. Wouldn't mind eyes on the code consolidation I did in wfl.server.clj.

@okotsopoulos okotsopoulos marked this pull request as ready for review March 24, 2022 21:16
Contributor

@tbl3rd tbl3rd left a comment


Not in a place where I can run this now.
I worry about fragmenting the update transaction.
Can overlapping updates on a single workload interfere?

(patch-workload tx workload {:updated now})
(when (every? stage/done? [source executor sink])
  (patch-workload tx workload {:finished now}))
(workloads/load-workload-for-id tx id)))]
(if (and started (not finished)) (update! workload (utc-now)) workload)))
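
To make the interference concern concrete, here is a hypothetical interleaving of two updaters, U1 and U2, racing on the same workload once each stage update commits in its own TX:

;; Hypothetical scenario, not observed behavior:
;;
;;   U1: update source   -> TX 1 commits
;;   U2: update source   -> TX 2 commits, repeating U1's work
;;   U1: update executor -> TX 3 commits
;;   U2: update executor -> TX 4 may retry workflows U1 just handled
;;
;; With no single TX spanning the whole update, the database no longer
;; serializes concurrent updates of one workload, so correctness leans
;; on the per-stage updates being idempotent.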
Contributor

It isn't new in this PR, but I noticed that update-sink! and patch-workload run in separate TXs?
I guess workload state will converge. (Batch updates are simpler here.)

Contributor Author

Yes, all of the individual stages define transactions as needed to avoid external calls within open transactions.

But yes, there are thread-safety concerns around retries:

https://broadinstitute.atlassian.net/browse/GH-1423

(catch Throwable t
  (log/error "Failed to update workload"
             :workload (workloads/to-log workload)
             :workload (workloads/to-log workload-record)
             :throwable t))))
Contributor

Now the initial load and the update run in separate TXs.
Would it be safer to run the load in the update's transaction?
(Not sure how those overlap in the schema.)

I prefer a separate backstop try to the nesting, but that looks correct.
I like the workload-record distinction.
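
A sketch of that suggestion, assuming clojure.java.jdbc (update! and wfl-db-config are hypothetical stand-ins):

(jdbc/with-db-transaction [tx (wfl-db-config)]
  (let [workload (workloads/load-workload-for-id tx id)] ; load ...
    (update! tx workload)))                              ; ... and update in one TX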

Contributor Author

Yeah, good points all around.

I will see about punting the workload load to the update transactions. Originally I thought I needed the loaded workload to Slack on UserExceptions, but the workload record is sufficient.

I was on the fence about consolidating the backstops anyway (something about try-within-a-try is hard to track), so I will un-nest.
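
Roughly, the un-nesting under discussion, as a sketch (do-update! and notify-watchers! are hypothetical stand-ins; UserException is the project's exception type mentioned above):

;; Before: nested backstop; the inner try's flow is easy to lose.
(try
  (try
    (do-update! workload-record)
    (catch UserException e (notify-watchers! e)))
  (catch Throwable t
    (log/error "Failed to update workload" :throwable t)))

;; After: one flat try. Clojure tries catch clauses in order, so the
;; UserException handler still runs before the Throwable backstop.
(try
  (do-update! workload-record)
  (catch UserException e (notify-watchers! e))
  (catch Throwable t
    (log/error "Failed to update workload"
               :workload (workloads/to-log workload-record)
               :throwable t)))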

Contributor

I can manage try-within-try on a good day, but not every day.

Contributor Author

Workload loading now takes place in the update's transactions, and try-within-a-try is no more.

- Load workload within workloads/update-workload! implementation transactions
- Unnest the try-within-a-try in server/try-update
(load-workload))]
(if (and started (not finished))
  (update! (load-workload))
  (load-workload)))))
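
For context, a hypothetical reconstruction of the surrounding form, inferred from the fragment above rather than the exact source (wfl-db-config and update! are stand-ins):

(jdbc/with-db-transaction [tx (wfl-db-config)]
  (letfn [(load-workload [] (workloads/load-workload-for-id tx id))]
    (let [{:keys [started finished]} (load-workload)]
      (if (and started (not finished))
        (update! (load-workload))
        (load-workload)))))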
Contributor

Thanks.

@okotsopoulos okotsopoulos merged commit d3d3f16 into develop Mar 30, 2022
@okotsopoulos okotsopoulos deleted the okotsopo/GH-1623-update-workload-defines-tx branch March 30, 2022 20:00
okotsopoulos added a commit that referenced this pull request Apr 1, 2022
GH-1623 workloads/update-workload! should define own transactions (#597)
GH-1640 Broken Terra Submission links in completed workflow Slack messages (#598)
GH-1639 Pin jinja2 version for mkdocks compatibility (#596)
GH-1433 No TDR temporary snapshots in tests (#595)
GH-1619 Add workload info to sink logs (#594)
GH-1638 Disable link unfurling when Slacking watchers (#593)
okotsopoulos added a commit that referenced this pull request Apr 4, 2022
GH-1623 workloads/update-workload! should define own transactions (#597)
GH-1640 Broken Terra Submission links in completed workflow Slack messages (#598)
GH-1639 Pin jinja2 version for mkdocks compatibility (#596)
GH-1433 No TDR temporary snapshots in tests (#595)
GH-1619 Add workload info to sink logs (#594)
GH-1638 Disable link unfurling when Slacking watchers (#593)