♻️ Allow for file uploads/downloads to be async #6079
Closed
Note: this PR currently depends on aiidateam/plumpy#272.
Currently, a possible bottleneck for workers (running potentially 1000s of processes asynchronously) is the upload/retrieval of calculation input/output data from an external compute resource (e.g. HPC).
This runs in a blocking manner, i.e. all other async tasks have to wait until all inputs or outputs have been fully uploaded/retrieved.
This could be made asynchronous, either at the "file level" - relinquishing control to the event loop after each file upload/download, or even at the "byte level" - relinquishing control after each "chunk of a file" has been uploaded/downloaded.
(For other transports, such as FirecREST, there are further aspects of async to consider.)
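To make the "file level" versus "byte level" distinction concrete, here is a minimal sketch using plain `asyncio`. It is not code from this PR or from any existing transport: the function names, the in-memory `FILES` mapping and the tiny `CHUNK_SIZE` are all hypothetical, and a dictionary assignment stands in for the real (blocking) transfer.

```python
import asyncio

# Hypothetical in-memory "files"; a real transport would read from disk or a remote.
FILES = {"a.in": b"0123456789", "b.in": b"abcdef"}
CHUNK_SIZE = 4  # deliberately tiny, for illustration only


async def copy_file_level(files, dest):
    """Yield to the event loop once after each whole file."""
    for name, data in files.items():
        dest[name] = data  # stands in for a blocking transfer of one file
        await asyncio.sleep(0)  # relinquish control to the event loop


async def copy_byte_level(files, dest, chunk_size=CHUNK_SIZE):
    """Yield to the event loop after each chunk of each file."""
    for name, data in files.items():
        buffer = bytearray()
        for start in range(0, len(data), chunk_size):
            buffer += data[start:start + chunk_size]  # transfer one chunk
            await asyncio.sleep(0)  # relinquish control to the event loop
        dest[name] = bytes(buffer)


async def other_task(ticks, count):
    """Some unrelated task that only progresses when the copies yield."""
    for i in range(count):
        ticks.append(i)
        await asyncio.sleep(0)


async def demo():
    dest_file, dest_chunk, ticks = {}, {}, []
    await asyncio.gather(
        copy_file_level(FILES, dest_file),
        copy_byte_level(FILES, dest_chunk),
        other_task(ticks, 3),
    )
    return dest_file, dest_chunk, ticks


dest_file, dest_chunk, ticks = asyncio.run(demo())
```

Both variants produce the same copies; they differ only in how often other tasks on the event loop get a chance to run while a transfer is in progress.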
This particular PR does not actually implement any async behaviour for uploads/downloads; it merely modifies the engine API to allow implementations of the Transport interface to achieve this.
The PR changes the following functions/methods to async:
execmanager.upload_calculation
execmanager.retrieve_calculation
execmanager.retrieve_files_from_list
CalcJob.run
CalcJob._perform_dry_run
CalcJob._perform_import
However, none of these is intended for direct use by users, hence I would suggest the change is backwards compatible.
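For anyone who does call these functions directly, the practical effect of a sync-to-async change can be sketched as follows. The two functions below are hypothetical stand-ins, not the actual `execmanager` signatures; the point is only that calling a coroutine function returns a coroutine object that must be awaited (or run on an event loop) before it yields a result.

```python
import asyncio
import inspect


def retrieve_sync(files):
    """Hypothetical stand-in for a function before the change."""
    return sorted(files)


async def retrieve_async(files):
    """Hypothetical stand-in for the same function after the change."""
    await asyncio.sleep(0)  # a point where other tasks may run
    return sorted(files)


# Before: a plain call returns the result directly.
result_sync = retrieve_sync(["b.out", "a.out"])

# After: a plain call only creates a coroutine object...
pending = retrieve_async(["b.out", "a.out"])
is_coroutine = inspect.iscoroutine(pending)

# ...which has to be awaited, or driven by an event loop, to get the result.
result_async = asyncio.run(pending)
```

Code already running inside a coroutine would use `await retrieve_async(...)` instead of `asyncio.run`.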