Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minimal implementation of base restart workchain #3748

Merged
merged 4 commits into from
Feb 11, 2020

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Feb 10, 2020

Fixes #3644

This is an alternative PR to #3654 that removes two features that are still contentious:

  • Registering additional process handlers per workchain implementation through entry points
  • Registering additional process handlers per workchain instance launched through inputs

The other PR contained a first implementation of these features but since the provenance is easily lost when using these constructs, we postpone integration until later when we have had more time to investigate them. Since they will be purely additive features they can be added later without affecting the existing interface.

@sphuber sphuber force-pushed the fix_3644_minimal_base_restart branch from 62693d6 to 29b9836 Compare February 10, 2020 13:53
@yakutovicha yakutovicha self-requested a review February 10, 2020 16:29
The calculation job class used exit codes in the range `100 - 199` which
are reserved for scheduler errors. They have been changed to be in the
three hundred range.
@sphuber sphuber force-pushed the fix_3644_minimal_base_restart branch from 29b9836 to 8472e51 Compare February 10, 2020 17:58
Copy link
Contributor

@yakutovicha yakutovicha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @sphuber.

I tested this functionality in the raspa plugin (lsmo-epfl/aiida-raspa#56) - works fine.

I also left some minor comments, suggestions.

.ci/workchains.py Outdated Show resolved Hide resolved
.ci/workchains.py Outdated Show resolved Hide resolved
.github/workflows/setup.sh Outdated Show resolved Hide resolved
aiida/engine/processes/workchains/restart.py Show resolved Hide resolved
aiida/engine/processes/workchains/restart.py Outdated Show resolved Hide resolved
aiida/engine/processes/workchains/restart.py Outdated Show resolved Hide resolved
aiida/engine/processes/workchains/restart.py Outdated Show resolved Hide resolved
aiida/engine/processes/workchains/restart.py Outdated Show resolved Hide resolved
aiida/engine/processes/workchains/utils.py Show resolved Hide resolved
aiida/engine/processes/workchains/utils.py Outdated Show resolved Hide resolved
The `BaseRestartWorkChain` is designed to help the writing of a base
workchain that wraps the launching of a sub process, for example a
calculation job, and provides a framework for easily adding automated
error handling and performing sanity checks.

The class and its utilities were originally implemented in the
`aiida-quantumespresso` plugin but were designed from the get-go to be
generically applicable. It was quickly adopted and used in other
plugins. Therefore we are now moving it to `aiida-core` so that it can
be useful to many plugins without separate copies of the code having to
be maintained. Originally it was only designed to wrap around
calculation jobs, but the concept is generic enough that here it is
generalized to any sub process, such as `WorkChains`.

The basic concept is simple, instead of subclassing from `WorkChain` one
subclasses the `BaseRestartWorkChain` and uses at a minimum the outline:

   cls.setup
   while_(cls.should_run_process)(
       cls.run_process,
       cls.inspect_process,
   ),
   cls.results

The logic of those outline methods is implemented on the base class so
the only thing that remains is to specify the process sub class that
needs to be used. The sub process will be launched until it is sucessful,
i.e. exit status 0, or the maximum number of iterations is exceeded. The
`inspect_process` will loop over the list of registered process handlers.
The handlers can check for errors in case that the sub process failed, or
perform sanity checks even if the sub process was successful. The
handlers can return a `ProcessHandlerReport` to control the further flow,
for example by breaking out of the process handler call loop or even
completely aborting the workchain if an unrecoverable problem was
encountered.
To clarify the `register_process_handler` decorator, the `priority`
argument is made a keyword-only argument such that callers are
forced to specify it. Having just an integer in the method arguments
can be confusing as to its meaning.

This commit also adds type checking to the `priority` keyword.
The `exit_codes` argument takes a single or list of `ExitCode`
instances. If defined, the handler will return `None` if the exit
code set on the `node` does not appear in the `exit_codes`. This is
useful to have a handler called only when the process failed with a
specific (set of) exit code(s).
Copy link
Contributor

@yakutovicha yakutovicha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @sphuber! For me, this is good to go.

@sphuber sphuber merged commit 1ba87c0 into aiidateam:develop Feb 11, 2020
@sphuber sphuber deleted the fix_3644_minimal_base_restart branch February 11, 2020 09:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move BaseRestartWorkChain to aiida-core
2 participants