Implement targeted "overrides" of requirements on specific tools #440

Merged
merged 18 commits into from
Nov 21, 2017
2 changes: 1 addition & 1 deletion Makefile
@@ -89,7 +89,7 @@ pydocstyle_report.txt: $(PYSOURCES)
	pydocstyle setup.py $^ > pydocstyle_report.txt 2>&1 || true

diff_pydocstyle_report: pydocstyle_report.txt
	diff-quality --violations=pep8 $^
	diff-quality --violations=pycodestyle $^

## autopep8 : fix most Python code indentation and formatting
autopep8: $(PYSOURCES)
74 changes: 59 additions & 15 deletions README.rst
@@ -106,6 +106,21 @@ and ``--tmp-outdir-prefix`` to somewhere under ``/Users``::
.. |Build Status| image:: https://ci.commonwl.org/buildStatus/icon?job=cwltool-conformance
   :target: https://ci.commonwl.org/job/cwltool-conformance/

Running user-space implementations of Docker
--------------------------------------------

Some compute environments disallow user-space installation of Docker due to incompatibilities in libraries or to meet security requirements. The CWL reference implementation supports using a user-space implementation with the ``--user-space-docker-cmd`` option.

Example using ``dx-docker`` (https://wiki.dnanexus.com/Developer-Tutorials/Using-Docker-Images):

For use on Linux, install the DNAnexus toolkit (see https://wiki.dnanexus.com/Downloads for instructions).

Run ``cwltool`` just as you normally would, but with the new option, e.g. from the conformance tests:

.. code:: bash

  cwltool --user-space-docker-cmd=dx-docker --outdir=/tmp/tmpidytmp v1.0/test-cwl-out2.cwl v1.0/empty.json

Tool or workflow loading from remote or local locations
-------------------------------------------------------

@@ -369,6 +384,50 @@ at the following links:
- `Specifications - Implementation <https://github.com/galaxyproject/galaxy/commit/81d71d2e740ee07754785306e4448f8425f890bc>`__
- `Initial cwltool Integration Pull Request <https://github.com/common-workflow-language/cwltool/pull/214>`__

Overriding workflow requirements at load time
---------------------------------------------

Sometimes a workflow needs additional requirements to run in a particular
environment or with a particular dataset. To avoid the need to modify the
underlying workflow, cwltool supports requirement "overrides".

The format of the "overrides" object is a mapping of item identifier (workflow,
workflow step, or command line tool) followed by a list of ProcessRequirements

Member: Can you add an example of an identifier for a workflow step?

Member Author: Added some discussion. Also made this more complex (but hopefully more useful) by resolving workflow IDs relative to the workflow, and everything else relative to the job/--overrides document.

that should be applied.

.. code:: yaml

  cwltool:overrides:
    echo.cwl:
      - class: EnvVarRequirement
        envDef:
          MESSAGE: override_value


Overrides can be specified either on the command line or as part of the job
input document. Workflow steps are identified using the name of the workflow
file followed by the step name as a document fragment identifier "#id".
Override identifiers are relative to the top-level workflow document.

.. code:: bash

  cwltool --overrides overrides.yml my-tool.cwl my-job.yml

.. code:: yaml

  input_parameter1: value1
  input_parameter2: value2
  cwltool:overrides:
    workflow.cwl#step1:
      - class: EnvVarRequirement
        envDef:
          MESSAGE: override_value

.. code:: bash

  cwltool my-tool.cwl my-job-with-overrides.yml


CWL Tool Control Flow
---------------------

@@ -500,18 +559,3 @@ logger_handler
logging.Handler

Handler object for logging.

Running user-space implementations of Docker
--------------------------------------------

Some compute environments disallow user-space installation of Docker due to incompatibilities in libraries or to meet security requirements. The CWL reference supports using a user space implementation with the `--user-space-docker-cmd` option.

Example using `dx-docker` (https://wiki.dnanexus.com/Developer-Tutorials/Using-Docker-Images):

For use on Linux, install the DNAnexus toolkit (see https://wiki.dnanexus.com/Downloads for instructions).

Run `cwltool` just as you normally would, but with the new option, e.g. from the conformance tests:

```
cwltool --user-space-docker-cmd=dx-docker --outdir=/tmp/tmpidytmp v1.0/test-cwl-out2.cwl v1.0/empty.json
```
101 changes: 79 additions & 22 deletions cwltool/load_tool.py
@@ -8,7 +8,8 @@
import uuid
import hashlib
import json
from typing import Any, Callable, Dict, List, Text, Tuple, Union, cast
import copy
from typing import Any, Callable, Dict, List, Text, Tuple, Union, cast, Iterable

import requests.sessions
from six import itervalues, string_types
@@ -23,19 +24,65 @@

from . import process, update
from .errors import WorkflowException
from .process import Process, shortname
from .process import Process, shortname, get_schema
from .update import ALLUPDATES

_logger = logging.getLogger("cwltool")

jobloaderctx = {
    u"cwl": "https://w3id.org/cwl/cwl#",
    u"cwltool": "http://commonwl.org/cwltool#",
    u"path": {u"@type": u"@id"},
    u"location": {u"@type": u"@id"},
    u"format": {u"@type": u"@id"},
    u"id": u"@id"
}


overrides_ctx = {
    u"overrideTarget": {u"@type": u"@id"},
    u"cwltool": "http://commonwl.org/cwltool#",
    u"overrides": {
        "@id": "cwltool:overrides",
        "mapSubject": "overrideTarget",
        "mapPredicate": "override"
    },
    u"override": {
        "@id": "cwltool:override",
        "mapSubject": "class"
    }
} # type: Dict[Text, Union[Dict[Any, Any], Text, Iterable[Text]]]

def resolve_tool_uri(argsworkflow, # type: Text
                     resolver=None, # type: Callable[[Loader, Union[Text, Dict[Text, Any]]], Text]
                     fetcher_constructor=None,
                     # type: Callable[[Dict[Text, Text], requests.sessions.Session], Fetcher]
                     document_loader=None # type: Loader
                     ):
    # type: (...) -> Tuple[Text, Text]

    uri = None # type: Text
    split = urllib.parse.urlsplit(argsworkflow)
    # In case of Windows path, urlsplit misjudge Drive letters as scheme, here we are skipping that
    if split.scheme and split.scheme in [u'http',u'https',u'file']:
        uri = argsworkflow
    elif os.path.exists(os.path.abspath(argsworkflow)):
        uri = file_uri(str(os.path.abspath(argsworkflow)))
    elif resolver:
        if document_loader is None:
            document_loader = Loader(jobloaderctx, fetcher_constructor=fetcher_constructor) # type: ignore
        uri = resolver(document_loader, argsworkflow)

    if uri is None:
        raise ValidationException("Not found: '%s'" % argsworkflow)

    if argsworkflow != uri:
        _logger.info("Resolved '%s' to '%s'", argsworkflow, uri)

    fileuri = urllib.parse.urldefrag(uri)[0]
    return uri, fileuri
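
The scheme check and fragment split in `resolve_tool_uri` can be tried in isolation with the standard library. The following is a minimal sketch, not cwltool code: the helper names `looks_like_uri` and `split_fragment` and the example paths are hypothetical, and it uses Python 3's `urllib.parse` directly.

```python
from urllib.parse import urldefrag, urlsplit

# Schemes that resolve_tool_uri accepts as real URIs (taken from the diff above).
KNOWN_SCHEMES = ("http", "https", "file")


def looks_like_uri(ref):
    """Return True only when the reference uses one of the known schemes."""
    return urlsplit(ref).scheme in KNOWN_SCHEMES


def split_fragment(uri):
    """Return (uri, uri without '#fragment'), like the (uri, fileuri) pair."""
    return uri, urldefrag(uri)[0]


# Hypothetical inputs for illustration only.
windows_path = r"C:\work\tool.cwl"
drive_scheme = urlsplit(windows_path).scheme  # the drive letter is parsed as a scheme
uri, fileuri = split_fragment("file:///work/workflow.cwl#step1")
```

Restricting the check to a fixed set of schemes means a drive letter parsed as a scheme falls through to the local-path branch instead of being treated as a remote location.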


def fetch_document(argsworkflow, # type: Union[Text, Dict[Text, Any]]
                   resolver=None, # type: Callable[[Loader, Union[Text, Dict[Text, Any]]], Text]
                   fetcher_constructor=None
@@ -49,22 +96,7 @@ def fetch_document(argsworkflow, # type: Union[Text, Dict[Text, Any]]
    uri = None # type: Text
    workflowobj = None # type: CommentedMap
    if isinstance(argsworkflow, string_types):
        split = urllib.parse.urlsplit(argsworkflow)
        # In case of Windows path, urlsplit misjudge Drive letters as scheme, here we are skipping that
        if split.scheme and split.scheme in [u'http',u'https',u'file']:
            uri = argsworkflow
        elif os.path.exists(os.path.abspath(argsworkflow)):
            uri = file_uri(str(os.path.abspath(argsworkflow)))
        elif resolver:
            uri = resolver(document_loader, argsworkflow)

        if uri is None:
            raise ValidationException("Not found: '%s'" % argsworkflow)

        if argsworkflow != uri:
            _logger.info("Resolved '%s' to '%s'", argsworkflow, uri)

        fileuri = urllib.parse.urldefrag(uri)[0]
        uri, fileuri = resolve_tool_uri(argsworkflow, resolver=resolver, document_loader=document_loader)
        workflowobj = document_loader.fetch(fileuri)
    elif isinstance(argsworkflow, dict):
        uri = "#" + Text(id(argsworkflow))
@@ -139,8 +171,9 @@ def validate_document(document_loader, # type: Loader
                      strict=True, # type: bool
                      preprocess_only=False, # type: bool
                      fetcher_constructor=None,
                      skip_schemas=None
                      skip_schemas=None,
                      # type: Callable[[Dict[Text, Text], requests.sessions.Session], Fetcher]
                      overrides=None # type: List[Dict]
                      ):
    # type: (...) -> Tuple[Loader, Names, Union[Dict[Text, Any], List[Dict[Text, Any]]], Dict[Text, Any], Text]
    """Validate a CWL document."""
@@ -155,9 +188,15 @@ def validate_document(document_loader, # type: Loader

    jobobj = None
    if "cwl:tool" in workflowobj:
        jobobj, _ = document_loader.resolve_all(workflowobj, uri)
        job_loader = Loader(jobloaderctx, fetcher_constructor=fetcher_constructor) # type: ignore
        jobobj, _ = job_loader.resolve_all(workflowobj, uri)
        uri = urllib.parse.urljoin(uri, workflowobj["https://w3id.org/cwl/cwl#tool"])
        del cast(dict, jobobj)["https://w3id.org/cwl/cwl#tool"]

        if "http://commonwl.org/cwltool#overrides" in jobobj:

Member: I think this is over-indented

            overrides.extend(resolve_overrides(jobobj, uri, uri))
            del jobobj["http://commonwl.org/cwltool#overrides"]

        workflowobj = fetch_document(uri, fetcher_constructor=fetcher_constructor)[1]

    fileuri = urllib.parse.urldefrag(uri)[0]
@@ -225,6 +264,9 @@ def validate_document(document_loader, # type: Loader
    if jobobj:
        metadata[u"cwl:defaults"] = jobobj

    if overrides:
        metadata[u"cwltool:overrides"] = overrides

    return document_loader, avsc_names, processobj, metadata, uri
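
In the hunks shown, `overrides` defaults to `None` in `validate_document` and is later extended in place. Unless lines outside the visible hunks normalise it, a caller that omits the argument while the job document carries `cwltool:overrides` could hit an `AttributeError`. The following is a defensive sketch only; the helper name `collect_overrides` is hypothetical and not part of cwltool.

```python
def collect_overrides(overrides, job_overrides):
    """Combine caller-supplied overrides with those found in the job document.

    `overrides` may be None (the default in the signature above); the result is
    always a new list, so the caller's list is not mutated.
    """
    merged = list(overrides) if overrides is not None else []
    merged.extend(job_overrides)
    return merged


# Illustrative values only.
from_job = [{"overrideTarget": "file:///work/echo.cwl", "override": []}]
caller_list = []
merged_none = collect_overrides(None, from_job)
merged_list = collect_overrides(caller_list, from_job)
```

Copying rather than extending in place is a design choice of this sketch; the code in the diff extends the caller's list directly.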


@@ -277,14 +319,29 @@ def load_tool(argsworkflow, # type: Union[Text, Dict[Text, Any]]
              enable_dev=False, # type: bool
              strict=True, # type: bool
              resolver=None, # type: Callable[[Loader, Union[Text, Dict[Text, Any]]], Text]
              fetcher_constructor=None # type: Callable[[Dict[Text, Text], requests.sessions.Session], Fetcher]
              fetcher_constructor=None, # type: Callable[[Dict[Text, Text], requests.sessions.Session], Fetcher]
              overrides=None
              ):
    # type: (...) -> Process

    document_loader, workflowobj, uri = fetch_document(argsworkflow, resolver=resolver,
                                                       fetcher_constructor=fetcher_constructor)
    document_loader, avsc_names, processobj, metadata, uri = validate_document(
        document_loader, workflowobj, uri, enable_dev=enable_dev,
        strict=strict, fetcher_constructor=fetcher_constructor)
        strict=strict, fetcher_constructor=fetcher_constructor,
        overrides=overrides)
    return make_tool(document_loader, avsc_names, metadata, uri,
                     makeTool, kwargs if kwargs else {})

def resolve_overrides(ov, ov_uri, baseurl): # type: (CommentedMap, Text, Text) -> List[Dict[Text, Any]]
    ovloader = Loader(overrides_ctx)
    ret, _ = ovloader.resolve_all(ov, baseurl)
    if not isinstance(ret, CommentedMap):
        raise Exception("Expected CommentedMap, got %s" % type(ret))
    cwl_docloader = get_schema("v1.0")[0]
    cwl_docloader.resolve_all(ret, ov_uri)
    return ret["overrides"]

def load_overrides(ov, base_url): # type: (Text, Text) -> List[Dict[Text, Any]]
    ovloader = Loader(overrides_ctx)
    return resolve_overrides(ovloader.fetch(ov), ov, base_url)
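
The `mapSubject`/`mapPredicate` entries in `overrides_ctx` suggest that the map written in the README is expanded into a list of records, each with an `overrideTarget` and an `override` list. The following plain-Python sketch illustrates that shape under stated assumptions: it does not call schema-salad, the function names `expand_requirements` and `expand_overrides` are hypothetical, and the base URL is an example value passed in by the caller.

```python
from urllib.parse import urljoin


def expand_requirements(reqs):
    """Accept a list of requirements, or a map keyed by class, and return a list."""
    if isinstance(reqs, dict):
        # Map form: the key becomes the "class" field (mapSubject: class).
        return [{"class": cls, **(body or {})} for cls, body in reqs.items()]
    return [dict(r) for r in reqs]


def expand_overrides(overrides_map, base_url):
    """Turn {target: requirements} into a list of override records.

    Each target identifier is resolved against base_url, so a relative
    "workflow.cwl#step1" becomes an absolute URI with a fragment.
    """
    return [
        {
            "overrideTarget": urljoin(base_url, target),
            "override": expand_requirements(reqs),
        }
        for target, reqs in overrides_map.items()
    ]


# Example taken from the README hunk; the base URL is an assumption.
job_overrides = {
    "workflow.cwl#step1": [
        {"class": "EnvVarRequirement", "envDef": {"MESSAGE": "override_value"}}
    ]
}
records = expand_overrides(job_overrides, "file:///work/my-job.yml")
```

Which base URL applies to a given identifier is decided by the loader code above; the sketch leaves it as a parameter rather than guessing.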