Commit 444d892

Author: Peter Amstutz
Expanded discussion of File and Directory types.
1 parent a486e91 commit 444d892

3 files changed, +129 -5 lines changed

v1.0/CommandLineTool.yml

Lines changed: 6 additions & 3 deletions
@@ -80,12 +80,15 @@ $graph:
  Post v1.0 release changes to the spec.

  * 13 July 2016: Mark `baseCommand` as optional and update descriptive text.
- * 12 March 2017: (v1.0.1)
+ * 12 March 2017:
    * Mark `default` as not required for link checking.
-   * Add note that recursive subworkflows are not allowed.
    * Add note that files in InitialWorkDir must have path in output directory.
    * Add note that writable: true applies recursively.
-   * Fix mistake in discussion of extracting field names from workflow step ids.
+ * 21 July 2017:
+   * Add clarification about scattering over empty arrays.
+   * Clarify interpretation of secondaryFiles on inputs.
+ * 22 July 2017: (v1.0.1)
+   * Expanded discussion of semantics of File and Directory types

  ## Purpose

v1.0/Process.yml

Lines changed: 109 additions & 2 deletions
@@ -53,9 +53,73 @@ $graph:
  type: record
  docParent: "#CWLType"
  doc: |
-   Represents a file (or group of files if `secondaryFiles` is specified) that
-   must be accessible by tools using standard POSIX file system call API such as
+   Represents a file (or group of files when `secondaryFiles` is provided) that
+   will be accessible by tools using standard POSIX file system call API such as
    open(2) and read(2).
+
+   Files are represented as objects with `class` of `File`. File objects have
+   a number of properties that provide metadata about the file.
+
+   The `location` property of a File is a URI that uniquely identifies the
+   file. Implementations must support the file:// URI scheme and may support
+   other schemes such as http://. The value of `location` may also be a
+   relative reference, in which case it must be resolved relative to the URI
+   of the document it appears in. Alternately to `location`, implementations
+   must also accept the `path` property on File, which must be a filesystem
+   path available on the same host as the CWL runner (for inputs) or the
+   runtime environment of a command line tool execution (for command line tool
+   outputs).
+
+   If no `location` or `path` is specified, a file object must specify
+   `contents` with the UTF-8 text content of the file. This is a "file
+   literal". File literals do not correspond to external resources, but are
+   created on disk with `contents` when needed for executing a tool.
+   Where appropriate, expressions can return file literals to define new files
+   on a runtime. The maximum size of `contents` is 64 kilobytes.
+
+   The `basename` property defines the filename on disk where the file is
+   staged. This may differ from the resource name. If not provided,
+   `basename` must be computed from the last path part of `location` and made
+   available to expressions.
+
+   The `secondaryFiles` property is a list of File or Directory objects that
+   must be staged in the same directory as the primary file. It is an error
+   for file names to be duplicated in `secondaryFiles`.
+
+   The `size` property is the size in bytes of the File. It must be computed
+   from the resource and made available to expressions. The `checksum` field
+   contains a cryptographic hash of the file content for use in verifying file
+   contents. Implementations may, at user option, enable or disable
+   computation of the `checksum` field for performance or other reasons.
+   However, the ability to compute output checksums is required to pass the
+   CWL conformance test suite.
+
+   When executing a CommandLineTool, the files and secondary files may be
+   staged to an arbitrary directory, but must use the value of `basename` for
+   the filename. The `path` property must be a file path in the context of the
+   tool execution runtime (local to the compute node, or within the executing
+   container). All computed properties should be available to expressions.
+   File literals also must be staged and `path` must be set.
+
+   When collecting CommandLineTool outputs, `glob` matching returns file paths
+   (with the `path` property) and the derived properties. This can all be
+   modified by `outputEval`. Alternately, if the file `cwl.outputs.json` is
+   present in the output, `outputBinding` is ignored.
+
+   File objects in the output must provide either a `location` URI or a `path`
+   property in the context of the tool execution runtime (local to the compute
+   node, or within the executing container).
+
+   When evaluating an ExpressionTool, file objects must be referenced via
+   `location` (the expression tool does not have access to files on disk so
+   `path` is meaningless) or as file literals. It is legal to return a file
+   object with an existing `location` but a different `basename`. The
+   `loadContents` field of ExpressionTool inputs behaves the same as on
+   CommandLineTool inputs, however it is not meaningful on the outputs.
+
+   An ExpressionTool may forward file references from input to output by using
+   the same value for `location`.
+
  fields:
    - name: class
      type:
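
Reviewer note: the rules this hunk adds for `location`, `basename`, and file literals can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from any CWL runner. The names `normalize_file` and `MAX_CONTENTS_BYTES` are hypothetical, "64 kilobytes" is read here as 65536 bytes of UTF-8, and objects that carry only `path` are passed through untouched because the text above defines the `basename` default in terms of `location`.

```python
import posixpath
from urllib.parse import unquote, urljoin, urlsplit

# Assumption: "64 kilobytes" is interpreted as 64 * 1024 bytes of UTF-8 text.
MAX_CONTENTS_BYTES = 64 * 1024


def normalize_file(obj, document_uri):
    """Return a copy of a File object with `location` resolved and `basename` set.

    `document_uri` is the URI of the document the File object appears in.
    """
    if obj.get("class") != "File":
        raise ValueError("expected an object with class: File")
    out = dict(obj)

    if "location" not in out and "path" not in out:
        # File literal: `contents` is mandatory and size-limited.
        if "contents" not in out:
            raise ValueError("File needs `location`, `path`, or `contents`")
        if len(out["contents"].encode("utf-8")) > MAX_CONTENTS_BYTES:
            raise ValueError("`contents` exceeds 64 kilobytes")
        return out

    if "location" in out:
        # A relative reference is resolved against the enclosing document's URI;
        # an absolute URI is returned unchanged by urljoin.
        out["location"] = urljoin(document_uri, out["location"])
        if "basename" not in out:
            # Default basename: the last path part of `location`.
            path = urlsplit(out["location"]).path
            out["basename"] = posixpath.basename(unquote(path))
    return out


example = normalize_file(
    {"class": "File", "location": "data/reads.fastq"},
    "file:///home/user/job.yml",
)
print(example["location"], example["basename"])
```

An explicitly supplied `basename` is kept as is, which matches the statement that the staged filename may differ from the resource name.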
@@ -221,6 +285,49 @@ $graph:
  docAfter: "#File"
  doc: |
    Represents a directory to present to a command line tool.
+
+   Directories are represented as objects with `class` of `Directory`. Directory objects have
+   a number of properties that provide metadata about the directory.
+
+   The `location` property of a Directory is a URI that uniquely identifies
+   the directory. Implementations must support the file:// URI scheme and may
+   support other schemes such as http://. Alternately to `location`,
+   implementations must also accept the `path` property on Directory, which
+   must be a filesystem path available on the same host as the CWL runner (for
+   inputs) or the runtime environment of a command line tool execution (for
+   command line tool outputs).
+
+   A Directory object may have a `listing` field. This is a list of File and
+   Directory objects that are contained in the Directory. For each entry in
+   `listing`, the `basename` property defines the name of the File or
+   Subdirectory when staged to disk. If `listing` is not provided, the
+   implementation must have some way of fetching the Directory listing at
+   runtime based on the `location` field.
+
+   If a Directory does not have `location`, it is a Directory literal. A
+   Directory literal must provide `listing`. Directory literals must be
+   created on disk at runtime as needed.
+
+   The resources in a Directory literal do not need to have any implied
+   relationship in their `location`. For example, a Directory listing may
+   contain two files located on different hosts. It is the responsibility of
+   the runtime to ensure that those files are staged to disk appropriately.
+   Secondary files associated with files in `listing` must also be staged to
+   the same Directory.
+
+   When executing a CommandLineTool, Directories must be recursively staged
+   first and have local values of `path` assigned.
+
+   Directory objects in CommandLineTool output must provide either a
+   `location` URI or a `path` property in the context of the tool execution
+   runtime (local to the compute node, or within the executing container).
+
+   An ExpressionTool may forward file references from input to output by using
+   the same value for `location`.
+
+   Name conflicts (the same `basename` appearing multiple times in `listing`
+   or in any entry in `secondaryFiles` in the listing) are a fatal error.
+
  fields:
    - name: class
      type:
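
Reviewer note: the name-conflict rule for Directory listings in this hunk can be illustrated with a short Python check. This is a sketch under assumptions rather than any runner's actual implementation. The function name `check_listing` is hypothetical, and every entry is assumed to already carry a `basename`. Secondary files are counted in the same namespace as the listing because the text above says they are staged to the same Directory.

```python
def check_listing(directory):
    """Raise ValueError if two entries would be staged under the same name.

    Assumes each File or Directory entry already has a `basename`.
    """
    seen = set()
    for entry in directory.get("listing", []):
        # Secondary files are staged next to their primary file, so they
        # share the namespace of the enclosing Directory.
        staged = [entry] + list(entry.get("secondaryFiles", []))
        for item in staged:
            name = item["basename"]
            if name in seen:
                raise ValueError(f"name conflict in listing: {name!r}")
            seen.add(name)
        for item in staged:
            if item.get("class") == "Directory":
                # A nested Directory has its own namespace; check it separately.
                check_listing(item)


# A Directory literal whose entries stage without conflicts.
check_listing({
    "class": "Directory",
    "basename": "inputs",
    "listing": [
        {
            "class": "File",
            "basename": "a.bam",
            "secondaryFiles": [{"class": "File", "basename": "a.bam.bai"}],
        },
        {
            "class": "Directory",
            "basename": "sub",
            "listing": [{"class": "File", "basename": "a.bam"}],
        },
    ],
})
```

The same `basename` may appear again inside a subdirectory, since only entries staged into the same directory can collide.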

v1.0/Workflow.yml

Lines changed: 14 additions & 0 deletions
@@ -66,6 +66,20 @@ $graph:
```
  * The common field `description` has been renamed to `doc`.

+ ## Errata
+
+ Post v1.0 release changes to the spec.
+
+ * 12 March 2017:
+   * Mark `default` as not required for link checking.
+   * Add note that recursive subworkflows are not allowed.
+   * Fix mistake in discussion of extracting field names from workflow step ids.
+ * 21 July 2017:
+   * Add clarification about scattering over empty arrays.
+   * Clarify interpretation of secondaryFiles on inputs.
+ * 22 July 2017: (v1.0.1)
+   * Expanded discussion of semantics of File and Directory types
+
  ## Purpose

  The Common Workflow Language Command Line Tool Description express
