Commit 6a67f62

Merge pull request #27 from common-workflow-language/custom-types
Custom Types Lesson
2 parents ab19fc7 + 3543b25 commit 6a67f62

5 files changed

+214
-0
lines changed

_episodes/19-custom-types.md

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
1+
---
2+
title: "Custom Types"
3+
teaching: 10
4+
exercises: 0
5+
questions:
6+
- "How do I create and import my own custom types into a CWL description?"
7+
objectives:
8+
- "Learn how to write custom CWL object types."
9+
- "Learn how to import these custom objects into a tool description."
10+
keypoints:
11+
- "You can create your own custom types to load into descriptions."
12+
- "These custom types allow the user to configure the behaviour of a tool
13+
without tinkering directly with the tool description."
14+
- "Custom types are described in separate YAML files and imported as needed."
15+
---
16+
17+
Sometimes you may want to write your own custom types for use and reuse in CWL
18+
descriptions. Use of such custom types can reduce redundancy between multiple
19+
descriptions that all use the same type, and also allow for additional
20+
customisation/configuration of a tool/analysis without the need to fiddle with
21+
the CWL description directly.
22+
23+
The example below is a CWL description of the [InterProScan][ips] tool for
24+
simultaneously searching protein sequences against a wide variety of resources.
25+
It demonstrates a number of good practices in CWL.
26+
27+
*custom-types.cwl*
28+
29+
~~~
30+
{% include cwl/custom-types.cwl %}
31+
~~~
32+
{: .source}
33+
34+
*custom-types.yml*
35+
36+
~~~
37+
{% include cwl/custom-types.yml %}
38+
~~~
39+
{: .source}
40+
41+
On line 34, in `inputs:applications`, a list of applications to be used in the
42+
search is imported as a custom object:
43+
44+
```
45+
inputs:
46+
proteinFile:
47+
type: File
48+
inputBinding:
49+
prefix: --input
50+
applications:
51+
type: InterProScan-apps.yml#apps[]?
52+
inputBinding:
53+
itemSeparator: ','
54+
prefix: --applications
55+
```
56+
{: .source}
57+
58+
The reference to a custom type is a combination of the name of the file in which
59+
the object is defined (`InterProScan-apps.yml`) and the name of the object
60+
within that file (`apps`) that defines the custom type. The square brackets `[]`
61+
mean that an array of the preceding type is expected, in this case the `apps`
62+
type from the imported `InterProScan-apps.yml` file. The trailing `?` marks the input as optional.
63+
64+
The contents of the YAML file describing the custom type are given below:
65+
66+
~~~
67+
{% include cwl/InterProScan-apps.yml %}
68+
~~~
69+
{: .source}
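
As a minimal sketch, a job file that also selects applications could look like the following. The choice of `Pfam` and `SMART` is only illustrative; any symbols from the `apps` enum should be accepted, while a value outside that list is expected to fail validation.

```yaml
# Hypothetical job file: selects two applications from the `apps` enum.
proteinFile:
  class: File
  path: test_proteins.fasta
applications:   # optional array input (`apps[]?`), so this key may be omitted
  - Pfam
  - SMART
```
{: .source}

With the `itemSeparator: ','` binding on `applications`, a runner would be expected to pass these values as `--applications Pfam,SMART`.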
70+
71+
In order for the custom type to be used in the CWL description, it must be
72+
imported. Imports are described in `requirements:SchemaDefRequirement`, as
73+
below in the example `custom-types.cwl` description:
74+
75+
```
76+
requirements:
77+
ResourceRequirement:
78+
ramMin: 10240
79+
coresMin: 3
80+
SchemaDefRequirement:
81+
types:
82+
- $import: InterProScan-apps.yml
83+
```
84+
{: .source}
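
The same pattern should extend to more than one type file. The sketch below assumes a second, hypothetical file `InterProScan-formats.yml` that defines an enum named `formats`; neither that file nor the `outputFormat` input is part of the example description.

```yaml
# Sketch only: `InterProScan-formats.yml`, `formats` and `outputFormat`
# are hypothetical names used for illustration.
requirements:
  SchemaDefRequirement:
    types:
      - $import: InterProScan-apps.yml
      - $import: InterProScan-formats.yml

inputs:
  applications:
    type: InterProScan-apps.yml#apps[]?       # optional array of `apps`
  outputFormat:
    type: InterProScan-formats.yml#formats    # single required value
```
{: .source}

Each entry under `types` imports one file, and each input then refers to its type by file name and object name, as with `apps`.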
85+
86+
Note that the author of this CWL description has also included
87+
`ResourceRequirement`s, specifying the minimum amount of RAM and number of cores
88+
required for the tool to run successfully, as well as details of the version of
89+
the software that the description was written for and other useful metadata.
90+
These features are discussed further in other chapters of this user guide.
91+
92+
[ips]: https://github.com/ebi-pf-team/interproscan

_includes/cwl/InterProScan-apps.yml

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
1+
type: enum
2+
name: apps
3+
symbols:
4+
- TIGRFAM
5+
- SFLD
6+
- SUPERFAMILY
7+
- Gene3D
8+
- Hamap
9+
- Coils
10+
- ProSiteProfiles
11+
- SMART
12+
- CDD
13+
- PRINTS
14+
- PIRSF
15+
- ProSitePatterns
16+
- Pfam
17+
- ProDom
18+
- MobiDBLite

_includes/cwl/custom-types.cwl

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
1+
cwlVersion: v1.0
2+
class: CommandLineTool
3+
4+
label: "InterProScan: protein sequence classifier"
5+
6+
doc: |
7+
Version 5.21-60 can be downloaded here:
8+
https://github.com/ebi-pf-team/interproscan/wiki/HowToDownload
9+
10+
Documentation on how to run InterProScan 5 can be found here:
11+
https://github.com/ebi-pf-team/interproscan/wiki/HowToRun
12+
13+
requirements:
14+
ResourceRequirement:
15+
ramMin: 10240
16+
coresMin: 3
17+
SchemaDefRequirement:
18+
types:
19+
- $import: InterProScan-apps.yml
20+
21+
hints:
22+
SoftwareRequirement:
23+
packages:
24+
interproscan:
25+
specs: [ "https://identifiers.org/rrid/RRID:SCR_005829" ]
26+
version: [ "5.21-60" ]
27+
28+
inputs:
29+
proteinFile:
30+
type: File
31+
inputBinding:
32+
prefix: --input
33+
applications:
34+
type: InterProScan-apps.yml#apps[]?
35+
inputBinding:
36+
itemSeparator: ','
37+
prefix: --applications
38+
39+
baseCommand: interproscan.sh
40+
41+
arguments:
42+
- valueFrom: $(inputs.proteinFile.nameroot).i5_annotations
43+
prefix: --outfile
44+
- valueFrom: TSV
45+
prefix: --formats
46+
- --disable-precalc
47+
- --goterms
48+
- --pathways
49+
- valueFrom: $(runtime.tmpdir)
50+
prefix: --tempdir
51+
52+
53+
outputs:
54+
i5Annotations:
55+
type: File
56+
format: iana:text/tab-separated-values
57+
outputBinding:
58+
glob: $(inputs.proteinFile.nameroot).i5_annotations
59+
60+
$namespaces:
61+
iana: https://www.iana.org/assignments/media-types/
62+
s: http://schema.org/
63+
$schemas:
64+
- https://schema.org/docs/schema_org_rdfa.html
65+
66+
s:license: "https://www.apache.org/licenses/LICENSE-2.0"
67+
s:copyrightHolder: "EMBL - European Bioinformatics Institute"

_includes/cwl/custom-types.yml

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
proteinFile:
2+
class: File
3+
path: test_proteins.fasta

_includes/cwl/test_proteins.fasta

Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
>Q97R95
2+
MKYKRIVFKVGTSSLTNEDGSLSRSKVKDITQQLAMLHEAGHELILVSSGAIAAGFGALG
3+
FKKRPTKIADKQASAAVGQGLLLEEYTTNLLLRQIVSAQILLTQDDFVDKRRYKNAHQAL
4+
SVLLNRGAIPIINENDSVVIDELKVGDNDTLSAQVAAMVQADLLVFLTDVDGLYTGNPNS
5+
DPRAKRLERIETINREIIDMAGGAGSSNGTGGMLTKIKAATIATESGVPVYICSSLKSDS
6+
MIEAAEETEDGSYFVAQEKGLRTQKQWLAFYAQSQGSIWVDKGAAEALSQYGKSLLLSGI
7+
VEAEGVFSYGDIVTVFDKESGKSLGKGRVQFGASALEDMLRSQKAKGVLIYRDDWISITP
8+
EIQLLFTEF
9+
>A2VDN9
10+
MEVKGKKKLTGKGTKMSQEKSKFHKNNDSGSSKTFPKKVVKEGGPKITSKNFEKTATKPGKKGVKQFKNKQQGDRIPKNK
11+
FQQANKFNQKRKFQPDSKSDESAAKKPKWDEFKKKKKELKQSRQLSDKTNYDIVIRAKQIWEILRRKDCDKEKRVKLMSD
12+
LQKLIQGKIKTIAFAHDSTRVIQCYIQFGNEEQRKQAFEELRGDLVELSKAKYSRNIVKKFLMYGSKAQIAEIIRSFKGH
13+
VRKLLRHAEASAIVEYAYNDKAILEQRNMLTEELYGNTFQLYKSADHPTLDKVLEVQPEKLELIMDEMKQILTPMAQKEA
14+
VIKHSLVHKVFLDFFTYAPPKLRSEMIEAIREAVVYLAHTHDGARVAMYCLWHGTPKDRKVIVKTMKTYIEKVANGQYSH
15+
LVLLAAFDCIDDTKLVKQIIISEIINSLPNIVNDKYGRKVLLYLLSPRDPAHTVREIIEVLQKGDGNAHSKKDTEIRRRE
16+
LLESISPALLSYLQGHAQEVVLDKSACVLVADILGTATGDVQPAMDAVASLAAAELHPGGKDGELHIAEHPAGHLVLKWL
17+
IEQDKKMKERGREGCFAKTLIERVGVKNLKSWASVNRGAIILSSLLQSSDQEVANKVKAGLKSLIPALEKSKNTSKGIEM
18+
LLEKLTA
19+
>A2YIW7
20+
MAAEEGVVIACHNKDEFDAQMTKAKEAGKVVIIDFTASWCGPCRFIAPVFAEYAKKFPGAVFLKVDVDELKEVAEKYNVE
21+
AMPTFLFIKDGAEADKVVGARKDDLQNTIVKHVGATAASASA
22+
>P22298
23+
GRGLLPFVLLALGIXAPWAVEGAENALKGGACPPRKIVQCLRYEKPKCTSDWQCPDKKKC
24+
CRDTCAIKCLNPVAITNPVKVKPGKCPVVYGQCMMLNPPNHCKTDSQCLGDLKCCKSMCG
25+
KVCLTPVKA
26+
>A0B6J9
27+
MSKIGKSIRLERIIDRKTRKTVIVPMDHGLTVGPIPGLIDLAAAVDKVAEGGANAVLGHM
28+
GLPLYGHRGYGKDVGLIIHLSASTSLGPDANHKVLVTRVEDAIRVGADGVSIHVNVGAED
29+
EAEMLRDLGMVARRCDLWGMPLLAMMYPRGAKVRSEHSVEYVKHAARVGAELGVDIVKTN
30+
YTGSPETFREVVRGCPAPVVIAGGPKMDTEADLLQMVYDAMQAGAAGISIGRNIFQAENP
31+
TLLTRKLSKIVHEGYTPEEAARLKL
32+
>P02939
33+
MNRTKLVLGAVILGSTLLAGCSSNAKIDQLSTDVQTLNAKVDQLSNDVTAIRSDVQAAKD
34+
DAARANQRLDNQAHSYRK
