This repository was archived by the owner on Aug 20, 2024. It is now read-only.

Commit 83a6b5c

Merge branch 'master' into feat/format-filepathpattern
2 parents 90983d0 + 4884fae commit 83a6b5c

10 files changed: 160 additions and 8 deletions

CHANGELOG.md

Lines changed: 6 additions & 2 deletions
@@ -1,6 +1,10 @@
 # nextflow-io/nf-validation: Changelog

-# Version 1.1.0
+# Version 1.1.0 - Miso
+
+## Features
+
+- Add support for samplesheets with no header ([#115](https://github.com/nextflow-io/nf-validation/pull/115))

 ## Bug fixes

@@ -11,7 +15,7 @@

 - Added `file-path-pattern` format to check every file fetched using a glob pattern. Using a glob is now also possible in the samplesheet and will create a list of all files found using that glob pattern. ([#118](https://github.com/nextflow-io/nf-validation/pull/118))

-# Version 1.0.0
+# Version 1.0.0 - Tonkotsu

 The nf-validation plugin is now in production use across many pipelines and has (we hope) now reached a point of relative stability. The bump to major version v1.0.0 signifies that it is suitable for use in production pipelines.

README.md

Lines changed: 4 additions & 0 deletions
@@ -60,6 +60,10 @@ ch_input = Channel.fromSamplesheet("input")
 - Java 11 or later
 - <https://github.com/everit-org/json-schema>

+## Slack channel
+
+There is a dedicated [nf-validation Slack channel](https://nfcore.slack.com/archives/C056RQB10LU) in the [Nextflow Slack workspace](nextflow.slack.com).
+
 ## Credits

 This plugin was written based on code initially written within the nf-core community,

docs/samplesheets/examples.md

Lines changed: 41 additions & 0 deletions
@@ -44,6 +44,47 @@ tuple val(meta), path(fastq_1), path(fastq_2), path(bed)

 It may be necessary to manipulate this channel to fit your process inputs. For more documentation, check out the [Nextflow operator docs](https://www.nextflow.io/docs/latest/operator.html), however here are some common use cases with `.fromSamplesheet()`.

+## Using a samplesheet with no headers
+
+Sometimes you only have one possible input in the pipeline samplesheet. In this case it doesn't make sense to have a header in the samplesheet. This can be done by creating a samplesheet with an empty string as input key:
+
+```json
+{
+    "$schema": "http://json-schema.org/draft-07/schema",
+    "description": "Schema for the file provided with params.input",
+    "type": "array",
+    "items": {
+        "type": "object",
+        "properties": {
+            "": {
+                "type": "string"
+            }
+        }
+    }
+}
+```
+
+When using samplesheets like this CSV file:
+
+```csv
+test_1
+test_2
+```
+
+or this YAML file:
+
+```yaml
+- test_1
+- test_2
+```
+
+The output of `.fromSamplesheet()` will look like this:
+
+```bash
+[test_1]
+[test_2]
+```
+
 ## Changing the structure of channel items

 Each item in the channel will be a flat tuple, but some processes will use multiple files as a list in their input channel, this is common in nf-core modules. For example, consider the following input declaration in a process, where FASTQ could be > 1 file:

plugins/nf-validation/src/main/nextflow/validation/SamplesheetConverter.groovy

Lines changed: 11 additions & 6 deletions
@@ -64,17 +64,23 @@ class SamplesheetConverter {
         def Map<String, Map<String, String>> schemaFields = (Map) schemaMap["items"]["properties"]
         def Set<String> allFields = schemaFields.keySet()
         def List<String> requiredFields = (List) schemaMap["items"]["required"]
+        def Boolean containsHeader = !(allFields.size() == 1 && allFields[0] == "")

         def String fileType = getFileType(samplesheetFile)
         def String delimiter = fileType == "csv" ? "," : fileType == "tsv" ? "\t" : null
         def List<Map<String,String>> samplesheetList

         if(fileType == "yaml"){
-            samplesheetList = new Yaml().load((samplesheetFile.text))
+            samplesheetList = new Yaml().load((samplesheetFile.text)).collect {
+                if(containsHeader) {
+                    return it as Map
+                }
+                return ["empty": it] as Map
+            }
         }
         else {
             Path fileSamplesheet = Nextflow.file(samplesheetFile) as Path
-            samplesheetList = fileSamplesheet.splitCsv(header:true, strip:true, sep:delimiter, quote:'"')
+            samplesheetList = fileSamplesheet.splitCsv(header:containsHeader ?: ["empty"], strip:true, sep:delimiter, quote:'"')
         }

         // Field checks + returning the channels
@@ -83,17 +89,16 @@ class SamplesheetConverter {
         def Boolean headerCheck = true
         this.rows = []
         resetCount()
-
         def List outputs = samplesheetList.collect { Map<String,String> fullRow ->
             increaseCount()

             Map<String,String> row = fullRow.findAll { it.value != "" }
-            def Set rowKeys = row.keySet()
+            def Set rowKeys = containsHeader ? row.keySet() : ["empty"].toSet()
             def String yamlInfo = fileType == "yaml" ? " for entry ${this.getCount()}." : ""

             // Check the header (CSV/TSV) or present fields (YAML)
             if(headerCheck) {
-                def unexpectedFields = rowKeys - allFields
+                def unexpectedFields = containsHeader ? rowKeys - allFields : []
                 if(unexpectedFields.size() > 0) {
                     this.warnings << "The samplesheet contains following unchecked field(s): ${unexpectedFields}${yamlInfo}".toString()
                 }
@@ -114,7 +119,7 @@ class SamplesheetConverter {
             def ArrayList output = []

             for( Map.Entry<String, Map> field : schemaFields ){
-                def String key = field.key
+                def String key = containsHeader ? field.key : "empty"
                 def String input = row[key]

                 // Check if the field is deprecated
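
The header handling added in the hunks above can be read in isolation as two small steps: decide whether the schema describes a headerless samplesheet, then wrap each raw value under a placeholder key so the rest of the converter can keep treating rows as maps. The following is a minimal standalone sketch of that idea, not the plugin's actual code. The helper names `schemaHasHeader` and `normaliseRows` are hypothetical and were chosen for illustration; only the `"empty"` placeholder key and the "single empty-string property" rule are taken from the diff.

```groovy
// Hypothetical helpers illustrating the logic in the diff; not part of nf-validation's API.

// A schema is treated as headerless when its only property key is the empty string.
boolean schemaHasHeader(Set<String> allFields) {
    return !(allFields.size() == 1 && allFields.first() == "")
}

// For headerless input, wrap every raw value in a one-entry map under the
// placeholder key "empty"; rows that already have named fields pass through as maps.
List<Map> normaliseRows(List rawRows, boolean hasHeader) {
    return rawRows.collect { row ->
        hasHeader ? (row as Map) : [empty: row]
    }
}

// Example input: the schema from the documentation hunk has "" as its only property.
def headerlessFields = [""] as Set<String>
def namedFields = ["sample", "fastq_1"] as Set<String>

def headerlessRows = normaliseRows(["test_1", "test_2"], schemaHasHeader(headerlessFields))
def namedRows = normaliseRows([[sample: "a", fastq_1: "a.fq"]], schemaHasHeader(namedFields))

println headerlessRows
println namedRows
```

Using a fixed placeholder key keeps the downstream field loop unchanged: it only has to look up `"empty"` instead of the schema key when no header is present, which is what the last hunk of the diff does.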

plugins/nf-validation/src/test/nextflow/validation/SamplesheetConverterTest.groovy

Lines changed: 50 additions & 0 deletions
@@ -137,6 +137,56 @@ class SamplesheetConverterTest extends Dsl2Spec{
         stdout.contains("[[string1:extraField, string2:extraField, integer1:10, integer2:10, boolean1:true, boolean2:true], string1, 25, false, ${this.getRootString()}/src/testResources/test.txt, ${this.getRootString()}/src/testResources/testDir, ${this.getRootString()}/src/testResources/testDir, unique3, 1, itDoesExist]" as String)
     }

+    def 'no header - CSV' () {
+        given:
+        def SCRIPT_TEXT = '''
+            include { fromSamplesheet } from 'plugin/nf-validation'
+
+            params.input = 'src/testResources/no_header.csv'
+
+            workflow {
+                Channel.fromSamplesheet("input", parameters_schema:"src/testResources/nextflow_schema_with_samplesheet_no_header.json").view()
+            }
+        '''
+
+        when:
+        dsl_eval(SCRIPT_TEXT)
+        def stdout = capture
+                .toString()
+                .readLines()
+                .findResults {it.startsWith('[') ? it : null }
+
+        then:
+        noExceptionThrown()
+        stdout.contains("[test_1]")
+        stdout.contains("[test_2]")
+    }
+
+    def 'no header - YAML' () {
+        given:
+        def SCRIPT_TEXT = '''
+            include { fromSamplesheet } from 'plugin/nf-validation'
+
+            params.input = 'src/testResources/no_header.yaml'
+
+            workflow {
+                Channel.fromSamplesheet("input", parameters_schema:"src/testResources/nextflow_schema_with_samplesheet_no_header.json").view()
+            }
+        '''
+
+        when:
+        dsl_eval(SCRIPT_TEXT)
+        def stdout = capture
+                .toString()
+                .readLines()
+                .findResults {it.startsWith('[') ? it : null }
+
+        then:
+        noExceptionThrown()
+        stdout.contains("[test_1]")
+        stdout.contains("[test_2]")
+    }
+
     def 'extra field' () {
         given:
         def SCRIPT_TEXT = '''
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+{
+    "$schema": "http://json-schema.org/draft-07/schema",
+    "$id": "https://raw.githubusercontent.com/nf-core/testpipeline/master/nextflow_schema.json",
+    "title": "nf-core/testpipeline pipeline parameters",
+    "description": "this is a test",
+    "type": "object",
+    "definitions": {
+        "input_output_options": {
+            "title": "Input/output options",
+            "type": "object",
+            "fa_icon": "fas fa-terminal",
+            "description": "Define where the pipeline should find input data and save output data.",
+            "required": ["input"],
+            "properties": {
+                "input": {
+                    "type": "string",
+                    "format": "file-path",
+                    "mimetype": "text/csv",
+                    "pattern": "^\\S+\\.csv$",
+                    "schema": "src/testResources/no_header_schema.json",
+                    "description": "Path to comma-separated file containing information about the samples in the experiment.",
+                    "help_text": "You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/testpipeline/usage#samplesheet-input).",
+                    "fa_icon": "fas fa-file-csv"
+                }
+            }
+        }
+    }
+}
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+test_1
+test_2
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+- test_1
+- test_2
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+{
+    "$schema": "http://json-schema.org/draft-07/schema",
+    "description": "Schema for the file provided with params.input",
+    "type": "array",
+    "items": {
+        "type": "object",
+        "properties": {
+            "": {
+                "type": "string"
+            }
+        }
+    }
+}
+
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+test_1
+test_2
