fix: Treat geolocation-rules.tsv and annotations.tsv as input files
This commit modifies the configuration and Snakemake rules to treat geolocation-rules.tsv and annotations.tsv as input files instead of plain strings. The change addresses a warning that Snakemake emits when a relative file path starts with './'. The warning recommends omitting the './', which is redundant and can lead to inconsistent file matching:

```
Relative file path './source-data/geolocation-rules.tsv' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
```

Furthermore, treating source-data/annotations.tsv as an input file gives Snakemake modules the hint they need to add a prefix, such as "ingest/source-data/", where necessary. This change aligns with the one made in the following pull request: nextstrain/dengue#10
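The prefix argument above can be illustrated with a toy model in plain Python. This is only a sketch and not Snakemake's implementation: `ToyRule` and `with_prefix` are hypothetical names, and the model assumes, as the commit message implies, that a module prefix is joined onto declared input paths while params are passed through as opaque values.

```python
from dataclasses import dataclass, field
import posixpath


@dataclass
class ToyRule:
    """Hypothetical stand-in for a Snakemake rule; not Snakemake's real class."""
    input: dict = field(default_factory=dict)
    params: dict = field(default_factory=dict)


def with_prefix(rule: ToyRule, prefix: str) -> ToyRule:
    """Return a copy of the rule with the prefix joined onto input paths only."""
    prefixed = {name: posixpath.join(prefix, path) for name, path in rule.input.items()}
    # Assumption: params are opaque values, so they are copied unchanged.
    return ToyRule(input=prefixed, params=dict(rule.params))


# Before the commit: the annotations path is a param.
before = ToyRule(
    input={"all_geolocation_rules": "data/all-geolocation-rules.tsv"},
    params={"annotations": "source-data/annotations.tsv"},
)

# After the commit: the annotations path is a declared input.
after = ToyRule(
    input={
        "all_geolocation_rules": "data/all-geolocation-rules.tsv",
        "annotations": "source-data/annotations.tsv",
    },
    params={"annotations_id": "accession"},
)

print(with_prefix(before, "ingest").params["annotations"])
print(with_prefix(after, "ingest").input["annotations"])
```

In this model only the second rule ends up pointing at `ingest/source-data/annotations.tsv`; the param in the first rule keeps its original, unprefixed value.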
j23414 committed Jun 23, 2023
1 parent 886a2e9 commit 8ff2c22
Showing 2 changed files with 5 additions and 5 deletions.
4 changes: 2 additions & 2 deletions ingest/config/config.yaml
@@ -45,9 +45,9 @@ transform:
geolocation_rules_url: 'https://raw.githubusercontent.com/nextstrain/ncov-ingest/master/source-data/gisaid_geoLocationRules.tsv'
# Local geolocation rules that are only applicable to rsv data
# Local rules can overwrite the general geolocation rules provided above
-local_geolocation_rules: './source-data/geolocation-rules.tsv'
+local_geolocation_rules: 'source-data/geolocation-rules.tsv'
# User annotations file
-annotations: './source-data/annotations.tsv'
+annotations: 'source-data/annotations.tsv'
# ID field used to merge annotations
annotations_id: 'accession'
# Field to use as the sequence ID in the FASTA file
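A config check along the following lines could catch such paths before Snakemake warns about them. It is a minimal sketch: `strip_dot_slash`, `PATH_KEYS`, and the inline dictionary are hypothetical, with the dictionary standing in for the `transform` section of config.yaml as it looked before this change.

```python
def strip_dot_slash(path: str) -> str:
    """Drop redundant leading './' segments from a relative path."""
    while path.startswith("./"):
        # Also drop any extra slashes so './/x' does not turn into '/x'.
        path = path[2:].lstrip("/")
    return path


# Hypothetical stand-in for the 'transform' section of config.yaml.
transform_config = {
    "local_geolocation_rules": "./source-data/geolocation-rules.tsv",
    "annotations": "./source-data/annotations.tsv",
    "annotations_id": "accession",
}

# Assumption: only these keys hold file paths; other values are left untouched.
PATH_KEYS = ("local_geolocation_rules", "annotations")

cleaned = {
    key: strip_dot_slash(value) if key in PATH_KEYS else value
    for key, value in transform_config.items()
}

for key in PATH_KEYS:
    print(f"{key}: {cleaned[key]}")
```

Restricting the cleanup to known path keys keeps non-path values such as `annotations_id` from being altered by accident.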
6 changes: 3 additions & 3 deletions ingest/workflow/snakemake_rules/transform.smk
@@ -39,7 +39,8 @@ rule concat_geolocation_rules:
rule transform:
input:
sequences_ndjson = "data/sequences.ndjson",
-all_geolocation_rules = "data/all-geolocation-rules.tsv"
+all_geolocation_rules = "data/all-geolocation-rules.tsv",
+annotations = config['transform']['annotations'],
output:
metadata = "data/metadata.tsv",
sequences = "data/sequences.fasta"
@@ -57,7 +58,6 @@ rule transform:
authors_field = config['transform']['authors_field'],
authors_default_value = config['transform']['authors_default_value'],
abbr_authors_field = config['transform']['abbr_authors_field'],
-annotations = config['transform']['annotations'],
annotations_id = config['transform']['annotations_id'],
metadata_columns = config['transform']['metadata_columns'],
id_field = config['transform']['id_field'],
@@ -86,7 +86,7 @@ rule transform:
| ./bin/apply-geolocation-rules \
--geolocation-rules {input.all_geolocation_rules} \
| ./bin/merge-user-metadata \
---annotations {params.annotations} \
+--annotations {input.annotations} \
--id-field {params.annotations_id} \
| ./bin/ndjson-to-tsv-and-fasta \
--fasta {output.sequences} \