runner.aws_batch: Download .snakemake/metadata/ too
Snakemake stores state information per input/output here and uses it to
determine if it needs to re-run rules or not.  It seems akin to the file
mtimes which we already take care to preserve on upload/download.
Additionally, the metadata recorded is used in Snakemake's report
generation and is generally useful for looking at workflow statistics.

Continue to not download all of .snakemake/ en masse because it can
potentially contain files that interfere with local usage and/or are
large and unnecessary.

Resolves: <#373>
Related-to: <nextstrain/docker-base#220>
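
As a rough illustration of the "workflow statistics" use mentioned above, the downloaded metadata could be summarized per rule. This is a minimal sketch, not part of the commit: it assumes each file under `.snakemake/metadata/` is a JSON record with `rule`, `starttime`, and `endtime` keys (an assumption about Snakemake's on-disk format, which may differ between versions), and `runtime_by_rule` is a hypothetical helper name.

```python
import json
from collections import defaultdict
from pathlib import Path
from typing import Dict


def runtime_by_rule(metadata_dir: Path) -> Dict[str, float]:
    """Sum recorded runtimes (in seconds) per Snakemake rule.

    Assumes each metadata file is a JSON object with "rule", "starttime",
    and "endtime" keys; files that do not fit that shape are skipped.
    """
    totals: Dict[str, float] = defaultdict(float)

    for record_path in sorted(metadata_dir.rglob("*")):
        if not record_path.is_file():
            continue

        try:
            record = json.loads(record_path.read_text(encoding="utf-8"))
        except (ValueError, OSError):
            # Not JSON (or unreadable): ignore rather than fail.
            continue

        if not isinstance(record, dict):
            continue

        rule = record.get("rule")
        start = record.get("starttime")
        end = record.get("endtime")

        if not isinstance(rule, str):
            continue
        if not isinstance(start, (int, float)) or not isinstance(end, (int, float)):
            continue

        totals[rule] += end - start

    return dict(totals)


if __name__ == "__main__":
    # Print rules from slowest to fastest for the build in the current directory.
    stats = runtime_by_rule(Path(".snakemake/metadata"))
    for rule, seconds in sorted(stats.items(), key=lambda item: -item[1]):
        print(f"{rule}\t{seconds:.1f}s")
```

Run from a build directory after downloading results, this would list total recorded runtime per rule, provided the assumed keys are present.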
tsibley committed Jun 17, 2024
1 parent 8ed779c commit 1a3ba39
Showing 2 changed files with 16 additions and 1 deletion.
11 changes: 11 additions & 0 deletions CHANGES.md
@@ -13,6 +13,17 @@ development source code and as such may not be routinely kept up to date.

# __NEXT__

## Improvements

* Snakemake's per-input/output file metadata (stored in `.snakemake/metadata/`)
is now downloaded from AWS Batch builds by default. Like file modification
times (mtimes), which are already preserved from the remote build, this
additional metadata is used by Snakemake to track when inputs have changed
and when it should regenerate outputs. The metadata is also used in
[Snakemake report generation](https://snakemake.readthedocs.io/en/v8.14.0/snakefiles/reporting.html#rendering-reports)
and can be useful for gathering ad-hoc workflow statistics.
([#374](https://github.com/nextstrain/cli/pull/374))


# 8.4.0 (29 May 2024)

6 changes: 5 additions & 1 deletion nextstrain/cli/runner/aws_batch/s3.py
@@ -119,8 +119,12 @@ def download_workdir(remote_workdir: S3Object, workdir: Path, patterns: List[str
     ])

     included = path_matcher([
-        # But we do want the Snakemake logs to come over.
+        # But we do want the Snakemake logs to come over
         ".snakemake/log/",
+
+        # …and the input/output metadata Snakemake tracks (akin to mtimes,
+        # which we also preserve).
+        ".snakemake/metadata/",
     ])

     if patterns:
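
The interplay between the broad exclusion of `.snakemake/` and the narrower inclusions in this hunk can be sketched as follows. This is a simplified illustration only, not the actual `path_matcher` implementation in `s3.py`: `make_matcher` and `should_download` are hypothetical names, the `.git/` exclusion is made up for the example, and the sketch assumes that an inclusion overrides an exclusion and that a pattern ending in `/` matches everything beneath that directory.

```python
from fnmatch import fnmatchcase
from typing import Callable, Iterable


def make_matcher(patterns: Iterable[str]) -> Callable[[str], bool]:
    """Hypothetical stand-in for path_matcher(): returns a predicate over paths."""
    pattern_list = list(patterns)

    def matches(path: str) -> bool:
        for pattern in pattern_list:
            if pattern.endswith("/"):
                # Directory pattern: match anything below that directory.
                if path.startswith(pattern):
                    return True
            elif fnmatchcase(path, pattern):
                # Otherwise treat the pattern as a shell-style glob.
                return True
        return False

    return matches


# Assumed exclusions: skip .snakemake/ as a whole (".git/" is illustrative).
excluded = make_matcher([".git/", ".snakemake/"])

# Inclusions as in the hunk above: logs and per-file metadata come over anyway.
included = make_matcher([".snakemake/log/", ".snakemake/metadata/"])


def should_download(path: str) -> bool:
    """Assumed precedence: an explicit inclusion wins over an exclusion."""
    return included(path) or not excluded(path)


if __name__ == "__main__":
    for candidate in [
        "results/tree.nwk",
        ".snakemake/log/run.log",
        ".snakemake/metadata/abc123",
        ".snakemake/conda/env.yaml",
    ]:
        print(candidate, should_download(candidate))
```

Under these assumptions, everything in `.snakemake/` other than `log/` and `metadata/` is still skipped, which matches the intent stated in the commit message of not downloading the directory en masse.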