Add B.1 build, downsample B.1 in non-B.1 builds, rename to mpox #171

Merged: 25 commits, Sep 20, 2023

Commits on Aug 9, 2023

  1. Snakefile: remove AUGUR_RECURSION_LIMIT

    AUGUR_RECURSION_LIMIT has been set to 10,000 by default since Augur
    22.0.0, and the workflow's minimum required Augur version is now 22.2.0.¹
    
    ¹ https://github.com/nextstrain/monkeypox/blob/5216a13c407d0e842b40951907c49de6fc6b967a/Snakefile#L5
    joverlee521 committed Aug 9, 2023
    Commit 5761c36

Commits on Aug 11, 2023

  1. Merge pull request #170 from nextstrain/remove-augur-recursion-limit

    Snakefile: remove AUGUR_RECURSION_LIMIT
    joverlee521 authored Aug 11, 2023
    Commit 12ccb42

Commits on Aug 22, 2023

  1. Move common values to default config, allow partial overrides

    Previously, either config_hmpxv1.yaml was used as the default as-is or
    all required configuration had to be specified via Snakemake CLI
    options. This resulted in much duplication of configuration across
    different usages, and the default was never actually used in practice
    except in CI, for the convenience of not specifying --configfile.
    
    Switch to a two-tiered configuration setup with (1) a base "default"
    tier of common values that can be applied to all workflow usages and (2)
    a second tier of usage-specific config values, which can also partially
    override the default tier.
    
    Since some required config entries (e.g. build_name) do not share common
    values across the existing configs, this results in a workflow that is
    only usable when additional configuration is defined. This will be
    addressed by subsequent commits.
    victorlin committed Aug 22, 2023
    Commit 55f7569
  2. Set hmpxv1-focused defaults

    Copy values from the hmpxv1 configuration since that was the previous
    default, but change the names to be boilerplate defaults.
    
    Add required entries so that:
    
    1. The default config file serves as a reference for all required keys.
    2. The workflow can be run by CI without specifying an additional
       --configfile.
    victorlin committed Aug 22, 2023
    Commit 4140aff
  3. Commit fb81750
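
The two-tiered configuration described in commit 55f7569 can be illustrated with a small sketch. This is not the workflow's own code: it only shows the general idea of a usage-specific config partially overriding a default tier, so that nested keys not mentioned in the override keep their default values. The function name `merge_config` and all keys and values other than `build_name` are hypothetical.

```python
from copy import deepcopy


def merge_config(defaults, overrides):
    """Return a new dict: `defaults` recursively updated with `overrides`.

    Nested dicts are merged key by key, so an override may replace a
    single nested value without restating its siblings.
    """
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical default tier: common values shared by all usages.
defaults = {
    "build_name": "default-build",
    "filter": {"min_length": 100000, "min_date": 1950},
}

# Hypothetical usage-specific tier: overrides only what differs.
usage = {
    "build_name": "hmpxv1",
    "filter": {"min_date": 2017},
}

config = merge_config(defaults, usage)
```

Here `config["filter"]["min_length"]` still comes from the default tier, while `build_name` and `filter.min_date` come from the usage-specific tier. The default dict itself is left unmodified.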

Commits on Aug 30, 2023

  1. Update CI workflow triggers

    Original reasoning by @tsibley in nextstrain/cli@fab709a:
    
    Running on push _and_ PRs is often redundant.  For PRs, we really care
    about the putative merge of the PR branch, and that's what "on:
    pull_request" tests.  We typically do not need push-level CI results for
    PRs.  On the other hand, CI results for every push to master are nice to
    have both as a safety backstop and for the linear chain of CI history it
    produces (e.g. to debug the impact of external changes on our CI).
    
    The primary downside I see is that you can no longer push without
    opening a PR just to see what CI says, but I think that's an acceptable
    tradeoff, especially now that draft PRs are a thing.  To mitigate this
    downside, "on: workflow_dispatch" allows CI to be manually dispatched
    for a specific branch/tag/commit if you _really_ don't want to open even
    a draft PR.
    
    Trimming unnecessary CI jobs reduces the time to completion for CI runs
    (good for the dev loop) and reduces organization-level job queuing,
    which can negatively impact the workflows of other repos.
    
    Co-authored-by: Thomas Sibley <tsibley@fredhutch.org>
    victorlin and tsibley committed Aug 30, 2023
    Commit 00f324e

Commits on Aug 31, 2023

  1. Commit f96fa06

Commits on Sep 1, 2023

  1. git subrepo pull (merge) ingest/vendored

    subrepo:
      subdir:   "ingest/vendored"
      merged:   "c97df23"
    upstream:
      origin:   "https://github.com/nextstrain/ingest"
      branch:   "main"
      commit:   "c97df23"
    git-subrepo:
      version:  "0.4.6"
      origin:   "https://github.com/ingydotnet/git-subrepo"
      commit:   "110b9eb"
    victorlin committed Sep 1, 2023
    Commit afeec9c
  2. Use centralized scripts for NCBI Virus

    `fetch-from-ncbi-virus` from the subrepo is a copy of
    `fetch-from-genbank` with some extra options. Remove the copy in this
    repo and update references.
    
    The supporting scripts `csv-to-ndjson` and `genbank-url` can also be
    removed.
    victorlin committed Sep 1, 2023
    Commit 871817b

Commits on Sep 5, 2023

  1. Commit b766498

Commits on Sep 11, 2023

  1. ingest: Switch to NCBI Datasets CLI to fetch data

    Replace our custom fetch scripts that use the NCBI Virus API with the
    NCBI Datasets CLI commands.
    
    NCBI Datasets downloads a virus dataset ZIP file that includes a
    metadata JSON Lines file and a sequences FASTA file. To maintain a record
    of the single NDJSON file on S3, extract the sequences FASTA file and
    format the metadata into a TSV file; the two are then parsed into a
    single NDJSON file using `augur curate passthru`. The metadata TSV is
    created using the NCBI `dataformat` command so that we do not have to
    parse the nested JSON Lines file ourselves, and header fields are renamed
    to match the previous fields we used for NCBI Virus.
    
    The NDJSON file created here no longer includes equivalent fields
    for "title" or "publication".
    joverlee521 committed Sep 11, 2023
    Commit 82ace30
  2. ingest: snakefmt fixes

    joverlee521 committed Sep 11, 2023
    Commit 32eab02
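
To make the data flow in commit 82ace30 concrete, here is a rough sketch of what "a metadata TSV plus a sequences FASTA parsed into a single NDJSON file" means. The workflow itself does this with `augur curate passthru`; the code below is only a stdlib illustration of the same join, not a replacement for it. The column name `accession`, the `sequence` field name, and the sample records are assumptions made for the example.

```python
import csv
import io
import json


def parse_fasta(handle):
    """Yield (record_id, sequence) pairs from a FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the id.
            tokens = line[1:].split()
            name, chunks = (tokens[0] if tokens else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def join_to_ndjson(metadata_tsv, sequences_fasta, id_column="accession"):
    """Join TSV rows with FASTA sequences; return one JSON object per line.

    Rows without a matching sequence are skipped in this sketch.
    """
    sequences = dict(parse_fasta(sequences_fasta))
    lines = []
    for row in csv.DictReader(metadata_tsv, delimiter="\t"):
        sequence = sequences.get(row[id_column])
        if sequence is None:
            continue
        lines.append(json.dumps({**row, "sequence": sequence}))
    return "\n".join(lines)


# Made-up records, for illustration only.
tsv = io.StringIO("accession\tcountry\nAAA001\tNigeria\nAAA002\tUSA\n")
fasta = io.StringIO(">AAA001 example description\nACGT\nAC\n>AAA002\nGGCC\n")
ndjson = join_to_ndjson(tsv, fasta)
```

Each line of `ndjson` is one record carrying both its metadata fields and its sequence, which is the shape a single NDJSON file stored on S3 would have.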

Commits on Sep 12, 2023

  1. Merge pull request #179 from nextstrain/ncbi-datasets-cli

    ingest: Switch to NCBI Datasets CLI to fetch data
    joverlee521 authored Sep 12, 2023
    Commit fb0d318
  2. rebuild-hmpxv1-big: bump memory

    Bump to 68 GiB to stay on a c5.9xlarge instance, since Batch requires
    some "headspace".¹
    
    ¹ https://bedfordlab.slack.com/archives/C01LCTT7JNN/p1674586033950349?thread_ts=1674254476.788549&cid=C01LCTT7JNN
    joverlee521 committed Sep 12, 2023
    Commit 5969604

Commits on Sep 15, 2023

  1. description: Update "Underlying data" section to NCBI Datasets

    This should have been done as a part of #179,
    but I totally missed that we have this section in the build's
    description.
    joverlee521 committed Sep 15, 2023
    Commit 68190a0
  2. Merge pull request #180 from nextstrain/update-description

    description: Update "Underlying data" section to NCBI Datasets
    joverlee521 authored Sep 15, 2023
    Commit 942c1d0

Commits on Sep 20, 2023

  1. Allow trial builds through github action

    Add a config variable `auspice_prefix` that's prepended to auspice.json,
    and allow setting that variable in the GitHub Action submission.
    corneliusroemer committed Sep 20, 2023
    Commit 99eaf2a
  2. Commit 25e5cfa
  3. Commit cd969f7
  4. Rename builds and sample differently

    New names no longer include the deprecated `monkeypox` name.
    
    New mpxv and mpxv-IIb builds downsample B.1* sequences so that they don't overwhelm the tree.
    
    Add an mpxv-IIb-B.1 build that contains only B.1* sequences.
    corneliusroemer committed Sep 20, 2023
    Commit b34b076
  5. Commit ea5e21b
  6. Commit 5019420
  7. Commit bb308a9
  8. Commit d87c2ef
  9. Update README.md

    corneliusroemer committed Sep 20, 2023
    Commit c735b6a
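
The sampling change in commit b34b076 (downsample B.1* in the broader builds, keep only B.1* in the mpxv-IIb-B.1 build) can be sketched as below. How the workflow actually implements this is not shown on this page, so the code is a generic illustration only. It assumes that `B.1*` means the lineage `B.1` plus any lineage whose name starts with `B.1.`, and the record fields, cap, and seed are hypothetical.

```python
import random


def in_clade(lineage, prefix="B.1"):
    """True for the lineage itself or any dotted sub-lineage of it."""
    return lineage == prefix or lineage.startswith(prefix + ".")


def downsample_clade(records, prefix="B.1", cap=2, seed=0):
    """Keep every record outside the clade and at most `cap` inside it.

    Input order is preserved; the seed makes the random choice repeatable.
    """
    rng = random.Random(seed)
    clade_idx = [i for i, r in enumerate(records) if in_clade(r["lineage"], prefix)]
    if len(clade_idx) > cap:
        keep = set(rng.sample(clade_idx, cap))
    else:
        keep = set(clade_idx)
    clade = set(clade_idx)
    return [r for i, r in enumerate(records) if i not in clade or i in keep]


# Made-up records, for illustration only.
records = [
    {"strain": "s1", "lineage": "A.1"},
    {"strain": "s2", "lineage": "B.1"},
    {"strain": "s3", "lineage": "B.1.1"},
    {"strain": "s4", "lineage": "B.1.2"},
    {"strain": "s5", "lineage": "A.2"},
    {"strain": "s6", "lineage": "B.1.3"},
    {"strain": "s7", "lineage": "B.1"},
]

# Broad build: B.1* capped so it does not overwhelm the tree.
broad_build = downsample_clade(records, cap=2)

# B.1-only build: keep just the B.1* records.
b1_build = [r for r in records if in_clade(r["lineage"])]
```

Capping one over-represented clade while leaving the rest untouched keeps the older diversity visible in the broad tree, and the separate B.1-only build retains the full detail of the outbreak clade.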