Fix: Correct vep_cache_path_full when refseq/merged option is present #1563

Merged
merged 3 commits into from
Jun 12, 2024

Conversation

YeHW
Contributor

@YeHW YeHW commented Jun 12, 2024

This PR fixes the issue "ensembl-vep refseq/merged option broken".

Let's assume these param values:

vep_cache: path/to/vep_cache_dir
vep_cache_version: "111"
vep_genome: GRCh37
vep_species: homo_sapiens
vep_custom_args: "--everything --format vcf"

As of sarek 3.4.2, we can feed custom VEP options via params.vep_custom_args to the pipeline.

If --refseq or --merged is in params.vep_custom_args, it is passed to the final VEP command, with the following effects:

  • If --refseq is provided, VEP will use ${vep_cache}/${vep_species}_refseq/${vep_cache_version}_${vep_genome} as the annotation source
  • If --merged is provided, VEP will use ${vep_cache}/${vep_species}_merged/${vep_cache_version}_${vep_genome} as the annotation source

Reference: https://useast.ensembl.org/info/docs/tools/vep/script/vep_options.html
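
The directory layout described above can be sketched in Python. This is only an illustration, not the pipeline's Nextflow code; the function name vep_cache_dir and its suffix argument are hypothetical, and the parameter values are the example values from this comment.

```python
from pathlib import PurePosixPath


def vep_cache_dir(vep_cache, vep_species, vep_cache_version, vep_genome, suffix=""):
    """Build the cache directory VEP is expected to read from.

    `suffix` is a hypothetical argument: "" for the default cache,
    "_refseq" when --refseq is used, "_merged" when --merged is used.
    """
    species_dir = f"{vep_species}{suffix}"
    version_dir = f"{vep_cache_version}_{vep_genome}"
    return str(PurePosixPath(vep_cache) / species_dir / version_dir)


if __name__ == "__main__":
    for suffix in ("", "_refseq", "_merged"):
        print(vep_cache_dir("path/to/vep_cache_dir", "homo_sapiens", "111", "GRCh37", suffix))
```

With the example values, the three calls give path/to/vep_cache_dir/homo_sapiens/111_GRCh37, path/to/vep_cache_dir/homo_sapiens_refseq/111_GRCh37, and path/to/vep_cache_dir/homo_sapiens_merged/111_GRCh37.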

When this happens (say --refseq is in params.vep_custom_args), the ANNOTATION_CACHE_INITIALISATION workflow checks for the existence of the wrong vep_cache_path_full, because it does not check whether params.vep_custom_args contains --refseq.

I modified the related code so that the ANNOTATION_CACHE_INITIALISATION workflow builds vep_cache_path_full correctly, taking into account whether --refseq or --merged is present in params.vep_custom_args.
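
The flag check can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the actual Nextflow change in this PR: the helper name species_suffix is hypothetical, and giving --merged precedence when both flags appear is an assumption made only for the sketch.

```python
import shlex


def species_suffix(vep_custom_args):
    """Return the species-directory suffix implied by the custom VEP arguments.

    Assumption for this sketch: if both flags are present, --merged wins.
    """
    # Split like a shell would, so only whole options are matched.
    tokens = shlex.split(vep_custom_args or "")
    if "--merged" in tokens:
        return "_merged"
    if "--refseq" in tokens:
        return "_refseq"
    return ""


if __name__ == "__main__":
    print(repr(species_suffix("--everything --format vcf")))           # ''
    print(repr(species_suffix("--everything --format vcf --refseq")))  # '_refseq'
    print(repr(species_suffix("--merged --format vcf")))               # '_merged'
```

Matching whole tokens rather than substrings avoids treating a longer option that merely starts with --refseq or --merged as one of the two flags.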

I've tested with in-house data, and it worked as expected: when --refseq is present, VCF files are annotated with the RefSeq cache (e.g. "homo_sapiens_refseq/111_GRCh37"); when neither --refseq nor --merged is present, they are annotated with the normal cache (e.g. "homo_sapiens/111_GRCh37").

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/sarek branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nf-test test tests/ --verbose --profile +docker).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@YeHW YeHW self-assigned this Jun 12, 2024

github-actions bot commented Jun 12, 2024

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit c051a5e

+| ✅ 200 tests passed       |+
#| ❔  12 tests were ignored |#
!| ❗   3 tests had warnings |!

❗ Test warnings:

  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!

❔ Tests ignored:

  • files_exist - File is ignored: .github/workflows/awsfulltest.yml
  • files_exist - File is ignored: .github/workflows/awstest.yml
  • files_exist - File is ignored: conf/modules.config
  • files_unchanged - File ignored due to lint config: .github/PULL_REQUEST_TEMPLATE.md
  • files_unchanged - File ignored due to lint config: assets/nf-core-sarek_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-sarek_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-sarek_logo_dark.png
  • files_unchanged - File ignored due to lint config: .gitignore or .prettierignore
  • actions_ci - actions_ci
  • actions_awstest - 'awstest.yml' workflow not found: /home/runner/work/sarek/sarek/.github/workflows/awstest.yml
  • template_strings - template_strings
  • modules_config - modules_config

✅ Tests passed:

Run details

  • nf-core/tools version 2.14.1
  • Run at 2024-06-12 09:37:59

@YeHW
Contributor Author

YeHW commented Jun 12, 2024

I changed the signature of the ANNOTATION_CACHE_INITIALISATION workflow, which results in this error (expected, because I passed an additional parameter, params.vep_custom_args, to it):

ERROR ~ Workflow `NFCORE_SAREK:ANNOTATION_CACHE_INITIALISATION` declares 10 input channels but 11 were given

If adding a new parameter to this workflow is ok, I guess we need to change the declarations to 11?
@maxulysse

Never mind, I forgot one line of code.

@maxulysse maxulysse merged commit ff9a474 into nf-core:dev Jun 12, 2024
20 checks passed