Merge pull request #674 from muabnezor/move_lr_qc
Move long read preprocessing into a subworkflow
jfy133 authored Oct 11, 2024
2 parents 1d3eff9 + 7c7e954 commit db33efd
Showing 23 changed files with 720 additions and 93 deletions.
13 changes: 13 additions & 0 deletions CHANGELOG.md
@@ -7,12 +7,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### `Added`

- [#674](https://github.com/nf-core/mag/pull/674) - Added `--longread_adaptertrimming_tool`, where the user can choose between porechop_abi (default) and porechop (added by @muabnezor)

### `Changed`

- [#674](https://github.com/nf-core/mag/pull/674) - Changed the default adapter trimming tool for long reads to porechop_abi. Users can still select porechop if preferred (added by @muabnezor)

### `Fixed`

- [#674](https://github.com/nf-core/mag/pull/674) - Make longread preprocessing a subworkflow (added by @muabnezor)
- [#674](https://github.com/nf-core/mag/pull/674) - Add porechop and filtlong logs to multiqc (added by @muabnezor)
- [#674](https://github.com/nf-core/mag/pull/674) - Change local filtlong module to the official nf-core/filtlong module (added by @muabnezor)

### `Dependencies`

| Tool | Previous version | New version |
| ------------ | ---------------- | ----------- |
| Porechop_ABI | | 0.5.0 |
| Filtlong | 0.2.0 | 0.2.1 |

### `Deprecated`

## 3.1.0 [2024-10-04]
2 changes: 2 additions & 0 deletions CITATIONS.md
@@ -116,6 +116,8 @@
- [Porechop](https://github.com/rrwick/Porechop)

- [Porechop-abi](https://github.com/bonsai-team/Porechop_ABI)

- [Prodigal](https://pubmed.ncbi.nlm.nih.gov/20211023/)

> Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010 Mar 8;11:119. doi: 10.1186/1471-2105-11-119. PMID: 20211023; PMCID: PMC2848648.
1 change: 1 addition & 0 deletions README.md
@@ -90,6 +90,7 @@ Other code contributors include:
- [Jim Downie](https://github.com/prototaxites)
- [Phil Palmer](https://github.com/PhilPalmer)
- [@willros](https://github.com/willros)
- [Adam Rosenbaum](https://github.com/muabnezor)

Long read processing was inspired by [caspargross/HybridAssembly](https://github.com/caspargross/HybridAssembly) written by Caspar Gross [@caspargross](https://github.com/caspargross)

6 changes: 6 additions & 0 deletions assets/multiqc_config.yml
@@ -25,6 +25,8 @@ run_modules:
- quast
- kraken
- prokka
- porechop
- filtlong

## Module order
top_modules:
@@ -35,6 +37,7 @@ top_modules:
- "fastp"
- "adapterRemoval"
- "porechop"
- "filtlong"
- "fastqc":
name: "FastQC: after preprocessing"
info: "After trimming and, if requested, contamination removal."
@@ -109,6 +112,9 @@ sp:
fn_re: ".*[kraken2|centrifuge].*report.txt"
quast:
fn_re: "report.*.tsv"
filtlong:
num_lines: 20
fn_re: ".*_filtlong.log"

## File name cleaning
extra_fn_clean_exts:
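
As a quick check of the `filtlong` search pattern in the `sp` section above, the sketch below applies the same regular expression to a few file names. The file names are hypothetical, built from the `ext.prefix` pattern in `conf/modules.config`, and matching from the start of the bare file name with `re.match` is an assumption about how MultiQC applies `fn_re`.

```python
import re

# Pattern copied from the `sp: filtlong: fn_re` entry in assets/multiqc_config.yml.
FILTLONG_FN_RE = r".*_filtlong.log"


def is_filtlong_log(filename: str) -> bool:
    """Return True if the file name matches the filtlong search pattern."""
    # Assumption: the pattern is matched from the start of the bare file name.
    return re.match(FILTLONG_FN_RE, filename) is not None


# Hypothetical file names following the "<id>_run<run>_<suffix>" prefixes.
candidates = [
    "sampleA_run0_filtlong.log",
    "sampleA_run0_filtlong.fastq.gz",
    "sampleA_run0_porechop-abi_trimmed.log",
]
matches = [name for name in candidates if is_filtlong_log(name)]
print(matches)
```

Only the Filtlong log should be picked up here. Note that the dot before `log` is unescaped in the pattern, so it matches any single character rather than a literal period.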
32 changes: 24 additions & 8 deletions conf/modules.config
@@ -171,20 +171,36 @@ process {
         publishDir = [
             path: { "${params.outdir}/QC_longreads/porechop" },
             mode: params.publish_dir_mode,
-            pattern: "*_trimmed.fastq",
+            pattern: "*_porechop_trimmed.fastq.gz",
             enabled: params.save_porechop_reads
         ]
-        ext.prefix = { "${meta.id}_run${meta.run}_trimmed" }
+        ext.prefix = { "${meta.id}_run${meta.run}_porechop_trimmed" }
     }
+
+    withName: PORECHOP_ABI {
+        publishDir = [
+            path: { "${params.outdir}/QC_longreads/porechop" },
+            mode: params.publish_dir_mode,
+            pattern: "*_porechop-abi_trimmed.fastq.gz",
+            enabled: params.save_porechop_reads
+        ]
+        ext.prefix = { "${meta.id}_run${meta.run}_porechop-abi_trimmed" }
+    }

     withName: FILTLONG {
+        ext.args = [
+            "--min_length ${params.longreads_min_length}",
+            "--keep_percent ${params.longreads_keep_percent}",
+            "--trim",
+            "--length_weight ${params.longreads_length_weight}"
+        ].join(' ').trim()
         publishDir = [
-            path: { "${params.outdir}/QC_longreads/Filtlong" },
-            mode: params.publish_dir_mode,
-            pattern: "*_lr_filtlong.fastq.gz",
-            enabled: params.save_filtlong_reads
-        ]
-        ext.prefix = { "${meta.id}_run${meta.run}_lengthfiltered" }
+            path: { "${params.outdir}/QC_longreads/Filtlong" },
+            mode: params.publish_dir_mode,
+            pattern: "*_filtlong.fastq.gz",
+            enabled: params.save_filtlong_reads
+        ]
+        ext.prefix = { "${meta.id}_run${meta.run}_filtlong" }
     }

withName: NANOLYSE {
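
The `ext.args` entry for `FILTLONG` above builds a single option string by joining a list with spaces and trimming the ends. A minimal Python sketch of the same assembly follows; the function name and the numeric values are illustrative assumptions, not the pipeline's defaults, which this diff does not show.

```python
def filtlong_args(min_length: int, keep_percent: int, length_weight: int) -> str:
    """Mirror the ext.args list: join the options with spaces, then strip the ends."""
    options = [
        f"--min_length {min_length}",
        f"--keep_percent {keep_percent}",
        "--trim",
        f"--length_weight {length_weight}",
    ]
    return " ".join(options).strip()


# Illustrative values only (hypothetical, not taken from the pipeline config).
args = filtlong_args(min_length=1000, keep_percent=70, length_weight=10)
print(args)
```

Building the string from a list keeps each option on its own line in the config, which makes later additions or removals easy to review in a diff.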
13 changes: 13 additions & 0 deletions docs/output.md
@@ -113,6 +113,19 @@ The pipeline uses Nanolyse to map the reads against the Lambda phage and removes

The pipeline uses filtlong and porechop to perform quality control of the long reads that are optionally provided with the TSV input file.

<details markdown="1">
<summary>Output files</summary>

- `QC_longreads/porechop/`
  - `[sample]_[run]_porechop_trimmed.fastq.gz`: If `--longread_adaptertrimming_tool 'porechop'`, the adapter trimmed FASTQ files from porechop
  - `[sample]_[run]_porechop-abi_trimmed.fastq.gz`: If `--longread_adaptertrimming_tool 'porechop_abi'`, the adapter trimmed FASTQ files from porechop_ABI
- `QC_longreads/Filtlong/`
  - `[sample]_[run]_filtlong.fastq.gz`: The length and quality filtered reads in FASTQ from Filtlong

</details>

Trimmed and filtered FASTQ output directories and files will only exist if `--save_porechop_reads` and/or `--save_filtlong_reads` (respectively) are provided to the run command.

No direct host read removal is performed for long reads.
However, since within this pipeline filtlong uses a read quality based on k-mer matches to the already filtered short reads, reads not overlapping those short reads might be discarded.
The lower the parameter `--longreads_length_weight`, the higher the impact of the read qualities for filtering.
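
To make the naming rules for the long-read QC outputs concrete, the sketch below lists the files one would expect to be published for a given sample. It is a hedged illustration, not pipeline code: the function name is hypothetical, the prefixes and directory names are copied from `conf/modules.config` in this commit, and it assumes each module writes `<prefix>.fastq.gz`.

```python
from pathlib import PurePosixPath

# Suffixes taken from the ext.prefix settings for PORECHOP and PORECHOP_ABI.
TRIM_SUFFIX = {
    "porechop": "porechop_trimmed",
    "porechop_abi": "porechop-abi_trimmed",
}


def expected_longread_outputs(outdir, sample, run, tool="porechop_abi",
                              save_porechop_reads=False,
                              save_filtlong_reads=False):
    """List published long-read QC files for one sample/run (hypothetical helper)."""
    if tool not in TRIM_SUFFIX:
        raise ValueError(f"unknown adapter trimming tool: {tool}")
    prefix = f"{sample}_run{run}"
    qc_dir = PurePosixPath(outdir) / "QC_longreads"
    outputs = []
    if save_porechop_reads:
        # Both trimming tools publish into the same "porechop" directory.
        outputs.append(qc_dir / "porechop" / f"{prefix}_{TRIM_SUFFIX[tool]}.fastq.gz")
    if save_filtlong_reads:
        # Directory name as written in conf/modules.config.
        outputs.append(qc_dir / "Filtlong" / f"{prefix}_filtlong.fastq.gz")
    return [str(path) for path in outputs]


files = expected_longread_outputs(
    "results", "sampleA", 0,
    save_porechop_reads=True, save_filtlong_reads=True,
)
print(files)
```

With neither `save_*` flag set the list is empty, which mirrors the note that the directories only exist when the corresponding parameters are supplied.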
10 changes: 10 additions & 0 deletions modules.json
@@ -107,6 +107,11 @@
"git_sha": "285a50500f9e02578d90b3ce6382ea3c30216acd",
"installed_by": ["modules"]
},
"filtlong": {
"branch": "master",
"git_sha": "666652151335353eef2fcd58880bcef5bc2928e1",
"installed_by": ["modules"]
},
"freebayes": {
"branch": "master",
"git_sha": "911696ea0b62df80e900ef244d7867d177971f73",
@@ -202,6 +207,11 @@
"git_sha": "3135090b46f308a260fc9d5991d7d2f9c0785309",
"installed_by": ["modules"]
},
"porechop/abi": {
"branch": "master",
"git_sha": "06c8865e36741e05ad32ef70ab3fac127486af48",
"installed_by": ["modules"]
},
"porechop/porechop": {
"branch": "master",
"git_sha": "1d68c7f248d1a480c5959548a9234602b771199e",
33 changes: 0 additions & 33 deletions modules/local/filtlong.nf

This file was deleted.

5 changes: 5 additions & 0 deletions modules/nf-core/filtlong/environment.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

39 changes: 39 additions & 0 deletions modules/nf-core/filtlong/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

65 changes: 65 additions & 0 deletions modules/nf-core/filtlong/meta.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
