docs: Refactor usage docs into user-centric multi-page structure#1742
Draft
adamrtalbot wants to merge 5 commits intonf-core:devfrom
Restructure the monolithic usage.md (831 lines) into a Quick Start landing page with 6 focused subpages organised by user intent:

- samplesheet.md: input format, examples-first
- reference-genomes.md: genome configuration
- alignment-and-quantification.md: analysis strategy with decision tree
- preprocessing.md: trimming, rRNA removal, contamination screening
- advanced-features.md: UMIs, prokaryotic, 3'DGE, GPU acceleration
- configuration.md: Nextflow profiles, resources, custom config

Also refactors the DE tutorial from 5 Gitpod-dependent pages into 3 self-guided pages (introduction, running the pipeline, DE in R).

Applies nf-core writing style throughout: British English, active voice, no gerunds in headings, no please/e.g./i.e.

output.md is unchanged.

Closes nf-core#1737

Generated by Claude Code
Warning: Newer version of the nf-core template is available. Your pipeline is using an old version of the nf-core template: 3.5.1. For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.
- usage.md: use descriptive filenames, add reference genome guidance, add check results section
- output.md: add key outputs summary table at top, fix style violations (please, e.g., i.e.)
- samplesheet.md: fix e.g. to such as, fix behavior to behaviour (British English)
- configuration.md: fix 2 broken anchor links, fix params.yaml iGenomes example, rewrite iGenomes note
- alignment-and-quantification.md: remove duplicated HISAT2 content from quantification section
- de-analysis-in-r.md: fix DESeq2 accessor bugs (dds$counts, res$gene, resSig$gene)

Generated by Claude Code
Fix 27 issues across 7 documentation files:

- Fix 4 broken links (RSeQC, TOC, cross-page anchor, internal anchor)
- Fix 9 typos in output.md (gauge, transcripts, abundances, etc.)
- Convert GitHub-flavored alerts to nf-core admonitions
- Convert legacy > **NB:** notes to :::note (outside details blocks)
- Move :::tip outside <details> block for correct rendering
- Fix hardcoded column name in DE tutorial R code
- Fix hedging language and capitalisation per nf-core style

Generated by Claude Code
The customised docs/README.md lists the new multi-page usage documentation structure and intentionally differs from the template. Generated by Claude Code
Revert customised README.md to the template version. The nf-core website auto-discovers subpages via frontmatter order, so the index page only needs to link usage.md and output.md. Generated by Claude Code
adamrtalbot commented Mar 4, 2026
Comment on lines +81 to +129
## Strandedness prediction

If you set the strandedness value to `auto`, the pipeline will sub-sample the input FastQ files to 1 million reads, use Salmon Quant to automatically infer the strandedness, and then propagate this information through the rest of the pipeline. This behaviour is controlled by the `--stranded_threshold` and `--unstranded_threshold` parameters, which are set to 0.8 and 0.1 by default, respectively. This means:

- **Forward stranded:** At least 80% of the fragments are in the 'forward' orientation.
- **Unstranded:** The forward and reverse fractions differ by less than 10%.
- **Undetermined:** Samples that do not meet either criterion, possibly indicating issues such as genomic DNA contamination.

:::note
These thresholds apply to both the strandedness inferred from Salmon outputs for input to the pipeline and how strandedness is inferred from RSeQC results using pipeline outputs.
:::

### Usage examples

1. **Forward Stranded Sample:**
   - Forward fraction: 0.85
   - Reverse fraction: 0.15
   - **Classification:** Forward stranded

2. **Reverse Stranded Sample:**
   - Forward fraction: 0.1
   - Reverse fraction: 0.9
   - **Classification:** Reverse stranded

3. **Unstranded Sample:**
   - Forward fraction: 0.45
   - Reverse fraction: 0.55
   - **Classification:** Unstranded

4. **Undetermined Sample:**
   - Forward fraction: 0.6
   - Reverse fraction: 0.4
   - **Classification:** Undetermined

You can control the stringency of this behaviour with `--stranded_threshold` and `--unstranded_threshold`.

### Errors and reporting

The results of strandedness inference are displayed in the MultiQC report under 'Strandedness Checks'. This shows any provided strandedness and the results inferred by both Salmon (when strandedness is set to 'auto') and RSeQC. Mismatches between input strandedness (explicitly provided by the user or inferred by Salmon) and output strandedness from RSeQC are marked as fails. For example, if a user specifies 'forward' as strandedness for a library that is actually reverse stranded, this is marked as a fail.



Be sure to check the strandedness report when reviewing the QC for your samples.

## Linting

By default, the pipeline will run [fq lint](https://github.com/stjude-rust-labs/fq) on all input FASTQ files, both at the start of preprocessing and after each preprocessing step that manipulates FASTQ files. If errors are found, an error will be reported and the workflow will stop.

The `extra_fqlint_args` parameter can be manipulated to disable [any validator](https://github.com/stjude-rust-labs/fq?tab=readme-ov-file#validators) from `fq` you wish. For example, we have found that checks on the names of paired reads are prone to failure, so that check is disabled by default (setting `extra_fqlint_args` to `--disable-validator P001`).
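
The threshold rules in the quoted strandedness section can be sketched in Python. This is a minimal illustration under stated assumptions, not the pipeline's implementation: the function name `classify_strandedness` is hypothetical, the reverse-stranded rule is assumed to mirror the forward rule (the quoted list does not state it, although the usage examples include a reverse-stranded sample), and the strict "less than" comparison for the unstranded case is taken literally from the quoted wording.

```python
def classify_strandedness(forward, reverse,
                          stranded_threshold=0.8,
                          unstranded_threshold=0.1):
    """Classify a library from its forward and reverse fragment fractions.

    Hypothetical helper; the defaults match the documented defaults of
    --stranded_threshold (0.8) and --unstranded_threshold (0.1).
    """
    if forward >= stranded_threshold:
        return "forward"
    # Assumption: the reverse rule mirrors the forward rule.
    if reverse >= stranded_threshold:
        return "reverse"
    # Assumption: strict comparison, following "differ by less than 10%".
    if abs(forward - reverse) < unstranded_threshold:
        return "unstranded"
    return "undetermined"


if __name__ == "__main__":
    for fwd, rev in [(0.85, 0.15), (0.1, 0.9), (0.48, 0.52), (0.6, 0.4)]:
        print(fwd, rev, classify_strandedness(fwd, rev))
```

The quoted unstranded example (0.45 and 0.55) differs by exactly 10%, so its classification depends on whether the comparison is inclusive. The sketch uses 0.48 and 0.52 to stay clear of that boundary.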
Contributor
Author
This should be migrated to a specific section away from the general usage docs.
Summary
Restructures the monolithic `docs/usage.md` (831 lines) into a Quick Start landing page with 6 focused subpages, organised by user intent rather than pipeline code structure.
Closes #1737
Changes
New structure
Key design principles
DE tutorial refactoring
/workspace/gitpod/...
Writing style
Applied nf-core writing style throughout:
Content verification
`output.md` unchanged
nf-core format compatibility
Additional pages in `docs/usage/` are explicitly supported by nf-core guidelines:
This pattern is already used by nf-core/sarek (`docs/usage/variantcalling/`), nf-core/taxprofiler (`docs/usage/tutorials.md`), and this pipeline's existing DE tutorial.