Input can be a path prefix #24

Merged: nh13 merged 11 commits into main on Nov 27, 2022

Collaborator

@nh13 nh13 commented Nov 15, 2022

The user may provide a path prefix (or directory) in which the undetermined FASTQs live. The file names are assumed to be of the form: <*>_L00#_<R# or I#>_001.fastq.gz. The following can be inferred from this pattern: the lane number, the read number, and whether the FASTQ is a template or index read.

If no read structure is given on either the command line or in the Sample Sheet, then a new list of read structures is built, one per input FASTQ, where the kind (segment type) is inferred from the file name and all bases are used. E.g.

test_L001_I1.fastq.gz
test_L001_R1.fastq.gz
test_L001_R2.fastq.gz
test_L001_I2.fastq.gz

would yield the read structures +B +T +T +B. Reads that consist entirely of molecular barcodes (e.g. test_L001_U1.fastq.gz) are also supported, as are reads that consist entirely of bases to skip (e.g. test_L001_S1.fastq.gz). The utility of the latter is questionable, but it is kept for completeness.
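
To make the inference above concrete, here is a minimal sketch of how the kind could be read off a file name. It is not the code in this PR: `Kind`, `FastqName`, `parse_fastq_name`, and `infer_read_structures` are hypothetical names, the `M` letter for molecular barcodes is assumed from the usual read structure notation, and the trailing `_001` chunk is treated as optional only because the example names above omit it.

```rust
/// Segment kinds (hypothetical enum for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Template,
    SampleBarcode,
    MolecularBarcode,
    Skip,
}

impl Kind {
    /// Read structure letter for this kind ('M' is an assumption).
    fn letter(self) -> char {
        match self {
            Kind::Template => 'T',
            Kind::SampleBarcode => 'B',
            Kind::MolecularBarcode => 'M',
            Kind::Skip => 'S',
        }
    }
}

/// What can be inferred from one FASTQ file name.
#[derive(Debug, PartialEq, Eq)]
struct FastqName {
    lane: u32,
    kind: Kind,
    number: u32,
}

/// Parses names such as `test_L001_R1.fastq.gz` or `test_L001_R1_001.fastq.gz`.
/// Returns `None` when the name does not follow the assumed pattern.
fn parse_fastq_name(name: &str) -> Option<FastqName> {
    let stem = name.strip_suffix(".fastq.gz")?;
    let mut fields: Vec<&str> = stem.split('_').collect();
    // Drop the trailing "001" chunk if it is present.
    if fields.last().copied() == Some("001") {
        fields.pop();
    }
    let read_field = fields.pop()?;
    let lane_field = fields.pop()?;
    // Something must remain to act as the prefix.
    if fields.is_empty() {
        return None;
    }
    let lane = lane_field.strip_prefix('L')?.parse::<u32>().ok()?;
    let mut chars = read_field.chars();
    let kind = match chars.next()? {
        'R' => Kind::Template,
        'I' => Kind::SampleBarcode,
        'U' => Kind::MolecularBarcode,
        'S' => Kind::Skip,
        _ => return None,
    };
    let number = chars.as_str().parse::<u32>().ok()?;
    Some(FastqName { lane, kind, number })
}

/// Builds one "all bases" read structure per file, in the order given.
fn infer_read_structures(names: &[&str]) -> Option<Vec<String>> {
    names
        .iter()
        .map(|name| parse_fastq_name(name).map(|f| format!("+{}", f.kind.letter())))
        .collect()
}
```

Passing the four file names from the example, in the order listed, gives `+B`, `+T`, `+T`, `+B`. The sketch keeps the input order and does not sort by read number.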

Improvement: all sample barcode read segments must have a fixed length (for index hopping metrics collection and some validation). So, for each FASTQ whose read structure contains a variable length sample barcode, the first read is now read from that FASTQ to determine the read length, which is then used to fix the length of the sample barcode segment. I don't believe that a read structure like +B +T +T +B previously worked!
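
As a rough illustration of the length-fixing step (again a sketch, not the PR's implementation): `first_read_length` and `fix_sample_barcode` are hypothetical helpers, the input is assumed to be already decompressed FASTQ text, and only the single-segment case `+B` is handled.

```rust
use std::io::{self, BufRead};

/// Returns the sequence length of the first record in a FASTQ stream,
/// or `None` if the stream holds no complete header + sequence pair.
/// Assumes plain (already decompressed) four-line FASTQ records.
fn first_read_length<R: BufRead>(reader: R) -> io::Result<Option<usize>> {
    let mut lines = reader.lines();
    // Line 1 of the record is the header; read it and discard it.
    match lines.next() {
        Some(header) => {
            header?;
        }
        None => return Ok(None),
    }
    // Line 2 of the record is the sequence.
    match lines.next() {
        Some(seq) => Ok(Some(seq?.trim_end().len())),
        None => Ok(None),
    }
}

/// Replaces a variable length sample barcode ("+B") with a fixed length
/// one, e.g. "8B". Any other read structure is returned unchanged.
fn fix_sample_barcode(read_structure: &str, read_len: usize) -> String {
    if read_structure == "+B" {
        format!("{}B", read_len)
    } else {
        read_structure.to_string()
    }
}
```

With a first record whose sequence is 8 bases long, `+B` becomes `8B`, while `+T` is left as-is. A multi-segment structure that ends in `+B` would additionally need the lengths of the preceding fixed segments subtracted, which this sketch omits.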

@nh13 nh13 changed the base branch from main to feature/sample-sheet-output-to-project November 15, 2022 06:49
@nh13 nh13 marked this pull request as ready for review November 15, 2022 18:17
@nh13 nh13 requested a review from tfenne November 15, 2022 18:17
@nh13 nh13 force-pushed the feature/sample-sheet-output-to-project branch from 5fc25d3 to c36929a Compare November 15, 2022 18:50
@nh13 nh13 force-pushed the feature/path-prefix-as-input-infer-kind branch from d5d9495 to 639692c Compare November 15, 2022 18:52
Base automatically changed from feature/sample-sheet-output-to-project to main November 15, 2022 18:57
Contributor

@tfenne tfenne left a comment

A few minor comments and suggestions, but good to merge once you're happy.

@nh13 nh13 force-pushed the feature/path-prefix-as-input-infer-kind branch from 372ca33 to 7542c8d Compare November 17, 2022 16:22
@nh13 nh13 requested a review from omicsorama November 21, 2022 22:50
@nh13 nh13 merged commit ddc5928 into main Nov 27, 2022
@nh13 nh13 deleted the feature/path-prefix-as-input-infer-kind branch November 27, 2022 17:27
3 participants