Description of the bug
We tried to run the scrnaseq workflow on some Drop-seq data, but it crashed with this error:
gzip: invalid magic
Looking at this a bit more closely, this was the command that caused it:
# run simpleaf quant
gzip -dcf > whitelist.uncompressed.txt
scrnaseq/modules/local/simpleaf_quant.nf, line 57 in 61d1919
Before running SIMPLEAF_QUANT, the script attempts to decompress a non-existent file. In other words, for any protocol that doesn't provide a whitelist of possible barcodes (such as Drop-seq), the scrnaseq workflow as it stands is bound to fail.
if (params.barcode_whitelist) {
ch_barcode_whitelist = file(params.barcode_whitelist)
} else if (params.protocol.contains("10X")) {
ch_barcode_whitelist = file("$baseDir/assets/whitelist/10x_${chemistry}_barcode_whitelist.txt.gz", checkIfExists: true)
} else {
ch_barcode_whitelist = [] // THIS LOGIC NEEDS FIXING
}
scrnaseq/workflows/scrnaseq.nf, line 82 in 61d1919
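One way the decompression step could be guarded is sketched below. This is not the module's actual code: the function name decompress_whitelist and its argument are hypothetical, standing in for however the staged whitelist file name reaches the script block. The idea is simply to skip gzip when no whitelist was staged.

```shell
# Hypothetical helper: decompress the barcode whitelist only if one was staged.
# $1 is assumed to be the staged whitelist file name, or empty when the
# protocol (e.g. Drop-seq) provides no whitelist.
decompress_whitelist() {
    whitelist="$1"
    if [ -n "$whitelist" ] && [ -f "$whitelist" ]; then
        # -d decompress, -c write to stdout, -f also pass through uncompressed input
        gzip -dcf "$whitelist" > whitelist.uncompressed.txt
    else
        echo "no barcode whitelist staged; skipping decompression" >&2
    fi
    return 0
}

# Drop-seq-like case: nothing staged, so gzip is never invoked.
decompress_whitelist ""
```

With a guard of this kind, the later simpleaf quant call would still need to omit the whitelist argument when whitelist.uncompressed.txt was not created.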
TEMPORARY REMEDY
As the gzip command is hard-coded into the script block (see above), the only way to keep it from failing is to stage a gzip-compressed file via the whitelist option (I uploaded an empty file to S3):
gzip -dcf empty_gzip_file.txt.gz > whitelist.uncompressed.txt
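For anyone wanting to reproduce this, an empty placeholder of that kind can be created locally along these lines. This is a minimal sketch; the file name just mirrors the one in the command above, and uploading the result (to S3 or elsewhere) is left out.

```shell
# Create an empty text file and compress it in place.
: > empty_gzip_file.txt
gzip -f empty_gzip_file.txt        # leaves empty_gzip_file.txt.gz

# Sanity check: the same command the module runs should now succeed
# and produce an empty uncompressed whitelist.
gzip -dcf empty_gzip_file.txt.gz > whitelist.uncompressed.txt
```

The resulting .gz file is a valid gzip archive even though its payload is empty, which is what lets the hard-coded decompression step pass.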
One can then get simpleaf_quant to infer the confident barcodes, e.g. via the --knee method (thanks to @rob-p for the advice!). This skips adding the (non-existent) whitelist to the simpleaf_quant command, which is handled here:
scrnaseq/modules/local/simpleaf_quant.nf, line 34 in 61d1919:
// check if users are using one of the mutually excludable parameters:
Lastly, I had to pass external arguments to simpleaf_quant to enable the knee method and to choose a resolution method, which determines how near-duplicate UMIs are resolved.
(Quoting Rob: "The cr-like method is a safe default I think. That is, it’s not just meant for chromium chemistries, but is a general algorithm. In general, I think the specific method by which similar UMIs should be allocated is still an active area of research.")
Adding the following to the Nextflow config allowed the job to complete successfully:
withName:'SIMPLEAF_QUANT'{
ext.args = "--knee -r cr-like"
}
Thanks, Felix