Fix whitelist logic for Drop-seq #263

@FelixKrueger

Description of the bug

We tried to run the scrnaseq workflow on some Drop-seq data, but it crashed with this error:

gzip: invalid magic

Looking at this a bit more closely, this was the command that caused it:

# run simpleaf quant
gzip -dcf  > whitelist.uncompressed.txt

Before running SIMPLEAF_QUANT, the script attempts to decompress a whitelist whose path is empty, so gzip is never given an input file. In other words, for any protocol that doesn't provide a whitelist of possible barcodes (such as Drop-seq), the scrnaseq workflow as it stands is guaranteed to fail.
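One possible way to avoid the failure is to guard the decompression step. The following is only a minimal shell sketch, not the pipeline's actual script block: the function name `prepare_whitelist` and the variable `wl_path` are hypothetical, and `wl_path` stands in for whatever value Nextflow would interpolate into the command.

```shell
#!/bin/sh
set -eu

# Hypothetical helper (not part of the pipeline): decompress the whitelist
# only when a path was actually supplied. For protocols such as Drop-seq the
# path is empty, so the gzip step is skipped entirely.
prepare_whitelist() {
    wl_path="${1:-}"
    if [ -n "$wl_path" ]; then
        # -f lets gzip pass through input that is already uncompressed
        gzip -dcf "$wl_path" > whitelist.uncompressed.txt
        echo "whitelist.uncompressed.txt"
    else
        # No whitelist: print an empty line so the caller can tell that the
        # whitelist option should be left out of the quant command
        echo ""
    fi
}
```

The caller would then add the whitelist to the quant command only when the function prints a non-empty file name.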

if (params.barcode_whitelist) {
    ch_barcode_whitelist = file(params.barcode_whitelist)
} else if (params.protocol.contains("10X")) {
    ch_barcode_whitelist = file("$baseDir/assets/whitelist/10x_${chemistry}_barcode_whitelist.txt.gz", checkIfExists: true)
} else {
    ch_barcode_whitelist = [] // THIS LOGIC NEEDS FIXING
}

TEMPORARY REMEDY

As the gzip command is hard-coded into the script block (see above), the only way to keep it from failing is to stage a gzip-compressed file via the whitelist option (I uploaded an empty file to S3):

gzip -dcf empty_gzip_file.txt.gz > whitelist.uncompressed.txt
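For anyone who wants to reproduce the workaround, an empty gzip-compressed placeholder can be created locally along these lines. This is a sketch only: the file name matches the command above, while uploading the file to S3 (or any other location the pipeline can reach) is left out.

```shell
#!/bin/sh
set -eu

# Compress empty input to get a valid gzip file with no content
gzip -c < /dev/null > empty_gzip_file.txt.gz

# The hard-coded step from the script block now succeeds and
# produces an empty uncompressed whitelist
gzip -dcf empty_gzip_file.txt.gz > whitelist.uncompressed.txt
```

The placeholder itself is not zero bytes, because it still carries the gzip header and trailer; only the decompressed output is empty.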

One can then get simpleaf_quant to infer the confident barcodes, e.g. via the --knee method (thanks to @rob-p for advice!). This will then skip adding the (non-existent) whitelist to the simpleaf_quant command, achieved here:

// check if users are using one of the mutually excludable parameters:

Lastly, I had to pass the following external arguments to simpleaf_quant: the knee option, plus a resolution method that determines how near-duplicate UMIs are resolved. Quoting Rob:

The cr-like method is a safe default I think. That is, it’s not just meant for chromium chemistries, but is a general algorithm. In general, I think the specific method by which similar UMIs should be allocated is still an active area of research.

Adding the following to the Nextflow config allowed the job to complete successfully:

process {
    // withName selectors belong inside the process scope of the config
    withName: 'SIMPLEAF_QUANT' {
        ext.args = "--knee -r cr-like"
    }
}

Thanks, Felix

Command used and terminal output

No response

Relevant files

No response

System information

No response

Labels: bug, enhancement
