Description of the bug
I ran into an error in the SummarizedExperiment process
`NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SE_TRANSCRIPT (all_samples)`.
The error message comes from R:
Error in findColumnWithAllEntries(ids, metadata) :
No column contains all vector entries
I tracked it down to this line in the parse_metadata function of the R script:
metadata_id_col <- findColumnWithAllEntries(ids, metadata)
I had used hyphens in my sample names, but the ids passed to findColumnWithAllEntries have every hyphen replaced with '.',
e.g. "D10-D_Na-R1" becomes "D10.D_Na.R1".
This looks like it comes from the Salmon output: the column names in salmon.merged.transcript_counts.tsv, which the R script uses to set the ids variable, carry the altered sample names, so they no longer match any column of the metadata.
The easy workaround is to change the names in the sample sheet so they contain no hyphens.
But it might be useful to add another check when the sample sheet is first parsed, to catch this right out of the gate.
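As a rough illustration of the kind of check suggested above, here is a minimal Python sketch that flags sample names likely to be rewritten. It assumes the renaming follows the rules of R's `make.names()` (characters other than letters, digits, `.` and `_` become `.`, and an `X` is prefixed when the name does not start with a letter), which is not confirmed for this pipeline. The function names `r_style_name` and `find_unsafe_sample_names` are hypothetical and not part of nf-core/rnaseq.

```python
import re


def r_style_name(name: str) -> str:
    """Approximate R's make.names() for ASCII names (assumption, see lead-in).

    Reserved words and locale-specific letters are not handled here.
    """
    # Any character that is not a letter, digit, dot or underscore becomes '.'.
    fixed = re.sub(r"[^A-Za-z0-9._]", ".", name)
    # Names must start with a letter, or with a dot not followed by a digit;
    # otherwise 'X' is prepended.
    if not re.match(r"^([A-Za-z]|\.(?![0-9]))", fixed):
        fixed = "X" + fixed
    return fixed


def find_unsafe_sample_names(names):
    """Return {original: rewritten} for every name that would be changed."""
    return {n: r_style_name(n) for n in names if r_style_name(n) != n}


if __name__ == "__main__":
    # Hypothetical sample sheet contents; the first name is from this report.
    samples = ["D10-D_Na-R1", "WT_REP1"]
    for original, rewritten in find_unsafe_sample_names(samples).items():
        print(f"Sample name '{original}' would become '{rewritten}'")
```

Running a check like this when the sample sheet is parsed would let the pipeline warn or stop early, rather than failing late in the SummarizedExperiment step.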
Command used and terminal output
#!/bin/bash
#SBATCH --job-name=fashe
#SBATCH -p barc
#SBATCH -t 12:00:00
#SBATCH --mem=8G
#SBATCH -o log/rna-%j.out
#SBATCH -e log/rna-%j.err
if [ ! -d log ]; then
mkdir log
fi
module load nextflow
# using the dev branch because of gzip bug that's been fixed
nextflow run nf-core/rnaseq \
-profile unc_longleaf \
-params-file conf/rnaseq_params.yaml \
-r dev
Relevant files
No response
System information
Nextflow 24.04.2
HPC
Slurm
Singularity
Rhel8
nf-core/rnaseq dev branch