1. What were you trying to do?
I was trying to map paired-end reads from .fastp.fastq.gz files to a tomato pangenome graph using vg giraffe.
Command structure (called inside a Bash job submitted via bsub):
vg giraffe -t 48 -Z graph.gbz -d graph.dist -m graph.min \
-f sample_R1.fastp.fastq.gz -f sample_R2.fastp.fastq.gz \
-N SAMPLE -R SAMPLE
2. What did you want to happen?
I expected vg giraffe to process the reads and output a .gam file with the alignments.
3. What actually happened?
In many jobs, the process stalls indefinitely. Symptoms include:
- The .gam file remains empty (0 bytes) even after more than an hour.
- The vg giraffe process appears in ps output with status Dl (uninterruptible sleep, multi-threaded), which usually indicates it is blocked on I/O.
- In some cases, vg giraffe hangs immediately after start; in others, it starts writing but then stops progressing.
- Manual inspection with lsof confirms that both FASTQ files are open.
The files are on a GPFS (IBM Spectrum Scale) filesystem, but reading them manually with zcat is instantaneous.
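To show which threads are stuck, the per-thread states of the stalled process could be summarised from /proc. This is only a sketch, assuming a Linux node with /proc mounted; the function names thread_states and count_d_threads are hypothetical helpers and not part of vg.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of vg) for inspecting a stalled process.
# They read the standard Linux /proc/<pid>/task/<tid>/status files.

# Print a tally of thread states (R, S, D, ...) for the given PID.
thread_states() {
    local pid="$1" status
    for status in /proc/"$pid"/task/*/status; do
        [ -r "$status" ] || continue
        # The State line looks like: "State:  D (disk sleep)"
        awk '/^State:/ { print $2 }' "$status"
    done | sort | uniq -c
}

# Print how many threads of the given PID are in uninterruptible sleep (D).
count_d_threads() {
    local pid="$1" n=0 status
    for status in /proc/"$pid"/task/*/status; do
        [ -r "$status" ] || continue
        if grep -q '^State:[[:space:]]*D' "$status"; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example (PID is whatever ps reports for the stalled vg giraffe):
#   thread_states 12345
#   count_d_threads 12345
```

If most threads are sleeping (S) and only one or two are in D, that would point at the reader threads rather than the mapping workers, but that interpretation is a guess until checked.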
4. If you got a line like Stack trace path: /somewhere/on/your/computer/stacktrace.txt, please copy-paste the contents of that file here:
No crash or stack trace is produced, only hanging.
5. What data and command can the vg dev team use to make the problem happen?
While I cannot share the full tomato data due to size/privacy, the problem appears reproducible when:
- Reading gzipped FASTQ files larger than 5 GB each (paired-end).
- Files are stored on a GPFS filesystem.
- The command is run with a high thread count (e.g. -t 48).
- The command is launched inside a batch job (e.g. via bsub).
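Since the data cannot be shared, one way to rule out the input files themselves is to decompress each one end to end outside vg and compare record counts between the two mates. The following is a minimal sketch, assuming standard four-line FASTQ records; check_fastq_gz is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of vg): verify a gzipped FASTQ and count records.
# Assumes every record is exactly four lines.
check_fastq_gz() {
    local f="$1"
    # gzip -t reads the whole file and fails on truncated or corrupt data.
    if ! gzip -t -- "$f"; then
        echo "corrupt or truncated: $f" >&2
        return 1
    fi
    # Decompress fully and print the number of four-line records.
    gzip -cd -- "$f" | awk 'END { print int(NR / 4) }'
}

# Example:
#   check_fastq_gz sample_R1.fastp.fastq.gz
#   check_fastq_gz sample_R2.fastp.fastq.gz
```

If both files pass and report the same count when read from GPFS inside a bsub job, the input is probably fine and the stall is more likely in how vg giraffe reads it under these conditions.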
6. What does running vg version say?
> vg version
vg [warning]: System's vm.overcommit_memory setting is 2 (never overcommit). vg does not work well under these conditions; you may appear to run out of memory with plenty of memory left. Attempting to unsafely reconfigure jemalloc to deal better with this situation.
vg version v1.66.0 "Navetta"
Compiled with g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 on Linux
Linked against libstd++ 20230528
Using HTSlib headers 101990, library 1.19.1-29-g3cfe8769
Built by fokamoto@mustard