Experiencing very slow run-time #31

@tcashby

The paper describing PsiCLASS indicates that the software should take 1-5 minutes per sample. However, I'm seeing a much longer runtime: the current run has been going for 15 hours as of this post and still hasn't finished the raw_splice step.

The command used was:

psiclass --lb bam_list.txt -p 24

As an initial test, bam_list.txt contains only five BAM files. The sequencing data is from human plasma cells, aligned to hg38 with HISAT2. The software appears to run quickly until it reaches the IGK region on chromosome 2. I suspect that highly complex regions, such as the immunoglobulin loci in plasma cells, are causing the extended runtime.
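
To check whether read depth in the suspected region really is unusual, something like the following could be run against the same BAM list. This is only a rough sketch: it assumes samtools is installed and that each BAM is coordinate-sorted and indexed. The function name `count_region_reads` is made up for illustration and is not part of PsiCLASS, and the region string in the usage comment is a placeholder, not the verified coordinates of the IGK locus.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count alignments overlapping one region
# in every BAM file listed (one path per line) in a text file.
# Assumes samtools is on PATH and each BAM has an index.
count_region_reads() {
    local bam_list="$1" region="$2" bam
    while IFS= read -r bam || [ -n "$bam" ]; do
        [ -z "$bam" ] && continue   # skip blank lines
        # samtools view -c prints only the number of matching alignments
        printf '%s\t%s\n' "$bam" "$(samtools view -c "$bam" "$region")"
    done < "$bam_list"
}

# Usage (replace START and END with the locus coordinates from an annotation):
# count_region_reads bam_list.txt "chr2:START-END"
```

Comparing these counts with the counts for a similarly sized region elsewhere on chromosome 2 would show whether the immunoglobulin locus holds a disproportionate share of the reads.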

Questions:

  1. Is slowdown around complex regions a known issue?
  2. Are there any settings, optimizations, or preprocessing steps I can use to improve the runtime?
