dynast count very slow - may not be parallelized? 

Hi Xiaojie,

Thank you for making the very useful dynast pipeline. We use it a lot in our analysis. One issue I have been running into is that `dynast count` is very slow. I am using `dynast v1.01`, and for a standard deep sequencing run (NextSeq, NovaSeq, etc.) a library takes 4-5 days (96+ hours) to complete. I am wondering whether it is somehow not using the number of threads specified (I specify `-t` between 8 and 16). When I check my running processes using `ps -u`, I sometimes see only 1 process running for `dynast count`, whereas `dynast consensus` shows as many processes as the `-t` value I specified and runs much faster (<8 hours).
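Since `ps -u` lists processes rather than threads, a single process could in principle still be multithreaded, so the thread count may be worth checking as well. Below is a minimal sketch of how that could be done, assuming a Linux machine with `/proc` mounted. `threads_of` is a hypothetical helper name of my own, not part of dynast, and the `pgrep` pattern in the usage comment is only a guess at how the job appears in the process table.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of dynast): print the number of threads
# that a given PID currently has, as reported by the Linux kernel.
# Assumes Linux with /proc mounted.
threads_of() {
  local pid="$1"
  # /proc/<pid>/status contains a line of the form "Threads:\t<n>"
  awk '/^Threads:/ { print $2 }' "/proc/${pid}/status"
}

# Example usage (assumed pattern; adjust to match the running command):
#   pid=$(pgrep -f 'dynast count' | head -n 1)
#   threads_of "$pid"
```

A result of 1 while the job is busy would suggest that step really is running serially. A larger number would only show that threads exist, not that they are all doing work.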

Here is a standard workflow that I use:

```
dynast align -i $star_index_directory -o $align_directory -x 10xv3 $fastq_seq_file $fastq_umi_file -w $whitelist --STAR-overrides '--soloFeatures Gene GeneFull --limitBAMsortRAM 60000000000'
dynast consensus -t 16 -g $gtf_file --barcode-tag CB --umi-tag UB -o consensus Aligned.sortedByCoord.out.bam
dynast count -t 16 -g $gtf_file --barcode-tag CB --umi-tag UB -o count_with_consensus --barcodes Solo.out/GeneFull/raw/barcodes.tsv --conversion TC consensus/consensus.bam
```

Thanks for your help!
William

dynast count very slow - may not be parallelized? #21
