wfmash step to speed up

### Description of feature

Dear nf-core & pangenome team,

I have a few questions about your great program.

Based on the module definition (https://github.com/nf-core/pangenome/blob/1.0.0/modules/nf-core/wfmash/main.nf), wfmash appears to run the all-vs-all alignment on a single node:

    wfmash \\
        ${fasta_gz} \\
        $query \\
        $query_list \\
        --threads $task.cpus \\
        $paf_mappings \\
        $args > ${prefix}.paf

From my trials, this is indeed the case.

I would like to speed up the wfmash step by running parallel jobs across multiple nodes (PBS Pro). My idea is to run one one-vs-all alignment per node, each aligning a single genome against the full input dataset (120 human pangenomes), and then merge the results into a single PAF file for further analysis.

1. Do you have any recommendations for adapting the wfmash command above to achieve this?
2. If I run one-vs-all alignments on separate nodes, will the merged PAF file be equivalent to the output of a single all-vs-all alignment? In theory, I would expect the final result to be the same.
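To make the idea concrete, here is a minimal Python sketch of what I have in mind. It rests on several assumptions: each genome has already been written to its own FASTA file (the file names are hypothetical), only the positional target/query arguments and `--threads` from the module above are used, and the helper names `build_one_vs_all_commands` and `merge_pafs` are my own. Whether the concatenated output really matches an all-vs-all run is exactly what question 2 asks, so please read this as an illustration rather than a tested workflow.

```python
import shlex
from pathlib import Path


def build_one_vs_all_commands(target_fasta, query_fastas, out_dir, threads=16):
    """Return one wfmash command line per query FASTA (one job per node).

    Assumption: target_fasta holds all genomes and each entry of
    query_fastas holds a single genome. Only the positional target/query
    arguments and --threads are used; any other options would have to be
    added to match the all-vs-all run.
    """
    commands = []
    for query in query_fastas:
        query = Path(query)
        # One PAF per query, named after the query file (hypothetical scheme).
        out_paf = Path(out_dir) / f"{query.name}.paf"
        commands.append(" ".join([
            "wfmash",
            shlex.quote(str(target_fasta)),
            shlex.quote(str(query)),
            "--threads", str(threads),
            ">", shlex.quote(str(out_paf)),
        ]))
    return commands


def merge_pafs(paf_paths, merged_path):
    """Concatenate per-query PAF files and return the number of records.

    PAF is a headerless, line-oriented format, so plain concatenation
    keeps every record. Blank lines are skipped.
    """
    n_records = 0
    with open(merged_path, "w") as out:
        for paf in paf_paths:
            with open(paf) as handle:
                for line in handle:
                    if line.strip():
                        out.write(line if line.endswith("\n") else line + "\n")
                        n_records += 1
    return n_records
```

Each returned command string could be placed in its own PBS Pro job script, and `merge_pafs` would run once all jobs have finished.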

Looking forward to your insights.

Kind regards,

Taek


wfmash step to speed up #205

OZTaekOppa
opened on Aug 1, 2024
