wfmash step to speed up #205

Open

Description of feature

Dear nf-core & pangenome team,

I have a few questions about your great program.

Based on the link (https://github.com/nf-core/pangenome/blob/1.0.0/modules/nf-core/wfmash/main.nf), it appears that wfmash performs all-vs-all alignment on a single node.

wfmash \\
    ${fasta_gz} \\
    $query \\
    $query_list \\
    --threads $task.cpus \\
    $paf_mappings \\
    $args > ${prefix}.paf

From my trials, this is indeed the case.

I am trying to speed up the wfmash step by running parallel jobs across multiple nodes (PBS Pro). My idea is to split the full input dataset (120 human pangenomes) so that each node runs a one-vs-all alignment for one genome, and then merge the results into a single PAF file for further analysis.

  1. Do you have any recommendations for tweaking the wfmash code to achieve this?
  2. If I run one-vs-all alignments on each node, will the merged PAF file be equivalent to the output of a single all-vs-all run? In theory, I would expect the final result to be the same.
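
To make question 2 concrete, here is a minimal Python sketch of the split-and-merge idea. It checks only that the one-vs-all jobs together cover the same (query, target) pairs as an all-vs-all run, and that per-job PAF outputs can be concatenated because PAF has no header. The genome names, the helper functions, and the exclusion of self-comparisons are my own assumptions for illustration, and nothing here calls wfmash. Whether wfmash's mapping filters give identical records when run per query is exactly what I am unsure about.

```python
from itertools import permutations


def all_vs_all_pairs(genomes):
    """Every ordered (query, target) pair, excluding self-comparisons (assumption)."""
    return set(permutations(genomes, 2))


def one_vs_all_jobs(genomes):
    """One hypothetical job per genome: that genome as query against all others."""
    return {q: [t for t in genomes if t != q] for q in genomes}


def merge_paf(chunks):
    """Concatenate per-job PAF text; PAF is headerless, so records are simply appended."""
    lines = []
    for chunk in chunks:
        # Skip blank lines so empty job outputs do not add empty records.
        lines.extend(line for line in chunk.splitlines() if line.strip())
    return "\n".join(lines) + ("\n" if lines else "")


# Hypothetical genome names standing in for the real dataset.
genomes = ["g1", "g2", "g3", "g4"]
jobs = one_vs_all_jobs(genomes)
covered = {(q, t) for q, targets in jobs.items() for t in targets}

# True when the per-node jobs cover exactly the all-vs-all pair set.
same_pairs = covered == all_vs_all_pairs(genomes)
print(same_pairs)
```

With N genomes this gives N jobs of N-1 comparisons each, so the pair coverage matches; the sketch says nothing about whether the alignment records themselves would match.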

Looking forward to your insights.

Kind regards,

Taek

Labels: enhancement (Improvement for existing functionality)