@LorienLV LorienLV commented Nov 7, 2023

This pull request aims to enhance the compatibility and performance of GenomicsBench on Arm architectures. It introduces several noteworthy changes to these kernels:

  • BSW: Added an SVE-intrinsics version as an alternative to SSE, AVX2, and AVX-512.
  • CHAIN: This kernel is no longer linked to Minimap2, as the dependency is no longer required. This change ensures compatibility on Arm machines.
  • FAST-CHAIN: This is a version of CHAIN that removes all the heuristics so that the kernel can be vectorized. It includes an x86-intrinsics version (SSE, AVX2, AVX-512) and an SVE-intrinsics version. The x86 version of FAST-CHAIN uses 32-bit elements, which can be insufficient for some inputs, so a 64-bit version of the kernel is also provided.
  • KMER-CNT: We have modified the kernel to improve performance and parallel scalability at the cost of 2x more memory utilization. This version is faster not only on Arm but also on x86. The high-memory version can be enabled by passing the --highmem parameter when executing the kernel.
  • NN-VARIANT: Ported the kernel from Clair (TensorFlow) to Clair3 (TensorFlow 2) due to the predominance of Arm-optimized TensorFlow v2. Migrating to Clair3 means that the inputs and the model of the kernel have also changed. The command line added to the execution scripts uses the Oxford Nanopore r941 prom hac g360+g422 pre-trained model, chromosome 20 of HG002 from NIST's Genome in a Bottle (GIAB) project, and the following input regions: region_for_small_input.txt and region_for_lage_input.txt.
  • WFA: This is a new kernel that implements the Wavefront Alignment Algorithm (WFA). It uses the same input format as BSW and applies the same multithreading scheduling.
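
For readers unfamiliar with the wavefront idea behind the WFA kernel, the following is a minimal sketch of it, not the code added in this pull request. It assumes unit edit costs (plain edit distance) rather than the gap-affine penalties WFA is usually run with, and the function name `wfa_edit_distance` is hypothetical. For each score, it stores only the furthest-reaching offset on every diagonal, extends along exact matches for free, and then derives the next wavefront.

```python
def wfa_edit_distance(pattern: str, text: str) -> int:
    """Edit distance via a wavefront scheme (unit costs, illustrative only)."""
    n, m = len(pattern), len(text)
    final_diag = m - n  # diagonal k = h - v that contains the end cell (n, m)

    # wavefront[k] = furthest offset h into `text` reached on diagonal k
    # with the current score; the matching offset into `pattern` is v = h - k.
    wavefront = {0: 0}
    score = 0

    while True:
        # Extend step: slide along each diagonal while characters match.
        for k, h in list(wavefront.items()):
            v = h - k
            while v < n and h < m and pattern[v] == text[h]:
                v += 1
                h += 1
            wavefront[k] = h

        # Done once the end cell is reached on its diagonal.
        if wavefront.get(final_diag, -1) >= m:
            return score

        # Next step: build the wavefront for score + 1.
        nxt = {}
        for k in range(min(wavefront) - 1, max(wavefront) + 2):
            candidates = []
            if k - 1 in wavefront:   # consume one text character
                candidates.append(wavefront[k - 1] + 1)
            if k in wavefront:       # mismatch: consume one of each
                candidates.append(wavefront[k] + 1)
            if k + 1 in wavefront:   # consume one pattern character
                candidates.append(wavefront[k + 1])
            # Keep only offsets that fall inside the alignment matrix.
            valid = [h for h in candidates if 0 <= h <= m and 0 <= h - k <= n]
            if valid:
                nxt[k] = max(valid)
        wavefront = nxt
        score += 1


print(wfa_edit_distance("kitten", "sitting"))
```

The work done grows with the alignment score rather than with the full size of the dynamic-programming matrix, which is why the approach suits highly similar sequence pairs.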
