Non-transposed dense oracular extractors now directly fill the cache.

This uses a shared memory pool where the existing slabs are defragmented
to make a contiguous allocation of available cache memory. This is used
in a single call to the HDF5 library with hyperslab unions for the new
slabs to be populated. The aim is to reduce the overhead from separate
calls to the HDF5 library, a la the primary sparse extractors.
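
As a rough illustration of the defragmentation step, the sketch below
compacts the retained slabs to the front of a shared pool so that the
remaining space is one contiguous region. The `Slab` struct and the
`defragment` function are hypothetical names chosen for this example and
are not taken from the actual change; no HDF5 calls are made here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical descriptor for a slab that is already in the pool and must be kept.
struct Slab {
    std::size_t offset; // position of the slab's first element in the pool
    std::size_t length; // number of elements in the slab
};

// Moves all retained slabs to the front of the pool, updating their offsets,
// and returns the offset of the first free element after compaction.
template<typename Value_>
std::size_t defragment(std::vector<Value_>& pool, std::vector<Slab>& kept) {
    // Visit slabs by increasing offset so that every move goes towards the
    // front of the pool and never overwrites a slab that has not moved yet.
    std::vector<Slab*> order;
    order.reserve(kept.size());
    for (auto& s : kept) {
        order.push_back(&s);
    }
    std::sort(order.begin(), order.end(), [](const Slab* a, const Slab* b) -> bool {
        return a->offset < b->offset;
    });

    std::size_t dest = 0;
    for (auto* s : order) {
        assert(s->offset >= dest); // assumes the slabs do not overlap
        assert(s->offset + s->length <= pool.size());
        if (s->offset != dest) {
            // The destination starts before the source, so a forward copy is safe
            // even when the two ranges overlap.
            auto first = pool.begin() + s->offset;
            std::copy(first, first + s->length, pool.begin() + dest);
            s->offset = dest;
        }
        dest += s->length;
    }
    return dest;
}
```

Everything from the returned offset to the end of the pool is then free,
so it could serve as the single destination buffer for one read that
covers the union of the new slabs.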

Note that this optimization is only available when the target dimension
corresponds to HDF5 rows. If transposition is required, we do separate
calls for each slab, as we can't afford to hold all slabs in a separate
buffer before performing the transposition into the cache pool.
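
The per-slab fallback might look something like the sketch below, which
reuses one temporary buffer sized for the largest slab instead of a
buffer for all slabs. `SlabRequest`, `transpose_slab`, `fill_transposed`
and the `read_slab` callback are hypothetical names for this example;
the callback stands in for one call to the HDF5 library and is not a
real HDF5 function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one slab to be loaded into the cache pool.
struct SlabRequest {
    std::size_t pool_offset; // where the transposed slab starts in the pool
    std::size_t nrow;        // rows of the slab as read from the file
    std::size_t ncol;        // columns of the slab as read from the file
};

// Transposes a row-major nrow x ncol slab in 'src' into 'dest',
// which is laid out as a row-major ncol x nrow slab.
template<typename Value_>
void transpose_slab(const Value_* src, std::size_t nrow, std::size_t ncol, Value_* dest) {
    for (std::size_t r = 0; r < nrow; ++r) {
        for (std::size_t c = 0; c < ncol; ++c) {
            dest[c * nrow + r] = src[r * ncol + c];
        }
    }
}

// Loads each slab with its own read, transposing it into the pool.
// 'read_slab(i, ptr)' is assumed to fill 'ptr' with slab 'i' in row-major order.
template<typename Value_, class Read_>
void fill_transposed(std::vector<Value_>& pool, const std::vector<SlabRequest>& slabs, Read_ read_slab) {
    // Only one temporary buffer is needed, large enough for the biggest slab.
    std::size_t largest = 0;
    for (const auto& s : slabs) {
        largest = std::max(largest, s.nrow * s.ncol);
    }
    std::vector<Value_> buffer(largest);

    for (std::size_t i = 0; i < slabs.size(); ++i) {
        const auto& s = slabs[i];
        assert(s.pool_offset + s.nrow * s.ncol <= pool.size());
        read_slab(i, buffer.data()); // one separate read per slab
        transpose_slab(buffer.data(), s.nrow, s.ncol, pool.data() + s.pool_offset);
    }
}
```

The cost is one read per slab, but the extra memory stays bounded by the
size of a single slab rather than the total size of all new slabs.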

As a result, the oracular DenseMatrix extractors are split into two
subclasses depending on whether the extraction requires transposition.
We do the same to all other DenseMatrix extractors for consistency.
LTLA committed Sep 25, 2024
1 parent 292a202 commit ef4d3fd
Showing 1 changed file with 226 additions and 73 deletions.