Feature/morton hashing #2158

fluidnumerics-joe · 2025-08-23T19:16:53Z

[ X ] Chose the correct base branch (main for v3 changes, v4-dev for v4 changes)
[ X ] Fixes Spatial hashing and curvilinear search performance improvements #2144

Overview

This PR brings in a fairly substantial change to the spatial hashing. While experimenting with the spatial hashing in the MOi benchmark, we found that a large amount of time was spent in initialization. Further, the search times did not improve on the naive implementation for searching in curvilinear grids. This was traced to a few key issues:

1. The chosen hash function, which relates hash keys to a short list of faces/cells in the curvilinear grid, produced long searches in areas where the curvilinear grid approaches the poles (e.g. over Canada and Siberia in the tripolar grid).
2. The list-of-lists structure for the hash table required a large number of append operations (within nested for loops) during hash table creation. Attempts to pre-allocate and use a CSR format for the hash table were hampered by point (1): the large number of overlaps between curvilinear cells and the hash grid resulted in prohibitively large allocations.
3. Querying was difficult to vectorize with the list-of-lists hash table, and computing barycentric coordinates for the particle-in-cell check was found to be expensive.
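To make the difference between the two table layouts in point (2) concrete, here is a minimal sketch with toy data. The names `keys`, `cells`, `n_buckets` and `bucket` are illustrative only and are not taken from parcels/spatialhash.py.

```python
import numpy as np

# Hypothetical toy data: each (key, cell) pair means "cell overlaps hash bucket key".
keys = np.array([2, 0, 2, 1, 0, 2])
cells = np.array([10, 11, 12, 13, 14, 15])
n_buckets = 3

# List-of-lists table: one Python-level append per (key, cell) pair.
table = [[] for _ in range(n_buckets)]
for k, c in zip(keys, cells):
    table[k].append(int(c))

# CSR table: a stable sort groups the pairs by key, and a single offsets
# array marks where each bucket starts and ends in the flat values array.
order = np.argsort(keys, kind="stable")
values = cells[order]
counts = np.bincount(keys, minlength=n_buckets)
offsets = np.concatenate(([0], np.cumsum(counts)))


def bucket(k):
    """Return the cells stored in bucket k of the CSR table."""
    return values[offsets[k]:offsets[k + 1]]
```

Both layouts hold the same information, but the CSR form is two flat arrays that can be built and queried with array operations. Its size is the total number of (key, cell) pairs, which is why many overlaps per cell make it expensive.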

This PR addresses these issues by changing the hashing strategy. The "hash function" takes a position on the grid and returns an integer value that correlates with spatial locality. In other words, the hash values of two nearby points in space should be integers that are close to each other in value.

Morton encoding for hash table construction

Here, we opt to use Morton encoding of three floating point values to compute the hash of a position. In spherical meshes, the latitude and longitude points are converted to x,y,z locations on the unit sphere. On "flat" meshes, z is fixed to zero and x and y are taken as is, with no conversions. The hash table is constructed by relating the (j,i) indices of the curvilinear grid to the Morton code of the centroid of each curvilinear cell. The hash table is quick to calculate; it is stored as a compressed sparse row (CSR) matrix and sorted in order of increasing Morton code. This last step (sorting) adds some cost to the initialization but saves considerable time in the querying stage, since np.searchsorted (a binary search) can be used instead of a linear search.
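As a rough illustration of the construction described above, the sketch below quantizes unit-sphere coordinates to 10 bits per axis, interleaves them into a 30-bit Morton code using the usual bit-dilation masks, and sorts the cell centroids by code. The bit width, the function signatures and the toy 4 x 8 grid are assumptions made for this example; the implementation in parcels/spatialhash.py may differ.

```python
import numpy as np


def dilate_bits(v):
    """Spread the low 10 bits of v so that two zero bits follow each original bit."""
    v = np.asarray(v, dtype=np.uint32) & np.uint32(0x3FF)
    v = (v | (v << np.uint32(16))) & np.uint32(0x030000FF)
    v = (v | (v << np.uint32(8))) & np.uint32(0x0300F00F)
    v = (v | (v << np.uint32(4))) & np.uint32(0x030C30C3)
    v = (v | (v << np.uint32(2))) & np.uint32(0x09249249)
    return v


def quantize(a, lo, hi):
    """Map floats in [lo, hi] to integers in [0, 1023]."""
    scaled = (np.asarray(a, dtype=float) - lo) * (1023.0 / (hi - lo))
    return np.clip(scaled, 0, 1023).astype(np.uint32)


def morton_code(x, y, z, lo=-1.0, hi=1.0):
    """Interleave the quantized x, y, z bits into one 30-bit integer."""
    xd = dilate_bits(quantize(x, lo, hi))
    yd = dilate_bits(quantize(y, lo, hi))
    zd = dilate_bits(quantize(z, lo, hi))
    return xd | (yd << np.uint32(1)) | (zd << np.uint32(2))


def lonlat_to_xyz(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to points on the unit sphere."""
    lon, lat = np.deg2rad(lon_deg), np.deg2rad(lat_deg)
    return np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat)


# Hypothetical cell centroids on a 4 x 8 (j, i) lon/lat grid.
lon_c, lat_c = np.meshgrid(np.linspace(-157.5, 157.5, 8),
                           np.linspace(-67.5, 67.5, 4))
codes = morton_code(*lonlat_to_xyz(lon_c.ravel(), lat_c.ravel()))

# Sort once at construction so that queries can use a binary search.
order = np.argsort(codes, kind="stable")
sorted_codes = codes[order]
j_sorted, i_sorted = np.unravel_index(order, lon_c.shape)
```

For a flat mesh, the same `morton_code` could be called with z set to zero and with `lo` and `hi` taken from the domain bounds.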

A comparison of the construction time for the MOi benchmark is shown below for v4-dev and the new proposed morton encoding spatial hash function.

Morton encoding for particle position queries

With the hash table as a CSR matrix, it's a bit easier to vectorize queries over a large number of particles. Essentially, the Morton codes of all particle positions are calculated and searched (using np.searchsorted) against the hash table. This returns, for each particle, a short list of (j,i) indices to check. We then perform distance minimization between each particle and its list of curvilinear cells, where the distance is measured from the curvilinear cell centroid. This avoids the need for barycentric coordinate calculation. At the end, the method returns, as before, a list of (j,i) indices, one for each particle position, where the particle is found.
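The query steps can be sketched as follows for a flat 2D mesh. To keep the example self-contained it uses a simple loop-based 2D Morton code and takes a fixed window of entries around the np.searchsorted position as the candidate list. The window, the 8-bit quantization and all function names here are assumptions for illustration, not the candidate selection used in the PR.

```python
import numpy as np


def morton2d(ix, iy, bits=8):
    """Interleave the low `bits` bits of ix and iy (plain loop, for clarity)."""
    ix = np.asarray(ix, dtype=np.uint32)
    iy = np.asarray(iy, dtype=np.uint32)
    code = np.zeros(ix.shape, dtype=np.uint32)
    for b in range(bits):
        code |= ((ix >> np.uint32(b)) & np.uint32(1)) << np.uint32(2 * b)
        code |= ((iy >> np.uint32(b)) & np.uint32(1)) << np.uint32(2 * b + 1)
    return code


def encode(x, y):
    """Quantize x, y in [0, 1] to 8 bits each and return the Morton code."""
    qx = np.clip(np.asarray(x, dtype=float) * 255, 0, 255).astype(np.uint32)
    qy = np.clip(np.asarray(y, dtype=float) * 255, 0, 255).astype(np.uint32)
    return morton2d(qx, qy)


# Hypothetical 16 x 16 grid of cell centroids on the unit square.
nj, ni = 16, 16
xc, yc = np.meshgrid((np.arange(ni) + 0.5) / ni, (np.arange(nj) + 0.5) / nj)
codes = encode(xc.ravel(), yc.ravel())
order = np.argsort(codes, kind="stable")
sorted_codes = codes[order]
cx, cy = xc.ravel()[order], yc.ravel()[order]
j_sorted, i_sorted = np.unravel_index(order, xc.shape)


def query(px, py, window=8):
    """Return (j, i) of the nearest candidate centroid for each particle."""
    px = np.asarray(px, dtype=float)
    py = np.asarray(py, dtype=float)
    # Binary search of each particle's code in the sorted table.
    pos = np.searchsorted(sorted_codes, encode(px, py))
    # Candidate entries: a fixed window around the insertion point (assumption).
    cand = np.clip(pos[:, None] + np.arange(-window, window + 1),
                   0, sorted_codes.size - 1)
    # Squared distance to each candidate centroid, then pick the closest one.
    d2 = (px[:, None] - cx[cand]) ** 2 + (py[:, None] - cy[cand]) ** 2
    pick = cand[np.arange(px.size), np.argmin(d2, axis=1)]
    return j_sorted[pick], i_sorted[pick]
```

Every step operates on whole arrays of particles. Note that a fixed window is a heuristic: neighbours in Morton order are usually, but not always, neighbours in space, so a real implementation needs a more careful candidate selection.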

Below is a figure comparing the original v4-dev naive search, the previous spatial hashing search, and the new Morton spatial hashing for the MOi benchmark across a range of particle counts. Note that the new Morton hashing search times include the spatial hash reconstruction, which is minimal (~2s on a System76 Lemur with NVMe drives).

The above figure shows improved scaling in the Morton hash based search functionality. However, the 'v4 search indices' and 'v4 hash table' runtimes were generated on @erikvansebille's workstation and the 'v4 morton' runtimes on mine, which makes direct comparison challenging.

The figure below shows the spatial hashing search (what's currently on v4-dev) and the proposed "v4 morton" implementation (x-axis is log scale).

erikvansebille

This is fantastic, @fluidnumerics-joe! The performance improvements are amazing!! For 1M particles, I now get a time of 2 seconds, instead of the 600 seconds with the spatial hash map 🤩.

What a great leap towards a fast v4!

Is this implementation also useful for unstructured grids?

parcels/spatialhash.py

Co-authored-by: Erik van Sebille <e.vansebille@uu.nl>

fluidnumerics-joe · 2025-08-25T11:19:31Z

Is this implementation also useful for unstructured grids?

I want to follow up on this after this PR. I suspect yes. This SpatialHash class just requires face centroids, which are easy to get from uxarray.

The query method as it is written right now works with (j,i) indices. This would need to be adapted to work with one-dimensional face indices. It shouldn't be too hard to do.
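One possible way to bridge the two index conventions, offered only as an illustration and not as the planned design, is to store a single flattened face index per cell and convert to (j,i) only for structured grids. The grid shape and index values below are hypothetical.

```python
import numpy as np

nj, ni = 4, 6  # hypothetical structured grid shape
j = np.array([0, 2, 3])
i = np.array([5, 1, 0])

# Structured (j, i) pairs map to a single face index (row-major order) ...
face = np.ravel_multi_index((j, i), (nj, ni))

# ... and back again, so a table of face indices can serve both grid types.
j_back, i_back = np.unravel_index(face, (nj, ni))
```

An unstructured grid would use `face` directly, with no conversion step.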

erikvansebille · 2025-08-25T11:42:37Z

I want to follow up on this after this pr. I suspect yes.

Yes, fully agree that this should be a separate PR

Co-authored-by: Erik van Sebille <e.vansebille@uu.nl>

fluidnumerics-joe added 4 commits August 21, 2025 17:00

First draft of morton-spatial-hashing

d568fda

Add search for minimum centroid distance for j,i search

e0d3c36

Relax exact match on morton code

59df08a

Fix bug in vectorized spatialhash search

6838ae9

fluidnumerics-joe requested review from VeckoTheGecko and erikvansebille August 23, 2025 19:16

github-project-automation bot added this to Parcels development Aug 23, 2025

github-project-automation bot moved this to Backlog in Parcels development Aug 23, 2025

fluidnumerics-joe added 2 commits August 23, 2025 17:17

Merge remote-tracking branch 'origin/v4-dev' into feature/morton-hashing

ca426cd

Fix reference to _source_grid._mesh attribute

f53003a

erikvansebille reviewed Aug 25, 2025

parcels/spatialhash.py (resolved)

parcels/spatialhash.py (resolved)

parcels/spatialhash.py (outdated, resolved)

parcels/spatialhash.py (outdated, resolved)

parcels/spatialhash.py (outdated, resolved)

erikvansebille mentioned this pull request Aug 25, 2025

Adding script for benchmarking the curvilinear search function Parcels-code/parcels-benchmarks#5

Draft

fluidnumerics-joe and others added 2 commits August 25, 2025 07:14

Update parcels/spatialhash.py

e604f6f

Co-authored-by: Erik van Sebille <e.vansebille@uu.nl>

Update parcels/spatialhash.py

190c5e1

Co-authored-by: Erik van Sebille <e.vansebille@uu.nl>

erikvansebille approved these changes Aug 25, 2025

fluidnumerics-joe and others added 2 commits August 25, 2025 10:08

Add comment for zmin/zmax setting for flat mesh

06211ab

Co-authored-by: Erik van Sebille <e.vansebille@uu.nl>

Fix docstrings for morton code and dilate_bits

46ce99e

fluidnumerics-joe mentioned this pull request Aug 25, 2025

Morton hashing for unstructured grids #2160

Closed

fluidnumerics-joe merged commit 1c638c8 into v4-dev Aug 25, 2025
9 checks passed

fluidnumerics-joe deleted the feature/morton-hashing branch August 25, 2025 14:19

github-project-automation bot moved this from Backlog to Done in Parcels development Aug 25, 2025

This was referenced Aug 25, 2025

Implement fieldset._load_timesteps() attempt 2 #2161

Closed

Benchmark MOi curvilinear Parcels-code/parcels-benchmarks#7

Merged

Bring back particle.ei in v4 #2125

Closed

Replace curvilinear grid search with spatial hashing #2119

Closed
