Using numba for compiling kernels instead of JIT

Performance is not the primary selling point of Parcels, but it is still very important. In v3 of Parcels this was achieved via "JIT mode", which took user-provided kernels (i.e., functions) and ran them through a custom code converter that analysed the Python AST and translated it to C.

This approach had limitations on the kernels that could be written (documented at https://docs.oceanparcels.org/en/latest/examples/tutorial_parcels_structure.html#3.-Kernels).


Numba has been investigated previously (https://github.com/OceanParcels/Parcels/pull/1135, with findings at [numba_integration:numba/README.md](https://github.com/OceanParcels/Parcels/blob/afff754730f3f81620d34b0a8204b1f6277e7084/parcels/numba/README.md)), and was found:

1. not to be as fast as the JIT mode
2. potentially unstable, as the `jitclass` feature that was relied on was (and still is) marked as experimental, without a feature-complete API or a roadmap for future development


From what I know, I don't think these concerns are blocking:
1. I'm not sure whether profiling was done to locate the exact source of the slowdown. This might be something we can optimise (along with other parts of Parcels) as part of the more benchmark-focused development cycle we're aiming for.
2. I don't think it's important for us to use `jitclass`. We can pass all the state that the kernel needs via the `fieldset` struct; there isn't additional state that would require `jitclass`. From a code organisation POV this would mean separate (numba-compiled or C) functions instead of methods on the FieldSet/Field/Grid classes.
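To illustrate point 2, here is a minimal sketch of passing state explicitly to free functions rather than using `jitclass`. Everything in it is an assumption for illustration: `FieldState`, `search_index` and `interp_linear` are hypothetical names, not existing Parcels API, and the real struct would carry far more than a 1D grid. The `njit` fallback is only there so the sketch also runs where numba isn't installed.

```python
from collections import namedtuple

import numpy as np

try:
    from numba import njit
except ImportError:  # fall back to plain Python so the sketch still runs
    def njit(func):
        return func

# Hypothetical, minimal stand-in for the state a kernel needs.
FieldState = namedtuple("FieldState", ["x", "data"])


@njit
def search_index(x_grid, x):
    """Return the index of the grid cell containing x, clamped to the valid range."""
    i = np.searchsorted(x_grid, x) - 1
    return min(max(i, 0), x_grid.size - 2)


@njit
def interp_linear(field, x):
    """Linear interpolation as a free function that takes its state explicitly."""
    i = search_index(field.x, x)
    w = (x - field.x[i]) / (field.x[i + 1] - field.x[i])
    return (1.0 - w) * field.data[i] + w * field.data[i + 1]


field = FieldState(x=np.linspace(0.0, 10.0, 11), data=np.linspace(0.0, 20.0, 11))
value = interp_linear(field, 2.5)
```

The namedtuple of arrays is just one possible container; a structured array or plain positional arguments would serve the same purpose.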


In order to make kernels numba compatible, the following need to be done:

- [x] #2107 
- [ ] Make kernels explicitly return status codes (as optional return types are not supported in numba)
- [ ] Remove any "code rewriting" that was done in `codegenerator.py` (I can only see `particle.delete()` being rewritten by the compiler, but there might be others). This will bring the code that users write closer to the code that is executed under the hood
- [ ] #2109
- [x] #2108
- [ ] Stabilize the structures that kernels have access to
    - In v3 there is a gap between Scipy and JIT mode: in Scipy mode the Python Fieldset object is provided, while in JIT mode a C structure that mimics the Python Fieldset object is provided. With numba, it would be good to consolidate these data structures, using the structures available in JIT as a starting point (though I think this fully encompasses what we need). This would be compatible with numba-compiled kernels, as well as kernels that aren't compiled with numba.
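As a rough sketch of the "explicit status codes" and "no code rewriting" items above: the kernel below always returns a status code, and deletion is requested through that code instead of a rewritten `particle.delete()` call. The constants `SUCCESS` and `DELETE`, the kernel signature, and the array-based particle layout are all assumptions made for this example, not the Parcels API.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # fall back to plain Python so the sketch still runs
    def njit(func):
        return func

# Hypothetical status codes; the values are illustrative only.
SUCCESS = 0
DELETE = 1


@njit
def advect_east(lon, i, u, dt, lon_max):
    """Move particle i eastward; every code path returns a status code."""
    lon[i] += u * dt
    if lon[i] > lon_max:
        return DELETE  # stands in for a rewritten `particle.delete()` call
    return SUCCESS


@njit
def run_kernel(lon, status, u, dt, lon_max):
    """Apply the kernel to every particle and record its status code."""
    for i in range(lon.size):
        status[i] = advect_east(lon, i, u, dt, lon_max)


lon = np.array([0.0, 9.5])
status = np.zeros(lon.size, dtype=np.int64)
run_kernel(lon, status, 1.0, 1.0, 10.0)
```

Because the same function body runs with or without the `njit` decorator, what the user writes is what gets executed in both cases.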


Open questions:
- Bringing numba into Parcels will result in a distinct interface between the Python side of things (fieldset loading, grid defining, particleset defining, particle output writing), and the numba side of things (running of the kernels - interpolation, index searching, and other things that were previously handled in JIT mode).
  - What does the interface between the Python code and numba look like? How can we smartly load fields in a way that reduces memory footprint and is also in line with xarray? Would be good to get input from some xarray folks here.
- How does this change the writing of interpolators? Perhaps there are two parts to an interpolator - `.assert_is_compatible()`, which checks whether the Field is compatible, and a jit-compiled function `.interpolate()`, which is then passed down to the kernel and is used to actually perform the interpolation.
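One possible shape for the two-part interpolator, as a sketch only: `NearestInterpolator`, `_nearest` and `sample_kernel` are hypothetical names, and the signatures are assumptions rather than a proposed API. It relies on numba accepting a jit-compiled function as an argument to another jit-compiled function.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # fall back to plain Python so the sketch still runs
    def njit(func):
        return func


@njit
def _nearest(x_grid, data, x):
    """Compiled part: value at the grid point closest to x."""
    return data[np.argmin(np.abs(x_grid - x))]


class NearestInterpolator:
    """Hypothetical interpolator split into a Python check and a compiled function."""

    # staticmethod so that kernels receive the bare compiled function
    interpolate = staticmethod(_nearest)

    @staticmethod
    def assert_is_compatible(x_grid, data):
        """Python-side check, run once before any kernel executes."""
        if x_grid.ndim != 1 or x_grid.shape != data.shape:
            raise ValueError("expected 1D grid and data of equal shape")


@njit
def sample_kernel(interpolate, x_grid, data, x):
    """Kernel that receives the compiled interpolator as an argument."""
    return interpolate(x_grid, data, x)


x_grid = np.array([0.0, 1.0, 2.0])
data = np.array([10.0, 20.0, 30.0])
NearestInterpolator.assert_is_compatible(x_grid, data)
sampled = sample_kernel(NearestInterpolator.interpolate, x_grid, data, 1.2)
```

The compatibility check stays in ordinary Python, so it can raise descriptive errors, while only `interpolate` needs to satisfy numba's constraints.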


Keen to hear your thoughts @RoelBrouwer, @qubixes, and @erikvansebille to see if there's anything I'm missing here. Exciting that the internals of Parcels have changed to the point where it's easier to work on this stuff!

Using numba for compiling kernels instead of JIT #2106

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Using numba for compiling kernels instead of JIT #2106

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions