- [x] Pointer aliasing
- [ ] Optimize memory access
- [x] Block size tuning
- [ ] Intrinsics
- [x] Particles AoS to SoA
- [ ] Grid Cells AoS to SoA
- [ ] Remove calls to `cudaDeviceSynchronize` when not necessary