Optimization of compute_tide_corrections with FES2014 for multiple lat/lons #91
Since posting this issue, I have discovered that the […]

However, because I have static points with multiple timesteps at each, I think for this application (many timesteps for a smaller set of static modelling point locations) the most efficient processing flow might be something like this? […]

(Or alternatively, perhaps some method to detect duplicate/repeated lat/lons, then batch those together to reduce the number of required interpolations... see the sketch below.)
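For what it's worth, here is a minimal sketch of that duplicate-batching idea. Nothing in it is pyTMD API; `model_point` is a hypothetical wrapper around whatever per-point tide call is being used:

```python
# Minimal sketch of batching duplicate/repeated lat/lons: model each unique
# point once, then broadcast the results back to the full set of inputs.
# `model_point(lon, lat, times)` is a hypothetical user-supplied wrapper
# around the per-point tide modelling call.
import numpy as np

def model_unique_points(lons, lats, times, model_point):
    """Run `model_point` once per unique (lon, lat) pair and map results back."""
    points = np.column_stack([lons, lats])
    unique_points, inverse = np.unique(points, axis=0, return_inverse=True)
    inverse = np.ravel(inverse)  # guard against shape differences across numpy versions
    # one tide series per unique point: shape (n_unique, n_times)
    unique_tides = np.stack(
        [model_point(lon, lat, times) for lon, lat in unique_points]
    )
    # duplicated input points share the modelled tides of their unique point
    return unique_tides[inverse]
```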
@robbibt still thinking about the best way to enact these changes. One idea I've been floating is to cache the interpolation objects for each constituent so that there won't have to be repeated reads. I'm worried about this being a bit memory intensive though, so I need to put in some tests.
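To make that caching idea concrete, here is a minimal sketch (not pyTMD code) that builds one pair of interpolators per constituent file and reuses them for every query point. The FES variable names ('lon', 'lat', 'amplitude', 'phase'), the centimetre units, and the phase sign convention are assumptions that would need checking against the actual FES2014 files:

```python
# Minimal sketch of caching per-constituent interpolators so each FES netCDF
# file is read and gridded only once, then reused for every query point.
# Assumed: variables 'lon', 'lat', 'amplitude' (cm), 'phase' (degrees),
# ascending coordinates, and grids shaped (lat, lon).
import numpy as np
import netCDF4
from scipy.interpolate import RegularGridInterpolator

_INTERPOLATOR_CACHE = {}

def get_constituent_interpolators(path):
    """Return (real, imag) interpolators for one constituent file, cached."""
    if path not in _INTERPOLATOR_CACHE:
        with netCDF4.Dataset(path) as ds:
            lon = np.asarray(ds.variables['lon'][:])
            lat = np.asarray(ds.variables['lat'][:])
            amp = np.ma.filled(ds.variables['amplitude'][:], np.nan) / 100.0
            phase = np.radians(np.ma.filled(ds.variables['phase'][:], np.nan))
        hc = amp * np.exp(-1j * phase)  # complex constants (sign convention assumed)
        _INTERPOLATOR_CACHE[path] = (
            RegularGridInterpolator((lat, lon), hc.real, bounds_error=False),
            RegularGridInterpolator((lat, lon), hc.imag, bounds_error=False),
        )
    return _INTERPOLATOR_CACHE[path]

def interpolate_constituent(path, lons, lats):
    """Interpolate the cached constants to query points, reusing the cache."""
    real_interp, imag_interp = get_constituent_interpolators(path)
    pts = np.column_stack([lats, lons])
    return real_interp(pts) + 1j * imag_interp(pts)
```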
Referenced commit: …address #91
to do: need to add tests that outputs are as expected
to do: need to see if these are actual optimizations
test: switch interpolation test to soft tabs
Hey @tsutterley, am doing some further optimisations of our tide modelling code as we're moving towards a multi-tide modelling system where we choose the best tide model locally based on comparisons with our satellite data. Because of this, our modelling now takes a lot longer than previously, so I'm looking into trying to parallelise some of the underlying […]. Our two big bottlenecks are: […]

For number 2, I've been able to get a big speed up by parallelising the entire […]. I know you made some changes to address this last year when I first posted this issue, but I wanted to double check: are the newer […]? Ideally, I'd love to do something like this: […]
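As a rough illustration of the kind of per-point parallelisation described above (this is not the original snippet, and nothing here is pyTMD API; `model_point` is a hypothetical, picklable wrapper around the per-point tide call):

```python
# Minimal sketch of parallelising tide modelling across points with
# concurrent.futures. `model_point(lon, lat, times)` is a hypothetical,
# picklable (top-level) wrapper around the per-point tide modelling call.
from concurrent.futures import ProcessPoolExecutor
from functools import partial
import numpy as np

def _model_one(point, times, model_point):
    lon, lat = point
    return model_point(lon, lat, times)

def model_points_parallel(points, times, model_point, max_workers=8):
    """Model tides for a sequence of (lon, lat) tuples in parallel processes."""
    worker = partial(_model_one, times=times, model_point=model_point)
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(worker, points))
    return np.stack(results)  # shape: (n_points, n_times)
```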
Hey @robbibt, basically yes, that was the plan. The new functions can completely replicate the prior functionality. The difference is that using the new read and interpolate method keeps all of the constituent data in memory. In some cases this may be slower, such as running on a small (possibly distributed) machine, so I've kept both methods. In cases where you want to run for multiple points with the same data, there is a potential speed up with the new method since (as you mentioned) there's the IO bottleneck. I've thought about switching to […]
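To put rough numbers on that memory concern, here is a back-of-envelope estimate. The roughly 1/16-degree global grid and 34 constituents for FES2014 are figures from memory, and the in-memory size depends on the dtype actually used, so treat it as an order-of-magnitude sketch:

```python
# Back-of-envelope estimate of holding every FES2014 constituent grid in RAM.
# Assumed: ~1/16 degree global grid (5760 x 2881 points), 34 constituents,
# complex64 harmonic constants (8 bytes each); actual figures depend on the
# files and on how the arrays are stored/masked.
n_lon, n_lat = 5760, 2881
n_constituents = 34
bytes_per_value = 8  # complex64; complex128 would double this

total_gb = n_lon * n_lat * n_constituents * bytes_per_value / 1e9
print(f"~{total_gb:.1f} GB to keep all constituent grids in memory")  # ~4.5 GB
```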
First of all, congrats @tsutterley on an incredible package... such an amazing resource!

I've been looking into using `pyTMD` for modelling tide heights from `FES2014` for our DEA Coastlines coastline mapping work. Essentially, our current process is to: […]

I've been testing out the `compute_tide_corrections` function as a way to achieve this, passing in the lat/lon of a given 2 x 2 km grid point, and all of the times from my satellite datasets. This works great, but it's pretty slow: about 38.4 seconds in total for a single point lat/lon. Because I can have up to 100+ lat/lon points in a given study area, this will quickly blow out if I want to apply `compute_tide_corrections` to multiple points.

Using `line_profiler`, it appears that by far most of this time (e.g. 38.2 seconds, or over 99%) is taken up in the `extract_FES_constants` function. Profiling `extract_FES_constants`, it seems like by far the most time in that function (37.5 seconds) is taken up by `read_netcdf_file`.
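For reference, line-by-line profiling of this kind can be set up along these lines; `compute_tides_for_point` is a hypothetical placeholder for whichever function is being timed:

```python
# Minimal sketch of line-by-line profiling with line_profiler.
# `compute_tides_for_point` is a hypothetical placeholder for the function
# whose per-line timings we want to inspect.
from line_profiler import LineProfiler

def compute_tides_for_point(lon, lat, times):
    ...  # placeholder: call the tide modelling code here

profiler = LineProfiler()
wrapped = profiler(compute_tides_for_point)  # wrap to record per-line timings
wrapped(122.2, -18.0, times=None)
profiler.print_stats()                       # print timings for each line
```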
So essentially, loading the FES2014 files with `read_netcdf_file` occupies almost all of the time taken to run `compute_tide_corrections`. For analyses involving many timesteps for a single lat/lon this isn't a problem, as the files only have to be read once. However, for analyses where `compute_tide_corrections` needs to be called multiple times to model tides for multiple lat/lons, the FES2014 data has to be loaded again and again, leading to extremely long processing times.

Instead of loading the FES2014 files with `read_netcdf_file` every time `compute_tide_corrections` is called, could it be possible to give users the option to load the FES files themselves outside of the function, and then pass in the loaded data (i.e. hc, lon, lat) directly to the function via an optional parameter? This would allow users to greatly optimise processing time for analyses that include many lat/lon tide modelling locations.
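To illustrate the usage pattern being proposed, here is a purely hypothetical sketch; neither `load_fes_constants` nor a `constants=` keyword exist in pyTMD, they simply stand in for the idea of reading the constituent data once and passing it into each call:

```python
# Purely hypothetical sketch of the proposed workflow: read the FES2014
# constituent data (hc, lon, lat) once, then reuse it for every modelling
# point. `load_fes_constants` and the `constants=` keyword are placeholders
# for the requested feature, not real pyTMD API.
constants = load_fes_constants('/path/to/fes2014')  # hc, lon, lat read once

tide_series = {}
for point_id, (lon, lat) in study_area_points.items():
    tide_series[point_id] = compute_tide_corrections(
        x=lon,
        y=lat,
        delta_time=times,
        MODEL='FES2014',
        constants=constants,  # reuse preloaded data instead of re-reading netCDFs
    )
```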