Streamline DeepBedMap model tuning, evaluation and Antarctic-wide DEM generation #156
Closed
Conversation
Enable more concurrent runs on 4 GPUs instead of just 2! To do so, srgan_train.get_deepbedmap_test_result was refactored so that we no longer risk saving and reloading the model weights to/from the same file (which could clash when 2 processes are running, and even more so with 4). The model weights are now saved to unique temporary folders while training and are not actually loaded by get_deepbedmap_test_result (i.e. we just use the already-trained model directly). Also made up a different TPE seed for each device/GPU based on len(hostname) + $CUDA_VISIBLE_DEVICES, very hacky I know.

This commit extends the dual-GPU functionality introduced in a2866b6. Since our GPUs are on different servers (2x Tesla V100s on tara, 2x Tesla P100s on kahutea), this relies on having GMT==6.0.0rc1 installed from conda (1edb16e) instead of compiling from source, as the latter would mean GMT could only work on one server.

I tried a lot of hyperparameter settings over the weekend on the new quilt hash 0734959aa4f4903a17ed2acdfd53b3c0c826aadfc718e5fdd3c1b04963e1206e training tiles. The final tuning frenzy involved ~25 experiments on each of the 4 GPUs with this configuration: residual_scaling between 0.15 and 0.30, learning_rate between 6.5e-5 and 8.5e-5, and num_epochs between 60 and 90. These floating point hyperparameters are actually a problem for Optuna, see https://0.30000000000000004.com/.

The best RMSE_test result from this tuning frenzy is 29.90 at https://www.comet.ml/weiji14/deepbedmap/fd658ce06e81492ea5a6f4b5e1afa028, but that might be severely overfitted; the 2nd best is 38.50 at https://www.comet.ml/weiji14/deepbedmap/abc3af8e9abc4080a6b5b44b33c537c2, which might be the one we'll actually use.
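For context, a minimal sketch of what one such per-GPU tuning run might look like with Optuna's TPE sampler (the objective body and its dummy return value are hypothetical stand-ins for the actual srgan_train code; only the search ranges and the per-device seed hack come from this PR, and suggest_uniform is the era-appropriate Optuna API, newer versions use suggest_float):

```python
import os
import platform

import optuna


def objective(trial: optuna.trial.Trial) -> float:
    """One tuning trial. The training call below is a hypothetical stand-in."""
    residual_scaling = trial.suggest_uniform("residual_scaling", 0.15, 0.30)
    learning_rate = trial.suggest_uniform("learning_rate", 6.5e-5, 8.5e-5)
    num_epochs = trial.suggest_int("num_epochs", 60, 90)

    # Placeholder for the real srgan_train step, which saves model weights to a
    # unique temporary folder and evaluates the trained model directly
    rmse_test = 38.50  # dummy value standing in for the actual RMSE_test
    return rmse_test


# Hacky per-device seed (len(hostname) + $CUDA_VISIBLE_DEVICES) so that the TPE
# samplers running concurrently on each GPU/server propose different trials
seed = len(platform.node()) + int(os.environ.get("CUDA_VISIBLE_DEVICES", "0"))
study = optuna.create_study(sampler=optuna.samplers.TPESampler(seed=seed))
study.optimize(objective, n_trials=25)  # ~25 experiments per GPU
print(study.best_params)
```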
weiji14 added the enhancement ✨ (New feature or request) and model 🏗️ (Pull requests that update neural network model) labels on Jun 24, 2019
So that we don't have to worry about using the wrong Generator model hyperparameter settings, we now directly download the ESRGAN model's weights and hyperparameter information from Comet.ML in deepbedmap.ipynb! This commit builds upon 77b4fe1, where we refactored features/environment.py's _download_deepbedmap_model_weights_from_comet to fetch an arbitrary Comet.ML experiment's model weights. The function's name is now shortened to _download_model_weights_from_comet, and it now also returns the num_residual_blocks and residual_scaling hyperparameters so we know how to build the model. Also updated the snapshot of the 2007tx.nc test area prediction (see previous ones at 77b4fe1 and 75266fc) using the trained model at https://www.comet.ml/weiji14/deepbedmap/abc3af8e9abc4080a6b5b44b33c537c2, which gives an RMSE_test of 38.50.
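As a rough sketch of the shape of that refactor (the specific comet_ml API calls, response dictionary keys and file path below are assumptions for illustration, not the repo's actual implementation):

```python
import comet_ml.api


def _download_model_weights_from_comet(experiment_key: str):
    """Sketch: fetch hyperparameters and the .npz generator weights from Comet.ML."""
    api = comet_ml.api.API()  # assumes the Comet API key is set in the environment
    experiment = api.get_experiment(
        workspace="weiji14", project_name="deepbedmap", experiment=experiment_key
    )

    # Hyperparameters needed to rebuild the ESRGAN generator architecture
    params = {p["name"]: p["valueCurrent"] for p in experiment.get_parameters_summary()}
    num_residual_blocks = int(params["num_residual_blocks"])
    residual_scaling = float(params["residual_scaling"])

    # Download the logged .npz model weights asset to a local file
    npz_asset = next(
        a for a in experiment.get_asset_list() if a["fileName"].endswith(".npz")
    )
    weights_path = "model/weights/srgan_generator_model_weights.npz"  # hypothetical path
    with open(weights_path, mode="wb") as weights_file:
        weights_file.write(experiment.get_asset(asset_id=npz_asset["assetId"]))

    return weights_path, num_residual_blocks, residual_scaling
```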
weiji14 force-pushed the model/retune_on_round_grids branch from dcb0145 to 8362dbd on June 25, 2019 09:18
Not sure why the deepbedmap.feature integration test is failing (it just gets stuck for >10 minutes), since it works on the server and even on my old laptop in a docker container! Maybe it's because we're downloading the .npz model weights file twice? Removed the download part from the deepbedmap integration test fixture. Also re-uploaded the test tiles covering the new rounded-to-250 2007tx.nc area.
weiji14 force-pushed the model/retune_on_round_grids branch from 8362dbd to 8ed1273 on June 25, 2019 09:49
weiji14 changed the title from "Re-tune, Re-evaluate, Re-create DeepBedMap model of Antarctica" to "Streamline DeepBedMap model tuning, evaluation and Antarctic-wide DEM generation" on Jun 25, 2019
Quick update of the Pine Island Glacier prediction from e8ae274. By evaluating on the new rounded grids, the bicubic baseline RMSE has dropped from 72.66 to 67.12, making our new ESRGAN model's RMSE of 63.46 a much less significant improvement. The 2007tx/2010tr/istarxx grid combination has a new slice now, and since they all have nicely rounded coordinates, we can pygmt.grdtrack the merged xarray.DataArray grid with all the points in one go! This opens up the possibility of evaluating on other groundtruth tracks crossing the Pine Island Glacier area, but having peeked at those results, I feel like there's really a lot more work to do...
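A minimal sketch of that one-go evaluation (the file names and column names are made up for illustration; only the idea of running pygmt.grdtrack on a merged xarray grid and computing the RMSE comes from this PR):

```python
import numpy as np
import pandas as pd
import pygmt
import xarray as xr

# Hypothetical inputs: the merged prediction grid and groundtruth track points
grid: xr.DataArray = xr.open_dataarray("deepbedmap3_pineisland.nc")  # hypothetical file
points: pd.DataFrame = pd.read_csv("groundtruth_tracks.csv")  # columns: x, y, z

# Sample the merged grid at every groundtruth point in one go
track = pygmt.grdtrack(points=points[["x", "y", "z"]], grid=grid, newcolname="z_interp")

# Root mean square error of the gridded prediction against the groundtruth
rmse = np.sqrt(np.nanmean((track.z_interp - track.z) ** 2))
print(f"RMSE: {rmse:.2f} m")
```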
New DeepBedMap DEM! Compare this version with the EGU version at 7a5d223 or the better v0.8.0 version at 58d8ebd.

A couple of hacks were needed to get the full continent to come out nicely, notably clipping the MEASURES_Ice_Velocity/W2_tile layer to a minimum value of 0.0. The newer data_prep.selective_tile script from 4a074d9 was too memory and CPU intensive to handle REMA/W1_tile (believe me, I've tried dask.distributed and all sorts of parallelization on an 80 core, 200GB RAM server), so we're bringing back the old selective_tile just for that crazy layer.

Also note that the refactored data_prep.selective_tile function's gapfill_raster_filepath argument has been renamed to 'gapfiller', as it can take either a string filepath to a raster file or a floating point number used to fill in the blank spaces (see the sketch below)! In deepbedmap.ipynb, we use this 'gapfiller' to arbitrarily fill BEDMAP2/X_tile with -5000.0 and Arthern Accumulation/W3_tile with 0.0, noting that this is just for cosmetic purposes since the data gaps are in the ocean area outside of DeepBedMap's intended domain (within the grounding line).

Another cosmetic tweak is changing the colormap from BrBG_r to Blues_r, which fits with the one on the README (produced using QGIS with an additional hillshading layer). Oh, and yes, we've updated the README.md DeepBedMap DEM snapshot too!
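To illustrate the 'gapfiller' idea only, here is a hypothetical standalone helper, not the actual data_prep.selective_tile code: the one argument either names a secondary raster to patch gaps from, or gives a constant fill value.

```python
import numpy as np
import rasterio


def fill_gaps(tile: np.ndarray, gapfiller, window=None) -> np.ndarray:
    """Fill NaN gaps in a tile from another raster (str filepath) or a constant float."""
    mask = np.isnan(tile)
    if isinstance(gapfiller, str):
        # Read a matching patch from the secondary raster and use it to plug the gaps
        with rasterio.open(gapfiller) as dataset:
            filler = dataset.read(indexes=1, window=window, out_shape=tile.shape)
        tile[mask] = filler[mask]
    else:
        # Constant fill, e.g. -5000.0 for BEDMAP2/X_tile or 0.0 for Accumulation/W3_tile
        tile[mask] = float(gapfiller)
    return tile
```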
weiji14 added a commit that referenced this pull request on Aug 29, 2019:
Setting fire to all that code for gapfilling a raster with another raster, as it's very messy and we only need it for REMA now that #165 (which stops gapfilling MEaSUREs) is merged in. Still keeping the option to gapfill with a single floating point number, but removing the selective_tile_old function that has sat alongside selective_tile since aac21fb in #156 of v0.9.2. Temporarily using a bilinearly resampled 200m REMA in deepbedmap.ipynb. Will follow up with code to produce a gapfilled 100m resolution REMA geotiff!
Towards making a new full DeepBedMap of Antarctica, replacing the one in #136. There have been lots of significant changes to our input training datasets (e.g. #146, #150, #155) and to our ESRGAN model (e.g. #151), plus software related changes since v0.8.0. Some of the scripts run slower and struggle to handle our big datasets, but they have become more geographically accurate and are now ready for prime time. Let's iron those problems out and make a new 250m spatial resolution bed elevation map of Antarctica!!!
TODO: