Update gshhs/gshhg data to GMT 5.x and GSHHG 2.3.5 #311
Previously, only readboundaries_shp.py actually wrote data in a format that basemap could read. It looks like readboundaries.py has not actually been used to produce boundary data in a _very_ long time. Instead, it's been produced from the GSHHS shapefiles.
New GSHHS versions simplify long national/state borders that are perfectly linear in geographic space. However, basemap assumes that these lines are already subsampled, and does the interpolation in between the two points in geographic space. This leads to extreme errors for long borders with only two points (e.g. the US/Canada border). Old GSHHS versions stored interpolated versions, so this is an issue that didn't come up in the past. To get around this, we'll interpolate long linear segments and store the interpolated versions.
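The workaround described above can be sketched roughly as follows. This is a minimal illustration of the idea, not basemap's actual code; the function name `densify_segment`, the `spacing` parameter, and the sample coordinates are all assumptions:

```python
import numpy as np

def densify_segment(lon0, lat0, lon1, lat1, spacing):
    """Subsample a long two-point segment linearly in geographic space.

    Assumes the segment has nonzero length. Interpolated points are
    stored so basemap never has to bridge a long straight border with
    only its two endpoints.
    """
    dist = np.hypot(lon1 - lon0, lat1 - lat0)
    # Fractional positions along the segment (endpoint handled below).
    t = np.arange(0, dist, spacing) / dist
    lons = lon0 + t * (lon1 - lon0)
    lats = lat0 + t * (lat1 - lat0)
    # Append the true endpoint so the segment terminates exactly.
    return np.append(lons, lon1), np.append(lats, lat1)

# e.g. a long straight border stored as only two points:
lons, lats = densify_segment(-123.0, 49.0, -95.0, 49.0, spacing=1.0)
```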
A note on filesize: Previously the … Also, the diffs are rather hard to look at, as they include the data files, but only files in …
This is great work, Joe! I'll give it all a spin once I figure out the travel paperwork. I'll owe you a beer at the next SciPy!
The script …
Similar error in … and …
At first glance it looks like these are being caused by lines with identical start and end points (and only two points in the line). I'll add a bit of logic to filter those out and update things tonight. Sorry that slipped through!
run …
Seems to fix all of the ones that were throwing exceptions, but …
Yeah, I'm not sure about that one... I'll dig into it more here in a bit. Definitely seems odd!
Joe, at this point, I am ready to simply accept this as-is because it is a major improvement from before. I am also thinking that the issues we are seeing in hires.py aren't a regression, but may have been there before. Could you do a squash rebase on this PR before I merge, please?
Just did a test, and the spurious lines are not a regression. I also tried running hires.py with resolution set to 'h' to see if the spurious lines persist, and I get an exception: … In fact, this error is raised for any resolution besides 'f'... weird.
```
1) Download the shapefile formatted versions of GSHHS and extract them in the
   "utils" directory. The data is available at:
   https://www.soest.hawaii.edu/pwessel/gshhg/
   (Last updated with: ftp://ftp.soest.hawaii.edu/gshhg/gshhg-shp-2.3.5-1.zip)
```
Just noticed a new version of this file on the ftp server. I am going to try using this new process and see if I can get updated data files to work.
```python
x = [0, dist[i]]
new_x = np.arange(0, dist[i], spacing)
out_lon.extend(np.interp(new_x, x, lons[i:i+2]))
out_lat.extend(np.interp(new_x, x, lats[i:i+2]))
```
I am wondering if the logic here is faulty. The arange() call will never include dist[i] in its output, so then it will have a fairly significant jump to the next spot.
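The endpoint behavior in question is easy to demonstrate. This is a standalone illustration of the concern (the `dist` and `spacing` values are made up), not a patch to the script:

```python
import numpy as np

# np.arange excludes the stop value, so the samples stop short of the
# segment endpoint.
dist = 10.0    # assumed segment length
spacing = 3.0  # assumed sample spacing

new_x = np.arange(0, dist, spacing)
print(new_x)   # [0. 3. 6. 9.] -- 10.0 itself is never included

# The gap between the last sample (9.0) and the endpoint (10.0) can be
# anywhere from ~0 up to a full `spacing`, hence the "jump" noted above.
# One common alternative is np.linspace, which includes both endpoints
# and spreads the samples evenly:
n = int(np.ceil(dist / spacing)) + 1
even_x = np.linspace(0, dist, n)
print(even_x)  # [ 0.   2.5  5.   7.5 10. ]
```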
```python
out_lat.extend(np.interp(new_x, x, lats[i:i+2]))
# ...
out_lon.append(lons[-1])
out_lat.append(lats[-1])
```
Is it ok that in some cases, the line will have the last point twice at the end (no interpolation), and in other cases, it will have an interpolated point and then the last point?
On a lark, I turned off drawing of rivers in …
```python
# not including them here...
for level in range(1, 6):
    poly, polymeta = get_wdb_boundaries(resolution, level, rivers=True)
    offset = 0
```
Sneaky little bugger... Apparently this script has never been used for processing and packaging basemap's data. The offset keeps getting reset for each level in this script; that isn't a problem in the other readboundary.py script because it doesn't have different river levels in the dump.
If one just moves this offset above the loop, the data produced by readboundary_shp.py works perfectly for examples/hires.py at all resolutions!
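The shape of that fix can be sketched as follows. The real script writes binary dumps; here a stub stands in for `get_wdb_boundaries` just to show the offset bookkeeping, and all the surrounding names (`index`, the stub's return values) are illustrative:

```python
# Stub: pretend each river level has two polylines of 3 and 5 points.
def get_wdb_boundaries(resolution, level, rivers=True):
    return [[(0, 0)] * 3, [(0, 0)] * 5], [{"level": level}] * 2

index = []
offset = 0                      # initialized ONCE, before the level loop
for level in range(1, 6):
    poly, polymeta = get_wdb_boundaries("h", level, rivers=True)
    for meta, pts in zip(polymeta, poly):
        # Record where this polyline starts in the concatenated dump.
        index.append((meta["level"], offset, len(pts)))
        offset += len(pts)

# Offsets now accumulate across all 5 levels instead of resetting to 0
# at the top of each level, which corrupted the index.
print(offset)  # 40  (5 levels * (3 + 5) points)
```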
I'll create a new branch that has all the code work from @joferkington in a commit and credited to him as well, and a redo of the data packaging in a separate commit.
PR #320 has been merged, so I'll close this one.
This PR updates the coastline, river, and political boundary data to the latest version of GSHHS/GSHHG.
A few notes:
Based on the metadata that was stored, it's obvious that the previous data was from the shapefile versions of GSHHS and not the "raw" GMT version. Therefore, I've generated this version from the shapefiles as well, and changed the README to note that that's the preferred route. However, I've also fixed the GMT scripts so that GMT can now be used to generate the data as well.
One side effect of this is that GMT is no longer required to update the data in most cases. Only the shapefiles (a separate download) are necessary. GMT is still required to update the landmasks, but they do not change frequently at the resolution they're stored at and should only need very infrequent updates.
@WeatherGod - Finally managed to get all of this working. I've tried to keep the edits as minimal as possible, so the scripts are still a bit crufty, but they work now.