Skip to content

Conversation

@JsLth
Copy link

@JsLth JsLth commented Sep 18, 2025

Hi @e-kotov, neat package, thanks!

We can make good use of the package over at denabel/gxc, but we were looking for a feature to derive polygons from identifiers. As you also intend to add such a functionality (#3), I decided to give it a go.

This pull request adds three exported functions: derive_grid() to create a spatial grid from INSPIRE IDs, inspire_extract() to extract points from an INSPIRE string, and inspire_generate() to generate an INSPIRE string from points. The latter two are taken and slightly edited from my package jslth/z22. derive_grid() is powered by inspire_extract(), the other function is what I thought would be a nice add-on.

Also, given your other remark in #3, I slightly modularized the code in create_grid_internal and added an internal function as_grid, which takes over task 7: handle output type. I re-use the function in derive_grid() to create the grid from the extracted points.

I changed the default of point_type in derive_grid() to llc, as that (AFAIK) is the established standard in INSPIRE IDs.

I haven't looked into your implementations of mirai and future yet, so as a first step, I only implemented the sequential function. Also, I have not done heavy testing. z22 runs some basic tests based on the standard cell sizes (100m, 250m, 1km, 10km, 100km), but from what I can see (correct me if I'm wrong), gridmaker also handles non-standard cell sizes. I didn't get to test derive_grid() in this manner.

A quick example demonstrating the function's use:

inspire_100km_ger <- c(
  "CRS3035RES100000mN26E43", "CRS3035RES100000mN26E44",
  "CRS3035RES100000mN27E41", "CRS3035RES100000mN27E42",
  "CRS3035RES100000mN27E43", "CRS3035RES100000mN27E44",
  "CRS3035RES100000mN27E45", "CRS3035RES100000mN28E40",
  "CRS3035RES100000mN28E41", "CRS3035RES100000mN28E42",
  "CRS3035RES100000mN28E43", "CRS3035RES100000mN28E44",
  "CRS3035RES100000mN28E45", "CRS3035RES100000mN28E46",
  "CRS3035RES100000mN29E40", "CRS3035RES100000mN29E41",
  "CRS3035RES100000mN29E42", "CRS3035RES100000mN29E43",
  "CRS3035RES100000mN29E44", "CRS3035RES100000mN29E45",
  "CRS3035RES100000mN30E40", "CRS3035RES100000mN30E41",
  "CRS3035RES100000mN30E42", "CRS3035RES100000mN30E43",
  "CRS3035RES100000mN30E44", "CRS3035RES100000mN30E45",
  "CRS3035RES100000mN30E46", "CRS3035RES100000mN31E40",
  "CRS3035RES100000mN31E41", "CRS3035RES100000mN31E42",
  "CRS3035RES100000mN31E43", "CRS3035RES100000mN31E44",
  "CRS3035RES100000mN31E45", "CRS3035RES100000mN31E46",
  "CRS3035RES100000mN32E40", "CRS3035RES100000mN32E41",
  "CRS3035RES100000mN32E42", "CRS3035RES100000mN32E43",
  "CRS3035RES100000mN32E44", "CRS3035RES100000mN32E45",
  "CRS3035RES100000mN32E46", "CRS3035RES100000mN33E40",
  "CRS3035RES100000mN33E41", "CRS3035RES100000mN33E42",
  "CRS3035RES100000mN33E43", "CRS3035RES100000mN33E44",
  "CRS3035RES100000mN33E45", "CRS3035RES100000mN33E46",
  "CRS3035RES100000mN34E40", "CRS3035RES100000mN34E41",
  "CRS3035RES100000mN34E42", "CRS3035RES100000mN34E43",
  "CRS3035RES100000mN34E44", "CRS3035RES100000mN34E45",
  "CRS3035RES100000mN34E46", "CRS3035RES100000mN35E41",
  "CRS3035RES100000mN35E42", "CRS3035RES100000mN35E43",
  "CRS3035RES100000mN35E44", "CRS3035RES100000mN35E45"
)

grid <- derive_grid(inspire_100km_ger)
#> Simple feature collection with 60 features and 3 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 4050000 ymin: 2650000 xmax: 4750000 ymax: 3650000
#> Projected CRS: ETRS89-extended / LAEA Europe
#> First 10 features:
#>                         id                       geometry   X_LLC   Y_LLC
#> 1  CRS3035RES100000mN26E43 POLYGON ((4350000 2650000, ... 4350000 2650000
#> 2  CRS3035RES100000mN26E44 POLYGON ((4450000 2650000, ... 4450000 2650000
#> 3  CRS3035RES100000mN27E41 POLYGON ((4150000 2750000, ... 4150000 2750000
#> 4  CRS3035RES100000mN27E42 POLYGON ((4250000 2750000, ... 4250000 2750000
#> 5  CRS3035RES100000mN27E43 POLYGON ((4350000 2750000, ... 4350000 2750000
#> 6  CRS3035RES100000mN27E44 POLYGON ((4450000 2750000, ... 4450000 2750000
#> 7  CRS3035RES100000mN27E45 POLYGON ((4550000 2750000, ... 4550000 2750000
#> 8  CRS3035RES100000mN28E40 POLYGON ((4050000 2850000, ... 4050000 2850000
#> 9  CRS3035RES100000mN28E41 POLYGON ((4150000 2850000, ... 4150000 2850000
#> 10 CRS3035RES100000mN28E42 POLYGON ((4250000 2850000, ... 4250000 2850000

plot(grid$geometry)
grafik

Please let me know what you think and if I should add or change anything!

@e-kotov
Copy link
Owner

e-kotov commented Sep 18, 2025

Hi, @JsLth ! Thanks for your interest and thoughtful PR as well as detailed explanation! I will need a few days to look at it and test, will try to get back to you by Monday next week.

Right away a few technical comments:

  1. Please add yourself as a co-author in DESCRIPTION, as this is definitely a significant and very welcome addition.
  2. Kindly see the errors the R CMD check is giving and address those if you can. If it does not make sense to do it until we agree on these additions, let me know.
  3. I was going to submit this package to rOpenSci as soon as I finish with the write2disk functionality in this branch https://github.com/e-kotov/gridmaker/tree/direct2disk , as this is a must if a user would want to create, say a 100m grid for entire Spain. Currently it takes very long time and more importantly it requires tens of GB of RAM. So I really want to have write to any GDAL-filetype and also to geoparquet before this goes to review at rOpenSci. That may take a few weeks, as right now I don't have a lot of time to work on it.

Last point 3 leads to:

  • the package is unlikely to be submitted to CRAN until sometime in December this year or maybe even January next year, depending on the review process at rOpenSci.
  • the package may change the function(s) name following the review.

These may be points for you to consider when relying on it in https://github.com/denabel/gxc , depending on our timeline for it's release.

Copy link
Owner

@e-kotov e-kotov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @JsLth, overall very solid code, but there are a few issues I identified during quick manual testing. Kindly see the comments in this code review. Would you mind addressing these before I do another test, as I could not test everything beacuse of unknown regex_match.

@e-kotov e-kotov self-requested a review October 5, 2025 09:32
@JsLth
Copy link
Author

JsLth commented Oct 5, 2025

Note that I decided to polish the heuristics of guess_resolution() to support non-standard grid sizes (in accordance with create_grid()) and to be more reliable in terms of uniformity and anisotropy. I also added the optional sample and tolerance arguments to inspire_generate() to control the guessing.

Additionally, inspire_generate() used to assume LLCs, but this might not align with many use cases. The llc argument now specifies whether the passed coordinates are centroids or LLCs. Default is llc = FALSE, because I assume most use cases involve centroids rather than LLCs. Correct me if I'm wrong.

@e-kotov
Copy link
Owner

e-kotov commented Oct 5, 2025

Default is llc = FALSE, because I assume most use cases involve centroids rather than LLCs. Correct me if I'm wrong.

I don't have that much experience with these grids. I started working on the package because of OxfordDemSci/Mapineq#194 and https://github.com/e-kotov/tphconv . Transport Poverty Hub gives back centroids in WGS84, as they assume (I guess) that the user will be using those as is. I guess very often data in the wild will be centroids of grids. We can change it later (e.g. following suggestions from rOpenSci review).

  1. @JsLth you are making the PR from your main branch. Would you mind if I push some commits (with tests and maybe some other minor adjustments that are too big to make via GitHub web interface suggestions) to your main branch (since "Maintainers are allowed to edit this pull request.")?

  2. I am also trying to grasp the logic of all three new functions and it seems to me that the inspire_generate somewhat doubles the functionality of my internal helper make_ids(). So in for modularization and avoiding doubling the features, I would consider also reusing inspire_generate in create_grid. I'm not suggesting you do it, just thinking out loud here.

  3. Finally, since:

pkgcheck::fn_names_on_cran("create_grid")
         package version     fn_name
217232 potential   0.1.0 create_grid
232007  raceland   1.1.2 create_grid
301697  sugarbag   0.1.3 create_grid
339647      xnet  0.1.11 create_grid

Maybe the name for the main function should be changed to gm_create_grid and all other functions should also get a gm_ prefix. What do you think?

@JsLth
Copy link
Author

JsLth commented Oct 5, 2025

Would you mind if I push some commits (with tests and maybe some other minor adjustments that are too big to make via GitHub web interface suggestions) to your main branch (since "Maintainers are allowed to edit this pull request.")?

Gladly, go ahead!

Maybe the name for the main function should be changed to gm_create_grid and all other functions should also get a gm_ prefix. What do you think?

AFAIK, none of these packages are very common and likely to clash with gridmaker. Also create_grid() is a pretty universal, non-conditional function. I feel like this kind of "double namespacing" is particularly useful if a function or a set of functions interacts with special objects or conditions that are only useful in the context of the package. At least to me, gm_create_grid() sounds a lot more specialized than create_grid().

If you want to prevent any namespace conflicts, what about including inspire somewhere to make it clear that the function operates in the context of INSPIRE grids? inspire(), inspire_grid(), create_inspire(), create_inspire_grid()? This would also work nicely with the INSPIRE ID helpers in this PR: inspire_grid(), inspire_derive(), inspire_generate(), inspire_extract()

@e-kotov
Copy link
Owner

e-kotov commented Oct 5, 2025

Ok, I will make some commits soon. Good catch with the res->cellsize, I was about to adjust it as my test just caught an error on this line ;)

If you want to prevent any namespace conflicts, what about including inspire somewhere to make it clear that the function operates in the context of INSPIRE grids? inspire(), inspire_grid(), create_inspire(), create_inspire_grid()? This would also work nicely with the INSPIRE ID helpers in this PR: inspire_grid(), inspire_derive(), inspire_generate(), inspire_extract()

Good suggestion, I will consider it. It also leaves the room for possible future extensions of this package beyond just INSPIRE grid.

@e-kotov
Copy link
Owner

e-kotov commented Oct 5, 2025

Note: there is still another legacy format with easting first, then northing widely used in both legacy and newer data. E.g.:

No action needed, I am working on it.

@e-kotov
Copy link
Owner

e-kotov commented Oct 6, 2025

I added tests to all functions, made them error on NAs and malformed IDs, added support for reversed short codes (with easting before northing), and added inspire_convert to quickly switch between formats (maybe it should be further upgraded to convert between short N-E and short E-N codes). All functions in this PR now support these reversed coordinates. I plan to add #10 for create_grid in a different PR.

I am also still thinking about function names. I am still having hard time distinguishing between derive, extract and generate ;) Maybe something more explicit?

create_grid -> inspire_extent_to_grid
derive_grid -> inspire_ids_to_grid
inspire_generate -> inspire_coords_to_ids
inspire_extract -> inspire_ids_to_coords
inspire_convert -> inspire_ids_convert

(which also suggests that the whole package should just be called inspire, which is btw available on CRAN)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants