Bookdown project to update watersurfaces_refpoints #79

florisvdh · 2025-09-29T16:54:28Z

This new code builds upon existing code by @ToonHub to update watersurfaces_refpoints. Note that the latter data source is also being maintained in the n2khab-samplingframes repo.

This project is intended for maintenance of the data source at Zenodo, hence does not aim at storing the data in a repo, only the code. The data are written as vc-data (git2rdata) so that diffs should be small.

Here, version watersurfaces_refpoints_v6 is created from data source versions watersurfaces_refpoints_v5 and watersurfaces_hab_v6.

In order to make this a data source of more generic use, the GRTS address from watersurfaces_refpoints_v5 has been dropped since it can always be regenerated. Also, in_object has been renamed as in_polygon. It has been kept since it refers to polygons that may be scattered over different versions of watersurfaces_hab; it shows the spatial relation with the polygon at creation time.

Care has now been taken to be able to adopt existing points if these are overlapped by 'new' polygons (i.e. with a new polygon_id), provided that those points have in_polygon = TRUE.

The x and y coordinates assume CRS EPSG:31370, which we will need to include in Zenodo documentation.
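
As an illustration of what the CRS note implies for users, here is a minimal sketch of turning the table into a spatial object. It assumes the {sf} package and columns named x and y; the toy coordinates and polygon_id values are invented for the example and are not taken from the data source.

```r
library(sf)

# Toy stand-in for watersurfaces_refpoints (invented values, illustration only)
refpoints <- data.frame(
  polygon_id = c("A", "B"),
  x = c(150000, 160000),
  y = c(200000, 210000),
  in_polygon = c(TRUE, FALSE)
)

# x and y are interpreted as Belgian Lambert 72 coordinates (EPSG:31370)
refpoints_sf <- st_as_sf(refpoints, coords = c("x", "y"), crs = 31370)

# Inspect the EPSG code attached to the object
st_crs(refpoints_sf)$epsg
```

Since the CRS is not stored in the tsv/yml files themselves, it has to be supplied by the reader, which is why it needs to be documented.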

A reading function in {n2khab} is a next step.

The checksums of the respective versions since version watersurfaces_refpoints_v4 have been stored and can be used for verification. Versions are intended to be stored at Zenodo starting with watersurfaces_refpoints_v4.
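
A minimal sketch of such a verification in R, assuming the {digest} package. The helper name verify_xxh64 is hypothetical, and the sketch assumes that digest's xxhash64 output for a file corresponds to what the xxh64sum command-line tool reports. The demo file is a temporary stand-in, not the real data.

```r
library(digest)

# Hypothetical helper: compare a file's xxhash64 checksum with an expected value
verify_xxh64 <- function(path, expected) {
  actual <- digest(path, algo = "xxhash64", file = TRUE)
  identical(actual, expected)
}

# Demo on a temporary stand-in file (invented content)
tmp <- tempfile(fileext = ".tsv")
writeLines(c("polygon_id\tx\ty", "A\t150000\t200000"), tmp)

# In practice 'expected' would be the stored checksum of a released version
expected <- digest(tmp, algo = "xxhash64", file = TRUE)
ok <- verify_xxh64(tmp, expected)

# A changed file no longer verifies
cat("B\t160000\t210000\n", file = tmp, append = TRUE)
ok_after_change <- verify_xxh64(tmp, expected)
```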

Compiled HTML: update_watersurfaces_refpoints.html.zip

To reproduce, this project requires an n2khab_data setup that supports concurrent versions, by implementing approach [3] of inbo/n2khab#113, so you have e.g.:

$ tree 20_processed/_versions/watersurfaces_*
20_processed/_versions/watersurfaces_hab
├── watersurfaces_hab_2018
│   └── watersurfaces_hab.gpkg
├── watersurfaces_hab_v4
│   └── watersurfaces_hab.gpkg
├── watersurfaces_hab_v5
│   └── watersurfaces_hab.gpkg
├── watersurfaces_hab_v6
│   └── watersurfaces_hab.gpkg
└── watersurfaces_hab_v6.1_interim
    └── watersurfaces_hab.gpkg
20_processed/_versions/watersurfaces_refpoints
├── watersurfaces_refpoints_v4
│   ├── watersurfaces_refpoints.tsv
│   └── watersurfaces_refpoints.yml
├── watersurfaces_refpoints_v5
│   ├── watersurfaces_refpoints.tsv
│   └── watersurfaces_refpoints.yml
├── watersurfaces_refpoints_v6
│   ├── watersurfaces_refpoints.csv
│   ├── watersurfaces_refpoints.tsv
│   └── watersurfaces_refpoints.yml
└── watersurfaces_refpoints_v6.1_interim
    ├── watersurfaces_refpoints.tsv
    └── watersurfaces_refpoints.yml

11 directories, 14 files

Apart from a non-matching polygon_id, we also require a spatial non-match with existing reference points before deciding to define a new point. This caters more elegantly for the various cases of ID matching of habitatmap polygons between watersurfaces_hab versions, either through polygon_id or polygon_id_habitatmap, by testing spatial relationships more directly through existing points. As a consequence, some of the checks can be dropped, and the whole process is both simplified and made more complete with respect to adopting existing points. Note that recycled old points for polygons with a new ID are still added as new rows in watersurfaces_refpoints.
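
The decision rule described above can be sketched as follows. This is not the code of the bookdown project, only a toy reconstruction under stated assumptions: it uses {sf}, and the object names (polygons_new, points_old, needs_new_point), the helper sq and all geometries are invented for illustration.

```r
library(sf)

# Hypothetical helper: a 10 x 10 square with lower-left corner (x0, y0)
sq <- function(x0, y0) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + 10, y0), c(x0 + 10, y0 + 10), c(x0, y0 + 10), c(x0, y0)
  )))
}

# Toy 'new' polygons, i.e. with polygon_id values unknown to the old points
polygons_new <- st_sf(
  polygon_id = c("P1_new", "P2_new"),
  geometry = st_sfc(sq(0, 0), sq(100, 100), crs = 31370)
)

# Toy existing reference points; only points with in_polygon = TRUE may be adopted
points_old <- st_sf(
  polygon_id = c("P1_old", "P9_old"),
  in_polygon = c(TRUE, FALSE),
  geometry = st_sfc(st_point(c(5, 5)), st_point(c(105, 105)), crs = 31370)
)

# Condition 1: the polygon_id does not occur among existing points
id_match <- polygons_new$polygon_id %in% points_old$polygon_id

# Condition 2: no adoptable existing point is overlapped by the polygon
candidates <- points_old[points_old$in_polygon, ]
spatial_match <- lengths(st_intersects(polygons_new, candidates)) > 0

# A new point is only needed when both the ID match and the spatial match fail
needs_new_point <- !id_match & !spatial_match
```

In this toy case the first polygon adopts the existing point it overlaps, while the second polygon needs a new point because the only point it overlaps has in_polygon = FALSE.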

ToonHub · 2025-10-14T12:06:17Z

Nice!

Just some details:

For v5 of watersurfaces_refpoints I used the files in habitatwatersurfaces_cycle1/output of the n2khab-mhq-design repo, but I do not get the same xxh64sum. However, when comparing with the files of the v5 version in the Google Drive folder, the data appear to be identical. It is not clear to me why both sources result in a different xxh64sum.
The following code only works when the directory refpoints_path_current (watersurfaces_refpoints_v6) already exists.

file.copy(
  file.path(
    refpoints_path_previous,
    str_c("watersurfaces_refpoints", c(".tsv", ".yml"))
  ),
  refpoints_path_current,
  overwrite = TRUE
) %>%
  invisible()

So, I suggest adding the following code:

if (!dir.exists(refpoints_path_current)) { dir.create(refpoints_path_current) }
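
A self-contained sketch of the combined steps, using temporary stand-in directories and empty placeholder files rather than the real data. It uses base R only; paste0() stands in for str_c(), and the recursive and showWarnings arguments are an optional variation, not necessarily what the project ended up using.

```r
# Stand-in directories (temporary, for illustration only)
refpoints_path_previous <- file.path(tempdir(), "watersurfaces_refpoints_v5")
refpoints_path_current <- file.path(tempdir(), "watersurfaces_refpoints_v6")

# Empty placeholder files playing the role of the previous version
files <- paste0("watersurfaces_refpoints", c(".tsv", ".yml"))
dir.create(refpoints_path_previous, showWarnings = FALSE)
file.create(file.path(refpoints_path_previous, files))

# Create the target directory if missing; recursive = TRUE also creates
# missing parent directories, showWarnings = FALSE stays quiet if it exists
dir.create(refpoints_path_current, recursive = TRUE, showWarnings = FALSE)

# Copy both files into the (now existing) target directory
copied <- file.copy(
  file.path(refpoints_path_previous, files),
  refpoints_path_current,
  overwrite = TRUE
)
all(copied)
```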

ToonHub · 2025-10-14T12:43:25Z

See below the compiled HTML I get. Note that the xxh64sum values of the resulting git2rdata files differ from those in the HTML you provided above. Yet the hash and the data hash in the git2rdata yml file are the same. Strange...

update_watersurfaces_refpoints.zip

src/update_watersurfaces_refpoints/10_update_watersurfaces_refpoints.Rmd

florisvdh · 2025-10-24T16:00:49Z

The following code only works when the directory refpoints_path_current (watersurfaces_refpoints_v6) already exists.

file.copy(
  file.path(
    refpoints_path_previous,
    str_c("watersurfaces_refpoints", c(".tsv", ".yml"))
  ),
  refpoints_path_current,
  overwrite = TRUE
) %>%
  invisible()

So, I suggest adding the following code:

if (!dir.exists(refpoints_path_current)) { dir.create(refpoints_path_current) }

Thanks; implemented (slightly altered) in 90b307b.

florisvdh · 2025-10-24T16:08:40Z

Regarding the different checksums of local {git2rdata} files in Windows and the ones saved in the git / GitHub repo (yet pushed from the same Windows machine), we found that carriage return characters were inserted in all lines in the working directory on Windows.

This needs further attention; maybe related to ropensci/git2rdata#49 or to git or RStudio behaviour in Windows (git indeed has techniques to add/remove the CR character in Windows, depending on settings).

In the worst case, we could refrain from using file checksums and use in-memory checksums in R with digest::digest(..., algo = "xxhash64"), but that remains inconvenient with regard to file sharing and integrity, and it would pose an extra challenge for future file version checking by the {n2khab} package.
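
To make the issue concrete, here is a small sketch that detects carriage returns in a file and contrasts file checksums with in-memory checksums. It assumes the {digest} package; the helper has_cr and the two temporary files are invented for illustration, and hashing the lines returned by readLines() is only one possible form of in-memory checksum.

```r
library(digest)

# Hypothetical helper: does the file contain any carriage return (0x0d) byte?
has_cr <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  any(bytes == as.raw(0x0d))
}

# Two temporary files with the same content but different line endings
lf <- tempfile()
crlf <- tempfile()
writeBin(charToRaw("a\tb\n1\t2\n"), lf)
writeBin(charToRaw("a\tb\r\n1\t2\r\n"), crlf)

cr_in_lf <- has_cr(lf)
cr_in_crlf <- has_cr(crlf)

# File checksums are computed on the raw bytes, so they differ
file_sums_equal <- identical(
  digest(lf, algo = "xxhash64", file = TRUE),
  digest(crlf, algo = "xxhash64", file = TRUE)
)

# readLines() accepts LF and CRLF as line endings, so the in-memory
# content (and hence its checksum) is the same for both files
mem_sums_equal <- identical(
  digest(readLines(lf), algo = "xxhash64"),
  digest(readLines(crlf), algo = "xxhash64")
)
```

The in-memory variant is insensitive to line endings, but it can only be checked after reading the file in R, which is the inconvenience mentioned above.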

To be further looked at.

florisvdh · 2025-10-24T16:09:58Z

Updated compiled HTML: update_watersurfaces_refpoints.html.zip

florisvdh added 12 commits September 22, 2025 18:06

Update watsurf_refpts: add first version of bookdown project

3663642

Update watsurf_refpts: print refpoint objects

7188427

Update watsurf_refpts, check: also print watersurfaces_refpoints

b19d9b7

Update watsurf_refpts: adjust renv settings to snapshot each installed pkg

06db150

Update watsurf_refpts, renv: snapshot codetools pkg

4cd76c4

Update watsurf_refpts, renv: snapshot n2khab@7ea9b55

e778863

Update watsurf_refpts, check: add checks for complete- & uniqueness

70b9608

Update watsurf_refpts: fix refpts_previous to meet uniqueness check

38ac443

Update watsurf_refpts: add col 'in_polygon' as requirement to recycle

70e7724

Update watsurf_refpts: deal with col 'in_polygon' in the vc files

583373d

Update watsurf_refpts: add md comment wrt the multi-point overlap case

37551ee

florisvdh requested a review from ToonHub September 29, 2025 16:54

florisvdh mentioned this pull request Sep 29, 2025

Update interim version of watersurfaces_refpoints #80

Open

florisvdh added 2 commits September 30, 2025 10:37

Update watsurf_refpts: add md comment wrt recycling of interim pts

bb9d23a

Update watsurf_refpts: minor text fix wrt in_polygon

834ac39

ToonHub approved these changes Oct 14, 2025

ToonHub reviewed Oct 15, 2025

src/update_watersurfaces_refpoints/10_update_watersurfaces_refpoints.Rmd (outdated, resolved)

florisvdh added 2 commits October 24, 2025 17:55

Update watsurf_refpts: create target directory if missing

90b307b

Update watsurf_refpts: increase digits arg (thanks @ToonHub)

a17a9d0
