✨ PyogrioReaderIterDataPipe for reading vector OGR files #19

Merged
merged 4 commits into from
Jun 9, 2022

weiji14
Owner

@weiji14 weiji14 commented Jun 9, 2022

An iterable-style DataPipe for vector data! Also added a Python 3.8 job to the CI build matrix that doesn't include the 'vector' dependencies. That job is also skipped when the PR is in draft mode.

I/O is handled using pyogrio. The IterDataPipe is based on https://github.com/pytorch/data/blob/v0.3.0/torchdata/datapipes/iter/load/iopath.py#L37-L83
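
The iterable-style pattern described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the merged implementation: the class name `VectorReaderSketch` and the injectable `reader` argument are hypothetical, the real class presumably subclasses torchdata's `IterDataPipe`, and the `(filename, data)` tuple shape simply mirrors the linked torchdata file opener.

```python
from typing import Any, Callable, Iterable, Iterator, Tuple


class VectorReaderSketch:
    """Hypothetical stand-in for an iterable-style vector reader.

    Takes an iterable of file paths and lazily yields one
    (filename, loaded_data) pair per path.
    """

    def __init__(
        self,
        source: Iterable[str],
        reader: Callable[..., Any],
        **kwargs: Any,
    ) -> None:
        self.source = source
        self.reader = reader  # e.g. pyogrio.read_dataframe in real use
        self.kwargs = kwargs  # extra keyword arguments forwarded to the reader

    def __iter__(self) -> Iterator[Tuple[str, Any]]:
        for filename in self.source:
            # Nothing is read until the consumer asks for the next item.
            yield filename, self.reader(filename, **self.kwargs)


def fake_reader(path: str, **options: Any) -> dict:
    """Stand-in reader so the sketch runs without pyogrio installed."""
    return {"path": path, "options": options}


pipe = VectorReaderSketch(["a.gpkg", "b.geojson"], reader=fake_reader, layer="roads")
results = list(pipe)
```

With the 'vector' extras installed, passing `reader=pyogrio.read_dataframe` in place of `fake_reader` would presumably return a GeoDataFrame for each file.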

Preview at https://zen3geo--19.org.readthedocs.build/en/19/api.html#module-zen3geo.datapipes.pyogrio

Note that since pyogrio is an optional dependency, users need to run `pip install zen3geo[vector]` to install the extra 'vector' packages, which include pyogrio and geopandas.
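
One common way to surface a missing optional dependency is an import guard that points users at the extras install command. The helper below is a hypothetical sketch (the name `require_vector_extras` is not from zen3geo), shown only to illustrate the idea.

```python
def require_vector_extras():
    """Return the pyogrio module, or raise with an install hint.

    Hypothetical helper: zen3geo's own handling of the missing
    optional dependency may differ.
    """
    try:
        import pyogrio
    except ImportError as err:
        raise ImportError(
            "pyogrio is an optional dependency; "
            "install it with: pip install zen3geo[vector]"
        ) from err
    return pyogrio
```

Calling `require_vector_extras()` at the top of a reader would fail early with an actionable message instead of a bare `ModuleNotFoundError`.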

TODO:

  • Add pyogrio as optional dependency
  • Initial implementation of PyogrioReaderIterDataPipe
  • Update CI build matrix to include optional pyogrio dependency in one run

References:

Vectorized vector I/O using OGR!
An iterable-style DataPipe for vector data! Uses pyogrio with geopandas for the I/O. Included a doctest and unit test, added a new section in the API docs and some more intersphinx mappings.
@weiji14 weiji14 added the feature New feature or request label Jun 9, 2022
@weiji14 weiji14 added this to the 0.2.0 milestone Jun 9, 2022
@weiji14 weiji14 self-assigned this Jun 9, 2022
Comment on lines +33 to +34
Extra keyword arguments to pass to
`pyogrio.read_dataframe <https://pyogrio.readthedocs.io/en/latest/api.html#geopandas-integration>`_.
Owner Author

@weiji14 weiji14 Jun 9, 2022

Ideally the intersphinx directive would look like this, but the permalink is not available for some reason. Closest I could get to pyogrio.read_dataframe is using https://pyogrio.readthedocs.io/en/latest/api.html#geopandas-integration.

Suggested change
- Extra keyword arguments to pass to
- `pyogrio.read_dataframe <https://pyogrio.readthedocs.io/en/latest/api.html#geopandas-integration>`_.
+ Extra keyword arguments to pass to :py:func:`pyogrio.read_dataframe`.

Edit: PR at geopandas/pyogrio#130 to resolve this.
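
For context, a `:py:func:` cross-reference to another project resolves only when Sphinx's intersphinx extension is enabled and has an inventory entry for that project. A minimal `conf.py` sketch follows; the project's actual configuration is not shown in this thread, so the exact entries here are an assumption based on the pyogrio docs URL quoted above.

```python
# Sketch of the relevant part of a Sphinx conf.py (assumed, not copied
# from zen3geo's repository).
extensions = [
    "sphinx.ext.intersphinx",  # resolves references into other projects' docs
]

# Maps a project name to (base URL of its docs, inventory file or None).
# None tells Sphinx to fetch objects.inv from the base URL.
intersphinx_mapping = {
    "pyogrio": ("https://pyogrio.readthedocs.io/en/latest/", None),
}
```

The role can only resolve if the target object is actually listed in pyogrio's published inventory, which is what the missing permalink mentioned above affects.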

@weiji14 weiji14 force-pushed the pyogrio branch 5 times, most recently from 8d16ab1 to f65b70f Compare June 9, 2022 14:05
Making a proper build matrix now! Minimal tests (no optional dependencies) run on Python 3.8, while full tests (with all dependencies) run on Python 3.9.

Wanted to do Python 3.10 for full tests, but need to wait for rasterio 1.3.0 to come out of beta first.
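
For the minimal job to pass without the optional packages, tests that need them have to be skipped rather than fail. A stdlib-only sketch of one way to do that is below; the test class and names are hypothetical, and the project's real test suite may use a different mechanism.

```python
import importlib.util
import unittest

# True only when the optional 'vector' dependency is importable.
HAS_PYOGRIO = importlib.util.find_spec("pyogrio") is not None


class TestVectorReader(unittest.TestCase):
    @unittest.skipUnless(HAS_PYOGRIO, "requires the optional 'vector' extras")
    def test_pyogrio_available(self):
        import pyogrio

        self.assertTrue(hasattr(pyogrio, "read_dataframe"))


def run_suite() -> unittest.TestResult:
    """Run the sketch test case and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestVectorReader)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

In the minimal environment the test is reported as skipped, and in the full environment it runs normally, so both matrix entries can share one test file.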
@weiji14 weiji14 marked this pull request as ready for review June 9, 2022 14:24
Conserve GitHub Actions Continuous Integration resources when a Pull Request is in draft mode.
@@ -21,6 +21,7 @@ classifiers = [
python = "^3.8"
rioxarray = ">=0.10.0"
torchdata = ">=0.3.0"
pyogrio = {version = ">=0.4.0a1", extras = ["geopandas"], optional = true}
Owner Author

@weiji14 weiji14 Jun 9, 2022

Need to bump this to the stable pyogrio 0.4.0 version once it is released. Edit: see PR #21.

@weiji14 weiji14 merged commit f1f7652 into main Jun 9, 2022
@weiji14 weiji14 deleted the pyogrio branch June 9, 2022 14:34
weiji14 added a commit that referenced this pull request Jun 22, 2022
Use the stable version of pyogrio, patch the intersphinx link as mentioned in #19 (comment), and bump the `geopandas` version in poetry.lock from 0.10.2 to 0.11.0 to reduce the number of deprecation warnings.

* 📌 Pin minimum pyogrio version to 0.4.0

Bumps [pyogrio](https://github.com/geopandas/pyogrio) from 0.4.0a1 to 0.4.0.
- [Release notes](https://github.com/geopandas/pyogrio/releases)
- [Changelog](https://github.com/geopandas/pyogrio/blob/v0.4.0/CHANGES.md)
- [Commits](geopandas/pyogrio@v0.4.0a1...v0.4.0)

* 📝 Update permalink to pyogrio.read_dataframe

Get the intersphinx link to point to https://pyogrio.readthedocs.io/en/latest/api.html#pyogrio.read_dataframe.

* ⬆️ Bump geopandas from 0.10.2 to 0.11.0

Bumps [geopandas](https://github.com/geopandas/geopandas) from 0.10.2 to 0.11.0.
- [Release notes](https://github.com/geopandas/geopandas/releases)
- [Changelog](https://github.com/geopandas/geopandas/blob/main/CHANGELOG.md)
- [Commits](geopandas/geopandas@v0.10.2...v0.11.0)
Labels
feature New feature or request