feat: add GTFS Loader #150

Merged
merged 28 commits into from
Feb 13, 2023
28 commits
b363049
chore: add gtfs_kit dependency
piotrgramacki Dec 23, 2022
7aaf411
chore: add trips count to GTFSLoader
piotrgramacki Dec 28, 2022
b286a72
fix: remove time_resolution from loader init
piotrgramacki Jan 10, 2023
3778e9c
chore: load directions from gtfs feed
piotrgramacki Jan 10, 2023
8823bff
Merge branch 'main' into 145-implement-gtfs-loader
piotrgramacki Jan 10, 2023
161ea14
fix: update pdm.lock
piotrgramacki Jan 10, 2023
37955c8
Merge branch 'main' into 145-implement-gtfs-loader
piotrgramacki Jan 15, 2023
22f48d4
fix: make gtfs_kit import optional, fix imports in __init__.py, use W…
piotrgramacki Jan 15, 2023
719fff3
chore: add example notebook for GTFSLoader
piotrgramacki Jan 15, 2023
1337290
chore: add GTFS feed validation
piotrgramacki Jan 16, 2023
ce21888
fix: use warnings for gtfs validation error
piotrgramacki Jan 16, 2023
a7c21b7
test: add pytest-mock dependency and gtfs loader validation tests
piotrgramacki Jan 16, 2023
cac509e
fix: handle faulty B028 error from flake8
piotrgramacki Jan 16, 2023
4b59f0d
test: add GTFSLoader tests
piotrgramacki Jan 20, 2023
bead054
chore: update CHANGELOG
piotrgramacki Jan 20, 2023
e3edd50
fix: properly type GTFSLoader methods
piotrgramacki Jan 20, 2023
34361a4
fix: add missing no cover comment
piotrgramacki Jan 20, 2023
7370133
fix: improve load docstring
piotrgramacki Jan 21, 2023
5f1fe7c
fix: remove newest gtfs_kit version from requirements and add min sup…
piotrgramacki Jan 25, 2023
f4d9abd
fix: extract departure time parsing to method
piotrgramacki Jan 25, 2023
0dac59d
test: reuse fixtures which are not modified
piotrgramacki Jan 25, 2023
8d3d189
fix: update wrong fixture scope
piotrgramacki Jan 25, 2023
e56a781
fix: download example gtfs without wget
piotrgramacki Jan 28, 2023
a24e8c2
test: add gtfs_kit to optional dependencies tests
piotrgramacki Jan 28, 2023
2cbd960
Merge branch 'main' into 145-implement-gtfs-loader
piotrgramacki Feb 8, 2023
44c2a3e
fix: resolve conflicts with main
piotrgramacki Feb 9, 2023
0a7941b
fix(pre-commit.ci): auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Feb 9, 2023
b45cbd1
fix: add requests type stubs to mypy
piotrgramacki Feb 13, 2023
2 changes: 1 addition & 1 deletion .flake8
@@ -1,7 +1,7 @@
[flake8]
max-line-length = 100
max-doc-length = 100
-extend-ignore = E203
+extend-ignore = E203,B028
exclude =
.git,
.venv,
1 change: 1 addition & 0 deletions .pre-commit-config.yaml
@@ -37,6 +37,7 @@ repos:
rev: v0.991
hooks:
- id: mypy
additional_dependencies: ["types-requests"]
- repo: https://github.com/pdm-project/pdm
rev: 2.4.3
hooks:
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -7,6 +7,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased] - 2022-MM-DD

### Added
- GTFS Loader from gtfs2vec paper

### Changed
- Change embedders and joiners interface to have `.transform` method
1 change: 1 addition & 0 deletions examples/loaders/README.md
@@ -3,3 +3,4 @@
Examples illustrating the usage of every Loader.

- [GeoparquetLoader](geoparquet_loader.ipynb)
- [GTFSLoader](gtfs_loader.ipynb)
2 changes: 2 additions & 0 deletions examples/loaders/files/.gitignore
@@ -0,0 +1,2 @@
# example GTFS used in notebook
example.zip
127 changes: 127 additions & 0 deletions examples/loaders/gtfs_loader.ipynb
@@ -0,0 +1,127 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# GTFS Loader Example"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from pathlib import Path\n",
"from srai.loaders import GTFSLoader\n",
"import gtfs_kit as gk\n",
"import geopandas as gpd\n",
"import numpy as np\n",
"from shapely.geometry import Point\n",
"from srai.utils.constants import WGS84_CRS\n",
"from utils import download"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Download an example GTFS feed from Wroclaw, Poland\n",
"\n",
"This notebook uses the GTFS feed for Wroclaw, Poland, as an example. The feed is published in Wroclaw's open data repository[1]. The cell below downloads it from transitfeeds.com[2], but you can also download it directly from the open data repository.\n",
"\n",
"1. https://www.wroclaw.pl/open-data/dataset/rozkladjazdytransportupublicznegoplik_data\n",
"2. https://transitfeeds.com/p/mpk-wroc-aw/663"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"wroclaw_gtfs = Path().resolve() / \"files\" / \"example.zip\"\n",
"gtfs_url = \"https://transitfeeds.com/p/mpk-wroc-aw/663/20221221/download\"\n",
"\n",
"download(gtfs_url, wroclaw_gtfs.as_posix())"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Peek at the feed using `gtfs_kit` directly"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"feed = gk.read_feed(wroclaw_gtfs, dist_units=\"km\")\n",
"\n",
"stops_df = feed.stops[[\"stop_id\", \"stop_lat\", \"stop_lon\"]].set_index(\"stop_id\")\n",
"stops_df[\"geometry\"] = stops_df.apply(lambda row: Point(row[\"stop_lon\"], row[\"stop_lat\"]), axis=1)\n",
"\n",
"stops_gdf = gpd.GeoDataFrame(\n",
" stops_df,\n",
" geometry=\"geometry\",\n",
" crs=WGS84_CRS,\n",
")\n",
"\n",
"stops_gdf.plot(markersize=1)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Use GTFSLoader to load stops statistics from the feed"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"gtfs_loader = GTFSLoader()\n",
"trips_gdf = gtfs_loader.load(wroclaw_gtfs)\n",
"\n",
"print(trips_gdf.columns)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.14"
},
"vscode": {
"interpreter": {
"hash": "f39c7279c85c8be5d827e53eddb5011e966102d239fe8b81ca4bd9f0123eda8f"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}
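The "Peek at the feed" cell above relies on `gtfs_kit`. Since a GTFS feed is a zip archive of CSV text files, the same stop coordinates can also be inspected with the standard library alone. The following is only a minimal sketch, not part of this PR: `read_stops` and `build_demo_feed` are hypothetical helper names, the two stops are made-up data, and the sketch assumes `stops.txt` sits at the root of the archive.

```python
import csv
import io
import zipfile
from typing import Dict, List


def read_stops(gtfs_zip) -> List[Dict[str, str]]:
    """Return the rows of stops.txt from a GTFS zip as dictionaries.

    Hypothetical helper; accepts a path or a binary file-like object.
    """
    with zipfile.ZipFile(gtfs_zip) as archive:
        with archive.open("stops.txt") as raw:
            # utf-8-sig drops the byte-order mark that some feeds prepend.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return list(csv.DictReader(text))


def build_demo_feed() -> io.BytesIO:
    """Build a tiny in-memory feed with made-up stops, for demonstration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "stops.txt",
            "stop_id,stop_name,stop_lat,stop_lon\n"
            "1,Demo A,51.10,17.03\n"
            "2,Demo B,51.11,17.04\n",
        )
    buffer.seek(0)
    return buffer


stops = read_stops(build_demo_feed())
# Collect (lon, lat) pairs, the same x/y order the notebook passes to Point.
coords = [(float(s["stop_lon"]), float(s["stop_lat"])) for s in stops]
print(coords)
```

To try this on the downloaded feed, `read_stops` could be given the path to `files/example.zip` instead of the in-memory demo archive.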
28 changes: 28 additions & 0 deletions examples/loaders/utils.py
@@ -0,0 +1,28 @@
"""Utility functions for loaders examples."""
import requests
from tqdm import tqdm


def download(url: str, fname: str, chunk_size: int = 1024) -> None:
    """
    Download a file with a progress bar.

    Args:
        url (str): URL to download.
        fname (str): File name.
        chunk_size (int): Chunk size in bytes.

    Source: https://gist.github.com/yanqd0/c13ed29e29432e3cf3e7c38467f42f51
    """
    # Stream the response so the body is read chunk by chunk.
    resp = requests.get(url, stream=True)
    # Fall back to 0 when the server sends no content-length header.
    total = int(resp.headers.get("content-length", 0))
    with open(fname, "wb") as file, tqdm(
        desc=fname.split("/")[-1],
        total=total,
        unit="iB",
        unit_scale=True,
        unit_divisor=1024,
    ) as bar:
        for data in resp.iter_content(chunk_size=chunk_size):
            # file.write returns the number of bytes written.
            size = file.write(data)
            bar.update(size)
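
The write loop in `download` can be exercised without network access by feeding it any iterable of byte chunks. The sketch below is an illustration rather than code from this PR: `write_chunks` is a hypothetical name, and it leaves out `requests` and `tqdm` so that only the chunked-write logic and the `split("/")[-1]` label remain.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def write_chunks(chunks: Iterable[bytes], fname: str) -> int:
    """Write byte chunks to fname and return the total number of bytes written.

    Hypothetical helper mirroring the loop in download().
    """
    written = 0
    with open(fname, "wb") as file:
        for data in chunks:
            # file.write returns the byte count; download() passes this
            # value to the progress bar.
            written += file.write(data)
    return written


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.zip"
    # as_posix() gives forward slashes, so split("/") also works on Windows.
    total = write_chunks([b"abc", b"de", b""], target.as_posix())
    size_on_disk = target.stat().st_size
    label = target.as_posix().split("/")[-1]

print(total, size_on_disk, label)
```

This is also why the notebook passes `wroclaw_gtfs.as_posix()` to `download`: the progress-bar label is taken from the text after the last forward slash.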