consolidate_intersections seems to miss some nodes, and edges have inconsistent integer types #1241

Closed
3 tasks done
LBasara opened this issue Dec 4, 2024 · 5 comments

LBasara commented Dec 4, 2024

Contributing guidelines

  • I understand the contributing guidelines

Documentation

  • My problem is not addressed by the documentation or examples

Existing issues

  • My problem does not appear in an existing issue

What operating system and Python version are you using?

Ubuntu 24.10, Python 3.12.7

What OSMnx version are you using?

v2.0.0

Environment packages and versions

Package                   Version
------------------------- --------------
anyio                     4.6.2.post1
argon2-cffi               23.1.0
argon2-cffi-bindings      21.2.0
arrow                     1.3.0
asttokens                 3.0.0
async-lru                 2.0.4
attrs                     24.2.0
babel                     2.16.0
beautifulsoup4            4.12.3
bleach                    6.2.0
branca                    0.8.0
certifi                   2024.8.30
cffi                      1.17.1
charset-normalizer        3.4.0
comm                      0.2.2
contourpy                 1.3.1
cycler                    0.12.1
debugpy                   1.8.9
decorator                 5.1.1
defusedxml                0.7.1
executing                 2.1.0
fastjsonschema            2.21.1
folium                    0.18.0
fonttools                 4.55.1
fqdn                      1.5.1
geopandas                 1.0.1
h11                       0.14.0
httpcore                  1.0.7
httpx                     0.28.0
idna                      3.10
ipykernel                 6.29.5
ipython                   8.30.0
isoduration               20.11.0
jedi                      0.19.2
jinja2                    3.1.4
joblib                    1.4.2
json5                     0.10.0
jsonpointer               3.0.0
jsonschema                4.23.0
jsonschema-specifications 2024.10.1
jupyter-client            8.6.3
jupyter-core              5.7.2
jupyter-events            0.10.0
jupyter-lsp               2.2.5
jupyter-server            2.14.2
jupyter-server-terminals  0.5.3
jupyterlab                4.3.2
jupyterlab-pygments       0.3.0
jupyterlab-server         2.27.3
kiwisolver                1.4.7
mapclassify               2.8.1
markupsafe                3.0.2
matplotlib                3.9.3
matplotlib-inline         0.1.7
mistune                   3.0.2
nbclient                  0.10.1
nbconvert                 7.16.4
nbformat                  5.10.4
nest-asyncio              1.6.0
networkx                  3.4.2
notebook-shim             0.2.4
numpy                     2.1.3
osmnx                     2.0.0
overrides                 7.7.0
packaging                 24.2
pandas                    2.2.3
pandocfilters             1.5.1
parso                     0.8.4
pexpect                   4.9.0
pillow                    11.0.0
platformdirs              4.3.6
prometheus-client         0.21.1
prompt-toolkit            3.0.48
psutil                    6.1.0
ptyprocess                0.7.0
pure-eval                 0.2.3
pycparser                 2.22
pygments                  2.18.0
pyogrio                   0.10.0
pyparsing                 3.2.0
pyproj                    3.7.0
python-dateutil           2.9.0.post0
python-json-logger        2.0.7
pytz                      2024.2
pyyaml                    6.0.2
pyzmq                     26.2.0
referencing               0.35.1
requests                  2.32.3
rfc3339-validator         0.1.4
rfc3986-validator         0.1.1
rpds-py                   0.22.1
scikit-learn              1.5.2
scipy                     1.14.1
send2trash                1.8.3
setuptools                75.6.0
shapely                   2.0.6
six                       1.16.0
sniffio                   1.3.1
soupsieve                 2.6
stack-data                0.6.3
terminado                 0.18.1
threadpoolctl             3.5.0
tinycss2                  1.4.0
tornado                   6.4.2
traitlets                 5.14.3
types-python-dateutil     2.9.0.20241003
tzdata                    2024.2
uri-template              1.3.0
urllib3                   2.2.3
wcwidth                   0.2.13
webcolors                 24.11.1
webencodings              0.5.1
websocket-client          1.8.0
xyzservices               2024.9.0

How did you install OSMnx?

Other (describe below)

Problem description

Install

Initialized a new uv environment, then ran
uv add jupyterlab 'osmnx>=2' folium geopandas
and further added matplotlib and mapclassify for folium visualization.

What did you do?

I love the consolidate_intersections function.
I used it on a small bbox (side length 6*2 km), with a tolerance of 200 m.
I expected to see nodes at each "visual" intersection of edges.

What actually happened instead

There seem to be no nodes where I would expect them, notably at some dead ends (brown potato) and at road intersections (black potatoes). The red potato is at the edge of the bbox, so that one is fine. The image shows the west/south-west part of the area covered by the code below.

image

In addition, the edges of the reconstructed graph appear to mix integer types (edge source is a Python int, edge target is np.int64):
image

Complete minimal reproducible example

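import osmnx as ox

# download the street network within 6 km of the point
lat, lon = 48.1, 0
G = ox.graph_from_point((lat, lon), dist=6000)

# project the graph, then consolidate nodes with a 200 m tolerance
Gproj = ox.projection.project_graph(G)
Gcons = ox.simplification.consolidate_intersections(Gproj, tolerance=200)

# plot the consolidated edges, with the nodes overlaid in red
dfcn, dfce = ox.graph_to_gdfs(Gcons)
m = dfce.explore()
dfcn.explore(m=m, color="red")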
LBasara added the bug label Dec 4, 2024
Owner

gboeing commented Dec 5, 2024

I expected to see nodes at each "visual" intersection of edges... There seems to be no nodes where I would expect ones

As far as I can tell, it's working as expected given how you parameterized it. You set tolerance=200 which means you're agglomeratively merging together nodes within 400 meters of each other (i.e., a 200 meter radius) along the network. This is just a really huge tolerance given how dense your network is, so you end up building huge clusters that chain things together until hundreds of nodes all merge into one. If you change to, say, tolerance=20, you'll see results that likely match your expectations better.
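The chaining effect described above can be illustrated with a small geometric sketch. This is not OSMnx's implementation: it only assumes that two nodes end up in the same cluster whenever their tolerance-radius discs overlap, applied transitively, and it ignores network topology. The function name `cluster_by_overlap` and the evenly spaced test nodes are hypothetical.

```python
from math import dist


def cluster_by_overlap(points, tolerance):
    """Count clusters formed when discs of radius `tolerance` overlap, transitively."""
    parent = list(range(len(points)))

    def find(i):
        # follow parent links to the cluster root, compressing the path as we go
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            # two discs of radius `tolerance` overlap when their centers
            # are less than 2 * tolerance apart
            if dist(points[i], points[j]) < 2 * tolerance:
                parent[find(i)] = find(j)

    return len({find(i) for i in range(len(points))})


# hypothetical nodes spaced every 300 m along a 3 km straight road
nodes = [(float(x), 0.0) for x in range(0, 3001, 300)]

print(cluster_by_overlap(nodes, tolerance=200))  # 1
print(cluster_by_overlap(nodes, tolerance=20))   # 11
```

With a 200 m tolerance, every node is within 400 m of its neighbor, so the whole road collapses into a single cluster even though its ends are 3 km apart. With a 20 m tolerance, no discs overlap and every node stays separate.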

In addition, there appears to be a mix between the edges formats of the reconstructed Graph (edge source is Python int, edge target is np.int64

Yes I think this is the result of recent updates to pandas or numpy... maybe pandas factorize. Essentially, OSMnx nodes are Python int, but the consolidated nodes get assigned new IDs, which used to also be Python int, but pandas now seems to provide as np.int64. They should work the same in practice, but it's kind of ugly (and weird to have multiple and redundant data types). I'm open to a PR to homogenize the types if it doesn't create a noticeable performance hit.


gboeing commented Dec 10, 2024

There seems to be some inconsistency with how NetworkX handles data with pandas>=2.

For example, if we run this snippet with NetworkX v3.4 and pandas v1.5.3:

import networkx as nx
import pandas as pd

# nodes come saved as a path in a pandas series
nodes = pd.Series([333, 555])

# add the nodes and the edge
G = nx.MultiDiGraph()
G.add_nodes_from(nodes)
G.add_edge(nodes[0], nodes[1])

print(G.nodes)  # [333, 555]
print(G.edges)  # [(333, 555, 0)]

...the resulting graph nodes are of type int, and the edge is a tuple of ints, as expected.

However, if we run that snippet again, but this time with NetworkX v3.4.2 and pandas v2.2.3:

import networkx as nx
import pandas as pd

# nodes come saved as a path in a pandas series
nodes = pd.Series([333, 555])

# add the nodes and the edge
G = nx.MultiDiGraph()
G.add_nodes_from(nodes)
G.add_edge(nodes[0], nodes[1])

print(G.nodes)  # [333, 555]
print(G.edges)  # [(333, np.int64(555), 0)]

...the resulting graph nodes are still of type int, and the edge's u is an int, but the edge's v is type np.int64. Recent versions of pandas use these numpy types, but why would NetworkX inconsistently make the u a Python int but make the v a np.int64?

@LBasara this seems to be what you're seeing with these inconsistent types after consolidating intersections in your graph. It may be worth asking NetworkX in an upstream issue, since this didn't use to happen and this inconsistency probably shouldn't be happening.

Edit: for reference, these inconsistent edge u and v value types are being added to the graph by OSMnx here.


gboeing commented Dec 10, 2024

I opened an upstream issue at networkx/networkx#7763 to see if we get any insights.


gboeing commented Dec 13, 2024

Per the conversation in networkx/networkx#7763, this is an upstream issue with pandas's inconsistency in return types when one is accessing objects with __getitem__ vs iterating over items. It only shows up now because numpy>=2 now displays values along with their data type. This is what you're seeing.
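To see why the edge's u can stay a Python int while its v becomes a different type, here is a minimal sketch using only the standard library. It assumes a simplified dict-of-dicts adjacency that loosely mimics how a directed graph might store nodes and edges; it is not NetworkX's actual code. `FakeInt64` is a hypothetical stand-in for np.int64: it compares equal to, and hashes like, a plain int.

```python
class FakeInt64(int):
    """Hypothetical stand-in for np.int64: equals and hashes like a plain int."""

    def __repr__(self):
        return f"FakeInt64({int(self)})"


# simplified adjacency structure: {u: {v: edge_data}}
adj = {}


def add_node(n):
    # an existing, equal key is left untouched, so its original object is kept
    adj.setdefault(n, {})


def add_edge(u, v):
    add_node(u)
    add_node(v)
    # v becomes a brand-new key in u's inner dict, so v's own type is stored
    adj[u][v] = {}


# nodes arrive by iteration, which yields plain ints here
for n in [333, 555]:
    add_node(n)

# edge endpoints arrive by indexing, which yields the wrapper type here
add_edge(FakeInt64(333), FakeInt64(555))

nodes = list(adj)
edges = [(u, v) for u, nbrs in adj.items() for v in nbrs]

print(nodes)  # [333, 555]
print(edges)  # [(333, FakeInt64(555))]
```

Because a dict keeps the first key object inserted for a given hash and value, the outer keys remain the plain ints added first, while the inner key is whatever object was passed as the edge target.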

You can use numpy.set_printoptions to prevent this visual inconsistency in the results:

import numpy as np
import osmnx as ox

point = 48.1, 0
G = ox.graph.graph_from_point(point, dist=500)
Gp = ox.projection.project_graph(G)
Gc = ox.simplification.consolidate_intersections(Gp, tolerance=10)

print(list(Gc.edges)[:3])
# prints [(0, np.int64(1), 0), (1, np.int64(4), 0), (1, np.int64(0), 0)]

np.set_printoptions(legacy="1.25")
print(list(Gc.edges)[:3])
# prints [(0, 1, 0), (1, 4, 0), (1, 0, 0)]
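If the goal is to obtain builtin ints rather than only change how values are displayed, one option is to cast when reading the edges. The sketch below assumes every element of each edge tuple (u, v, key) is integer-like, which matches the output shown above but is not guaranteed for arbitrary graphs. The helper name `to_builtin_ints` and the `Int64Like` stand-in for np.int64 are hypothetical.

```python
def to_builtin_ints(edges):
    """Return a list of edge tuples with every element cast to a builtin int."""
    return [tuple(int(x) for x in edge) for edge in edges]


class Int64Like(int):
    """Hypothetical stand-in for np.int64, used so this sketch needs no numpy."""

    def __repr__(self):
        return f"Int64Like({int(self)})"


# mixed-type edges shaped like the (u, v, key) tuples printed above
mixed = [(0, Int64Like(1), 0), (1, Int64Like(4), 0), (1, Int64Like(0), 0)]
clean = to_builtin_ints(mixed)

print(clean)  # [(0, 1, 0), (1, 4, 0), (1, 0, 0)]
```

This only normalizes a copy of the edge list; the graph itself is unchanged.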

gboeing closed this as completed Dec 13, 2024

gboeing commented Dec 13, 2024
