Feature Request: geospatial column analysis #1473

Open
@proselotis

Description

Missing functionality

As a newer user of ydata-profiling, I am using the package with various types of data. One of these is geospatial data, which I currently store as separate Latitude and Longitude columns of float values. While the bar chart and correlation output may be useful in some cases, it would often be more useful to show the data on a map so the user can see how well different areas are covered. Unless I missed something while reading through the documentation, there is currently no functionality for this.

I noticed that the contributing guidelines highlight this as a potential EDA contribution: extending data type support (GPS coordinates).

Proposed feature

Plot the points on a map within the variable exploration so users can assess the coverage areas of the data.

I am happy to help contribute on a topic like this. However, there are a few different ways this could be implemented, so I wanted to know whether the ydata team has a preference.

Data size

  • Depending on the data size, plotting points may take too much time or compute with some packages. This could be avoided in a few ways; one of my initial thoughts is to leverage Datashader to make the plots.
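
To illustrate why aggregation helps with large data, here is a minimal sketch that counts points per cell of a fixed latitude/longitude grid, so the plotting cost depends on the grid size rather than the number of rows. This is not Datashader's API; it is a standard-library stand-in for the same binning idea, and the function name `bin_points` and the default grid resolution are assumptions made for illustration.

```python
from collections import Counter


def bin_points(lats, lons, n_lat_bins=18, n_lon_bins=36):
    """Count points per cell of a regular lat/lon grid (hypothetical helper).

    Returns a Counter keyed by (row, col), where row 0 starts at latitude -90
    and col 0 starts at longitude -180.
    """
    counts = Counter()
    for lat, lon in zip(lats, lons):
        # Skip values outside valid coordinate ranges (NaN also fails this check).
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            continue
        # Scale to a bin index; clamp so lat == 90 / lon == 180 land in the last bin.
        row = min(int((lat + 90.0) / 180.0 * n_lat_bins), n_lat_bins - 1)
        col = min(int((lon + 180.0) / 360.0 * n_lon_bins), n_lon_bins - 1)
        counts[(row, col)] += 1
    return counts


counts = bin_points([0.0, 0.5, 90.0, 45.0, 100.0], [0.0, 0.5, 180.0, -90.0, 0.0])
print(sorted(counts.items()))
```

The resulting grid of counts could then be rendered as a single image, whichever plotting backend is chosen.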

Type of data

  • I think it might be easiest to handle a single column of shapely Point values. Coordinating two separate columns is unlikely to work well with the current structure of ydata-profiling.
  • This could also be handled as a list of tuples to reduce dependencies.
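
As a rough sketch of the tuple-based option, the two float columns could be combined into one column of `(latitude, longitude)` tuples before profiling. The helper name `to_points` and the choice to map missing or out-of-range values to `None` are assumptions for illustration, not existing ydata-profiling behaviour. Note that shapely's `Point` takes `(x, y)`, i.e. longitude first, so the order would need to be swapped if shapely were used instead.

```python
import math


def to_points(latitudes, longitudes):
    """Pair two coordinate columns into (lat, lon) tuples (hypothetical helper).

    Missing, NaN, or out-of-range entries become None so that the result
    keeps the same length and row alignment as the input columns.
    """
    if len(latitudes) != len(longitudes):
        raise ValueError("latitude and longitude columns must have equal length")

    points = []
    for lat, lon in zip(latitudes, longitudes):
        if lat is None or lon is None:
            points.append(None)
            continue
        lat, lon = float(lat), float(lon)
        valid = (
            not math.isnan(lat)
            and not math.isnan(lon)
            and -90.0 <= lat <= 90.0
            and -180.0 <= lon <= 180.0
        )
        points.append((lat, lon) if valid else None)
    return points


print(to_points([51.5, None, 95.0], [-0.12, 3.0, 10.0]))
```

Keeping one combined column means the profiler would only need to recognise a single variable type rather than coordinate two columns.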

Alternatives considered

  • ydata-profiling could leverage Plotly Express for graphing; however, this package may struggle with plotting larger datasets.
  • ydata-profiling could leverage Cartopy with matplotlib for graphing; I am unsure of its limitations on data size.

Additional context

No response
