Shattering #13

Closed
raphaellaude opened this issue Aug 1, 2024 · 5 comments · Fixed by #84
Labels: API, enhancement (New feature or request), frontend

Comments

@raphaellaude
Collaborator

No description provided.

@mariogiampieri
Collaborator

mariogiampieri commented Sep 3, 2024

Reflecting on this and the paint by county functionality- I think it would be worth testing precomputing vtd and county membership of blocks on ingestion / tile creation so that we can paint by county and shatter based on IDs instead of using turf in the browser.

If we were to use turf to shatter, I imagine we would:

  • get VTD geom / id based on user selection
  • use queryRenderedFeatures and the like a la current getFeaturesIntersectingCounties to use turf to select features that are members
  • set an expression to show those features

If we used an ID, I'd imagine we'd:

  • get VTD geom / id based on user selection
  • use queryRenderedFeatures, loop through features to see if vtdId matches the search feature
  • set an expression to show those features

Thoughts? Is this silly? It's a bit less elegant maybe, but I'd rather do spatial operations once upon ingestion and then look up on attributes instead of doing repeated spatial lookups. I imagine that if this is performant we could also test it on PaintByCounty (although the actual paint part does seem snappy!)
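
A minimal sketch of the ID-based variant, assuming every block feature carries a precomputed parent ID in its properties. The property name `vtd_id` and the feature shape are assumptions for illustration, not the project's actual schema; in the browser the input would come from `queryRenderedFeatures`.

```python
from collections import defaultdict


def index_by_parent(features, parent_key="vtd_id"):
    """Group feature IDs by the parent ID precomputed at ingestion.

    `parent_key` is a hypothetical property name.
    """
    index = defaultdict(list)
    for feature in features:
        parent_id = feature["properties"].get(parent_key)
        if parent_id is not None:
            index[parent_id].append(feature["id"])
    return dict(index)


# Hypothetical block features, shaped like GeoJSON features without geometry.
blocks = [
    {"id": "b1", "properties": {"vtd_id": "vtd-A"}},
    {"id": "b2", "properties": {"vtd_id": "vtd-A"}},
    {"id": "b3", "properties": {"vtd_id": "vtd-B"}},
]

children_of = index_by_parent(blocks)
selected = children_of.get("vtd-A", [])  # children of the VTD the user picked
```

Membership becomes a dictionary lookup on an attribute, so no geometry is touched after ingestion.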

@raphaellaude
Collaborator Author

Reflecting on this and the paint by county functionality- I think it would be worth testing precomputing vtd and county membership of blocks on ingestion

I think we should 100% compute block -> parent memberships via the CLI and do no heavy geometry operations like calculating intersections in the browser (i.e. not shatter in browser).

I'm 50/50 on adding county membership to tiles. TLDR / train-of-thought thoughts:

We should weigh the benefits of a simpler implementation (+1 for adding the county FIPS attribute to tiles) against a more flexible† one (+1 for computing in the browser) and perf. Tile rendering is already getting a tad slow on the initial load, esp. for block layers, though you could easily imagine that a foolproof implementation with turf in the browser would get much hairier and less performant, not to mention bundle size concerns.

Re: implementation if we add county FIPS to tilesets:

If we used an ID, I'd imagine we'd:

  • get VTD geom / id based on user selection
  • use queryRenderedFeatures, loop through features to see if vtdId matches the search feature
  • set an expression to show those features

A much simpler implementation is to do what Districtr v1 does and use querySourceFeatures using a mapbox expression to query features matching the county FIPS code.
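
For illustration, the filter itself is plain JSON, so it can be sketched in Python as a list that serializes to the expression. The property name `county_fips` is an assumption, not necessarily what the tiles carry; on the frontend the result would be passed as the `filter` option of `querySourceFeatures`.

```python
import json


def county_filter(fips, prop="county_fips"):
    """Expression matching features whose `prop` equals one county FIPS code."""
    return ["==", ["get", prop], fips]


def any_of_filter(values, prop):
    """Expression matching features whose `prop` is any of `values`."""
    return ["in", ["get", prop], ["literal", sorted(values)]]


# Serialized form, as it would appear in the frontend call.
expression_json = json.dumps(county_filter("08031"))
```

The same two helpers would work for any precomputed membership attribute, not just counties.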

† More flexible because the function is actually intersection layer agnostic. We could paint by Places, VTDs (e.g. if we had a block layer), arbitrary geo etc. using this implementation.

@raphaellaude
Collaborator Author

raphaellaude commented Sep 5, 2024

Draft proposal for how to do this (and will be my first attempt at this feature). Lmk what you think @mariogiampieri @bailliekova

Generating the GerryDB view table and tiles

In general I think it makes sense for shatterable maps to tightly couple the child layer to the parent layer. As such, I'd recommend the GerryDB view in the DB be a union of the child and parent layer with an identical schema.

Advantages:

  • Matching parent and child schemas are enforced
  • A single tileset can contain both child and parent layers. Fewer sources to manage.
  • Stricter, less likely to generate undefined behavior
  • More flexible from DB's perspective. If we pull a GerryDB view, that won't have cascading impacts
  • Backwards compatible / can still support unshatterable maps easily

Disadvantages:

  • User may not know whether their map is shatterable. Will need to figure out how to communicate this to the FE (e.g. based on the number of layers in the block source? I'd rather not pass a boolean flag which is a code smell). In practice, this could be mitigated by supporting shattering for nearly all maps and only flagging that a map isn't shatterable.
  • Data duplication
  • Less flexible in the sense that arbitrary maps can't be shattered
  • Can't support different summary metrics for child vs parent layers (if that's something we could even want)
  • Will require rethinking how we handle GerryDB table names–which currently need to match the source layer name. This is probably something that's good to do anyway because it's a weird hidden thing.

Implementation:

  1. Add --child option to load GerryDB CLI command (or separate endpoint?)
  2. In a single transaction ideally, load each view, using force replace for first layer and append for child layer. If loading fails, abort transaction
  3. Calculate child-parent relations w/ PostGIS on the loaded layers. Assert that all children have one and only one parent. Assert all parents have at least one child. Abort if the asserts are not met, tearing down the loaded views.
  4. Insert record to GerryDB metadata table
  5. Insert two records to GerryDB layer metadata table (new table 📑)
  6. Insert child parent relations (which children are contained in which parent) (new table 📑)
  7. Generate tilesets
  8. Join tilesets
  9. Upload tilesets to R2
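
The asserts in step 3 could look roughly like the sketch below. It assumes the containment query has already produced `(parent_id, child_id)` pairs; the function name and the input shapes are hypothetical.

```python
def validate_parent_child(parent_ids, child_ids, edges):
    """Check the two invariants from step 3 and return children per parent.

    edges: iterable of (parent_id, child_id) pairs, e.g. rows returned by a
    spatial containment query. Raises ValueError if an invariant is violated.
    """
    parents_of = {child: set() for child in child_ids}
    children_of = {parent: set() for parent in parent_ids}
    for parent_id, child_id in edges:
        parents_of.setdefault(child_id, set()).add(parent_id)
        children_of.setdefault(parent_id, set()).add(child_id)

    errors = []
    for child, parents in parents_of.items():
        if len(parents) != 1:
            errors.append(f"child {child} has {len(parents)} parents")
    for parent, children in children_of.items():
        if not children:
            errors.append(f"parent {parent} has no children")
    if errors:
        # The caller would roll back the transaction / tear down loaded views.
        raise ValueError("; ".join(errors))
    return {parent: sorted(children) for parent, children in children_of.items()}


relations = validate_parent_child(
    parent_ids=["vtd-A", "vtd-B"],
    child_ids=["b1", "b2", "b3"],
    edges=[("vtd-A", "b1"), ("vtd-A", "b2"), ("vtd-B", "b3")],
)
```

Collecting all violations before raising makes a failed load easier to debug than stopping at the first bad row.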

Shattering a parent

Note

Un-shattering is a pretty different feature and not covered here,
though the logic from the data prep stage will be reusable.

On document change, add layers:

  1. Add parent layer without filters
  2. Add child layer filtering out all records

User splits parent: (from map context menu?)

  1. Hit {host}/api/document/{document-id}/shatter/{parent-geoid}
  2. Update document (one query/transaction)
    1. Pull parent's children from DB
    2. Update document, removing parent and adding children with the same zone assignment
    3. Return children
  3. Mutation onSuccess
    1. async setFeatureState of child layer to have returned zone
    2. async update filters on layers, hiding shattered parent and showing children
    3. Update map store
  4. Metrics should update automatically and theoretically shouldn't change if the child pops sum up to parent pops
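
A rough in-memory sketch of step 2 and of the filter update in step 3. Everything here is an assumption for illustration: assignments are modeled as a `geo_id -> zone` dict rather than DB rows, and `path` stands in for whatever ID property the tiles expose.

```python
def shatter_parent(assignments, parent_id, children_of):
    """Replace a parent's assignment with its children, keeping the same zone.

    Returns the updated assignments and the children to send back to the client.
    """
    children = children_of.get(parent_id)
    if not children:
        raise ValueError(f"{parent_id} has no known children")
    zone = assignments.get(parent_id)  # None means the parent was unassigned
    updated = {geo_id: z for geo_id, z in assignments.items() if geo_id != parent_id}
    updated.update({child: zone for child in children})
    return updated, [{"geo_id": child, "zone": zone} for child in children]


def layer_filters(shattered_parents, visible_children, id_prop="path"):
    """Filters that hide shattered parents and show only their children."""
    parent_filter = ["!", ["in", ["get", id_prop], ["literal", sorted(shattered_parents)]]]
    child_filter = ["in", ["get", id_prop], ["literal", sorted(visible_children)]]
    return parent_filter, child_filter


assignments = {"vtd-A": 1, "vtd-B": 2}
children_of = {"vtd-A": ["b1", "b2"]}

updated, returned = shatter_parent(assignments, "vtd-A", children_of)
parent_filter, child_filter = layer_filters({"vtd-A"}, [c["geo_id"] for c in returned])
```

With no shattered parents, the child filter has an empty literal list and matches nothing, which lines up with adding the child layer with all records filtered out.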

@raphaellaude raphaellaude self-assigned this Sep 5, 2024
@raphaellaude
Collaborator Author

raphaellaude commented Sep 6, 2024

Some adjustments to the above based on a discussion with @bailliekova:

  • Loading
    • Let's load each GerryDB separately but generate views in Postgres of the unioned child and parent layers.
    • This should allow for more modularity with tables / less data duplication / be easier to understand.
    • This view will have a constrained schema consisting of the shared columns between parent and child tables.
  • Re: step 5 of the proposal above ("Insert two records to GerryDB layer metadata table (new table 📑)"):
  • Instead of this proposed GerryDB layer metadata model, let's create a new model DistrictrMap (better name?).
    • This model will sit between the GerryDB views and the FE and can contain other important info like the map name, description, GerryDB view name (whether single layer or unioned views view), parent tiles layer name, child tiles layer name, state, number of districts supported by a map, etc. i.e. stuff that doesn't belong to a GerryDB view but is 1-1 with the map a user is drawing.
    • An advantage of this system is it becomes much easier to create and update the constituent parts of a DistrictrMap in isolation and assemble the map attributes later (or update them). Right now, by contrast, it is possible for the GerryDB table to be missing a tile path. This incomplete state can lead to issues.
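
One possible shape for such a model, as a sketch only. The field names are guesses based on the attributes listed above, not the actual schema. Deriving shatterability from the presence of a child layer would also avoid passing a separate boolean flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DistrictrMap:
    """Hypothetical record sitting between GerryDB views and the frontend."""

    name: str
    gerrydb_view_name: str
    parent_layer_name: str
    child_layer_name: Optional[str] = None  # None for unshatterable maps
    tiles_path: Optional[str] = None  # may be filled in after tiles are built
    description: str = ""
    state: Optional[str] = None
    num_districts: Optional[int] = None

    @property
    def is_shatterable(self) -> bool:
        return self.child_layer_name is not None

    def missing_parts(self):
        """Names of fields still unset, so an incomplete map is not served."""
        required = ("tiles_path", "state", "num_districts")
        return [f for f in required if getattr(self, f) is None]


draft = DistrictrMap(
    name="Example VTD map",
    gerrydb_view_name="example_vtd_block_view",
    parent_layer_name="example_vtd",
    child_layer_name="example_block",
)
```

A record like `draft` can be created before the tiles exist and completed later, which is the assemble-in-isolation advantage described above.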

@raphaellaude
Collaborator Author

Since we don't expect the UNION'ed tables to be changing at all, we can materialize them and create indices over path.
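
As a sketch of what that could generate, assuming hypothetical table and view names and that `path` is a column shared by both tables. `UNION ALL` is used on the assumption that parent and child rows never duplicate each other.

```python
def shared_columns(parent_cols, child_cols):
    """Columns present in both tables, in the parent's column order."""
    child = set(child_cols)
    return [col for col in parent_cols if col in child]


def materialized_union_sql(view, parent_table, child_table, columns, index_col="path"):
    """Build SQL for a materialized view over both layers plus an index.

    Identifiers are interpolated directly, so this is only suitable for
    trusted, internally generated names.
    """
    if index_col not in columns:
        raise ValueError(f"{index_col} is not a shared column")
    cols = ", ".join(f'"{col}"' for col in columns)
    return (
        f'CREATE MATERIALIZED VIEW "{view}" AS\n'
        f'SELECT {cols} FROM "{parent_table}"\n'
        f"UNION ALL\n"
        f'SELECT {cols} FROM "{child_table}";\n'
        f'CREATE INDEX ON "{view}" ("{index_col}");'
    )


columns = shared_columns(
    ["path", "geometry", "total_pop", "vtd_name"],
    ["path", "geometry", "total_pop"],
)
sql = materialized_union_sql("example_union", "example_vtd", "example_block", columns)
```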
