Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shattering #84

Merged
merged 118 commits into from
Oct 16, 2024
Merged

Shattering #84

merged 118 commits into from
Oct 16, 2024

Conversation

raphaellaude
Copy link
Collaborator

@raphaellaude raphaellaude commented Sep 9, 2024

Todo

  • Add/update schema for shattering
    • DistrictrMap
    • ParentChildEdges
    • GerryDBTable: remove tiles path, which should go in DistrictrMap now
  • Update management commands
    • Compute intersection of two gerrydb tables and append to parent child table
    • Delete parent child relationships for a gerrydb view
    • Simplify command to generate tilesets
    • Create separate command to join two gerrydb view tilesets into one
    • Create materialized view of parent+child gerrydb tables
    • Command to create districtr map
    • Command to patch districtr map. Optional but would be nice not to have to use psql Let's do this in a future PR
  • Update endpoints
    • List districtr map views
    • Update create document to take districtr map uuid instead of gerrydb table name. Existing endpoint will still work
    • Update GET document endpoint to return tiles path again, joining from DistrictrMap instead of gerrydbtable, now
  • Shatter geometries endpoint
    • Should get children from parent<>child table
    • Update assignments
    • Returning children
  • Frontend
    • Context menu
    • Button in menu for shattering polygons
    • setFilter logic after hitting shatter endpoint
    • Remember shattered parents on page reload (how to do this? Decided we should do a join on parentchildedges when sending assignments on page load)
    • Get it to build!
  • Migrations
    • Create migration script to create updated districtrmap records from existing gerrydbtable records

Description

Reviewers

  • Primary:
  • Secondary:

Checklist

  • Added/Updated related documentation (if applicable).
  • Added/Updated related unit tests (if applicable).

Screenshots (if applicable):

alembic migration script
@raphaellaude raphaellaude self-assigned this Sep 9, 2024
@raphaellaude raphaellaude added enhancement New feature or request API frontend CLI For managing gerrydb views, tiles, etc. anything non-public roadmap layers Contextual map layers including the basemap labels Sep 9, 2024
add command for creating districtrmap
# We'll want to enforce the constraint tha the gerrydb_table_name is either in
# GerrydbTable.name or a materialized view of two GerryDBTables some other way.
gerrydb_table_name: str | None = Field(nullable=True)
# Null means default number of districts? Should we have a sensible default?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(O) Sensible default hard to set globally, IMO, maybe later we are more geo-aware and can default to the num of congressional districts in a state but for now I think null is fine.

backend/app/models.py Outdated Show resolved Hide resolved
backend/app/models.py Outdated Show resolved Hide resolved
intial draft of shatter parent udf
but still not swapping the parents and children in shatter functions correctly

getting empty set of children inserted
in the future may want to pass the districtrmap uuid to shatter map so that we don't need to enfore unique gerrydb table name
@raphaellaude
Copy link
Collaborator Author

@bailliekova this is ready for an intermediate review! don't have a couple last things in place but would love to get your eyes on the mostly-done backend implementation, if possible!

So much to say about this but wanted to flag a couple things:

  • I set districtrmap to have a unique gerrydb_table_name. This is probably not what we want since we might have multiple districtr maps rely on the same underlying views (e.g. congressional vs. local maps?). I think to do this we'd need to send the districtrmap UUID to the client and remap shatter_parents to take the districtrmap UUID and parent geoids as params rather than the layer name and params.
  • I need to test more edge cases. For now, the tests for the new features are basically just coverage tests. Not totally sure how to test the procedure since we'd need to hit the db again to pull the inserted records?

@bailliekova
Copy link
Collaborator

What's our thought on an un-shatterable map? Is this if the "parent" is set to block groups? or a non-nesting geometry?

Copy link
Collaborator

@bailliekova bailliekova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really good

"-lco",
"GEOMETRY_NAME=geometry",
"-nlt",
"MULTIPOLYGON",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(q) is this changing functionality or just being more explicit/constraining the output?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not changing functionality, at least in a breaking way! ogr2ogr is kind of annoying in that for e.g. GeoJSON's it'll load the geometry as wkb_geometry by defaylt, so this way we make sure the data type is correct / validated on load and the geometry column name won't come in as geometry, geom, geography (though technically geography is a different UDT in PostGIS, I learned recently)

AND a.geo_id = ANY(parent_geoids)
AND edges.districtr_map = districtr_map_uuid
) child_geoids
ON CONFLICT (document_id, geo_id) DO UPDATE SET zone = EXCLUDED.zone
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(q) is this just for safety? where do we expect a conflict?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's for safety but we shouldn't see a conflict. Do you think it's actually more safe to fail loudly in this case? I was going back and forth

INNER JOIN parentchildedges edges
ON edges.parent_path = a.geo_id
WHERE a.document_id = $1
AND a.geo_id = ANY(parent_geoids)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(o) might try scooting this into the join condition- I don't have strong intuition if it'll be a a better execution plan, definitely might be.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try running EXPLAIN ANALYZE!

backend/app/models.py Show resolved Hide resolved
@raphaellaude
Copy link
Collaborator Author

What's our thought on an un-shatterable map? Is this if the "parent" is set to block groups? or a non-nesting geometry?

I set the child_layer in DistrictrMap to nullable – figured this would be a good flag to the frontend whether the map is shatterable or not. WDYT?

@raphaellaude
Copy link
Collaborator Author

Seeing one subtle bug here where after coming back to the page the children seem to have been lost in the state.

subtle_shattering_bug.mp4

@raphaellaude raphaellaude marked this pull request as ready for review October 14, 2024 12:53
@nofurtherinformation
Copy link
Collaborator

Seeing one subtle bug here where after coming back to the page the children seem to have been lost in the state.

subtle_shattering_bug.mp4

This is related to #119 -- I have a draft PR to address this issue (and enable redux devtools) in #120

Copy link
Collaborator

@mariogiampieri mariogiampieri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great save for the radix property, which does still allow me to click and remove parent without replacing with children. overall amazing work

app/src/app/components/ContextMenu.tsx Outdated Show resolved Hide resolved
app/src/app/components/sidebar/GerryDBViewSelector.tsx Outdated Show resolved Hide resolved
@mariogiampieri
Copy link
Collaborator

Screen.Recording.2024-10-14.at.3.14.47.PM.mov

I may have spoken too soon - it appears that if you shatter an unassigned block, it is not replaced by unassigned children. We should decide whether we require assignment before splitting or not and address accordingly. My sense is that we would want users to be able to shatter and then assign

)
op.drop_constraint("districtr_map_uuid_unique", "districtrmap", type_="unique")
op.execute(
sa.text("""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, this is nice, I was worried about our migrations. I like this better than either having to create new files for new versions or do something very convoluted to get prior versions of functions.

bailliekova
bailliekova previously approved these changes Oct 15, 2024
Copy link
Collaborator

@bailliekova bailliekova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great!

ParentChildEdges.parent_path,
)
.join(Document, Assignments.document_id == Document.document_id)
.join(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(q) is this reasonably quick? a little nervous about having two joins just to retrieve assignments. I suppose this is only called on load, though, so it's not like we're doing it repeatedly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good question. I'm seeing ~250-300ms query execution time, ~500ms round trip locally. For UX, this happens concurrently with tile loading, which usually takes about the same time or slightly longer

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. I barely noticed a change if any

DistrictrMap,
Document.gerrydb_table == DistrictrMap.gerrydb_table_name,
isouter=True,
)
.limit(1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(pp) not caused by this PR, but just noticing- limit(1) is going to prevent run time errors if we do for some reason violate uniqueness constraints, but, uh, we have uniqueness constraints on the database. I think we should be consistent-- pair result.one() with a .limit(1) each time.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally but I'd prefer to actually remove the limit and handle the runtime error, allowing us to send an informative error and catch bugs earlier. Bugs ofc not caused by erroneous db state-because we have our constraints in place-but in our queries

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With that said, let's address this in a separate mini pr

@nofurtherinformation
Copy link
Collaborator

Screen.Recording.2024-10-14.at.3.14.47.PM.mov
I may have spoken too soon - it appears that if you shatter an unassigned block, it is not replaced by unassigned children. We should decide whether we require assignment before splitting or not and address accordingly. My sense is that we would want users to be able to shatter and then assign

The shattering UDF has been updated, and this should be resolved.

@raphaellaude
Copy link
Collaborator Author

The shattering UDF has been updated, and this should be resolved.

To reviewers, cc @mariogiampieri, if you're testing locally you'll need to re-run migrations or use psql to redefine the backend/app/sql/shatter_parent.sql UDF.

Here's the fix in action:

shattering_unassigned.mp4

@nofurtherinformation nofurtherinformation merged commit 1b1aa50 into main Oct 16, 2024
@nofurtherinformation nofurtherinformation deleted the shattering branch October 16, 2024 16:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API CLI For managing gerrydb views, tiles, etc. anything non-public enhancement New feature or request frontend layers Contextual map layers including the basemap roadmap
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Shattering
4 participants