-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shattering #84
Shattering #84
Conversation
update deps
alembic migration script
add command for creating districtrmap
# We'll want to enforce the constraint tha the gerrydb_table_name is either in | ||
# GerrydbTable.name or a materialized view of two GerryDBTables some other way. | ||
gerrydb_table_name: str | None = Field(nullable=True) | ||
# Null means default number of districts? Should we have a sensible default? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(O) Sensible default hard to set globally, IMO, maybe later we are more geo-aware and can default to the num of congressional districts in a state but for now I think null is fine.
move helpers functions into app
intial draft of shatter parent udf
but still not swapping the parents and children in shatter functions correctly getting empty set of children inserted
in the future may want to pass the districtrmap uuid to shatter map so that we don't need to enfore unique gerrydb table name
@bailliekova this is ready for an intermediate review! don't have a couple last things in place but would love to get your eyes on the mostly-done backend implementation, if possible! So much to say about this but wanted to flag a couple things:
|
What's our thought on an un-shatterable map? Is this if the "parent" is set to block groups? or a non-nesting geometry? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking really good
"-lco", | ||
"GEOMETRY_NAME=geometry", | ||
"-nlt", | ||
"MULTIPOLYGON", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(q) is this changing functionality or just being more explicit/constraining the output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not changing functionality, at least in a breaking way! ogr2ogr
is kind of annoying in that for e.g. GeoJSON's it'll load the geometry as wkb_geometry
by defaylt, so this way we make sure the data type is correct / validated on load and the geometry column name won't come in as geometry
, geom
, geography
(though technically geography is a different UDT in PostGIS, I learned recently)
AND a.geo_id = ANY(parent_geoids) | ||
AND edges.districtr_map = districtr_map_uuid | ||
) child_geoids | ||
ON CONFLICT (document_id, geo_id) DO UPDATE SET zone = EXCLUDED.zone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(q) is this just for safety? where do we expect a conflict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's for safety but we shouldn't see a conflict. Do you think it's actually more safe to fail loudly in this case? I was going back and forth
backend/app/sql/shatter_parent.sql
Outdated
INNER JOIN parentchildedges edges | ||
ON edges.parent_path = a.geo_id | ||
WHERE a.document_id = $1 | ||
AND a.geo_id = ANY(parent_geoids) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(o) might try scooting this into the join condition- I don't have strong intuition if it'll be a a better execution plan, definitely might be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try running EXPLAIN ANALYZE
!
I set the |
Seeing one subtle bug here where after coming back to the page the children seem to have been lost in the state. subtle_shattering_bug.mp4 |
TS and eslint not updated to standard methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great save for the radix property, which does still allow me to click and remove parent without replacing with children. overall amazing work
Screen.Recording.2024-10-14.at.3.14.47.PM.movI may have spoken too soon - it appears that if you shatter an unassigned block, it is not replaced by unassigned children. We should decide whether we require assignment before splitting or not and address accordingly. My sense is that we would want users to be able to shatter and then assign |
) | ||
op.drop_constraint("districtr_map_uuid_unique", "districtrmap", type_="unique") | ||
op.execute( | ||
sa.text(""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, this is nice, I was worried about our migrations. I like this better than either having to create new files for new versions or do something very convoluted to get prior versions of functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great!
ParentChildEdges.parent_path, | ||
) | ||
.join(Document, Assignments.document_id == Document.document_id) | ||
.join( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(q) is this reasonably quick? a little nervous about having two joins just to retrieve assignments. I suppose this is only called on load, though, so it's not like we're doing it repeatedly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good question. I'm seeing ~250-300ms query execution time, ~500ms round trip locally. For UX, this happens concurrently with tile loading, which usually takes about the same time or slightly longer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. I barely noticed a change if any
DistrictrMap, | ||
Document.gerrydb_table == DistrictrMap.gerrydb_table_name, | ||
isouter=True, | ||
) | ||
.limit(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(pp) not caused by this PR, but just noticing- limit(1) is going to prevent run time errors if we do for some reason violate uniqueness constraints, but, uh, we have uniqueness constraints on the database. I think we should be consistent-- pair result.one() with a .limit(1) each time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally but I'd prefer to actually remove the limit and handle the runtime error, allowing us to send an informative error and catch bugs earlier. Bugs ofc not caused by erroneous db state-because we have our constraints in place-but in our queries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With that said, let's address this in a separate mini pr
The shattering UDF has been updated, and this should be resolved. |
To reviewers, cc @mariogiampieri, if you're testing locally you'll need to re-run migrations or use Here's the fix in action: shattering_unassigned.mp4 |
Todo
DistrictrMap
ParentChildEdges
GerryDBTable
: remove tiles path, which should go inDistrictrMap
nowCommand to patch districtr map. Optional but would be nice not to have to useLet's do this in a future PRpsql
Update create document to take districtr map uuid instead of gerrydb table name.Existing endpoint will still worksetFilter
logic after hitting shatter endpointparentchildedges
when sending assignments on page load)districtrmap
records from existinggerrydbtable
recordsDescription
Reviewers
Checklist
Screenshots (if applicable):