Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce excessive hstore queries #142

Closed
nvkelso opened this issue Jan 4, 2017 · 7 comments
Closed

Reduce excessive hstore queries #142

nvkelso opened this issue Jan 4, 2017 · 7 comments
Assignees

Comments

@nvkelso
Copy link
Member

nvkelso commented Jan 4, 2017

When profiling tile generation with New Relic we found that we're making 2 hstore related requests when we should just be making 1. Each DB request slows down tile generation, less would be better!

  1. is there a simple config change to cache the first requester?
  2. If not, can we remove this altogether?

Complication? Right now we have two different hstore queries related to OpenStreetMap and Who's On First (those probably need to be standardized).

@iandees
Copy link
Member

iandees commented Jan 13, 2017

There are two requests coming from the psycopg2.extras package's HstoreAdapter for every connection we make. psycopg2 needs to compute these OIDs for each connection because every database it connects to could have a different OID for the hstore type. The OID will be different depending on when the hstore extension was added to the database. There are a couple options we could use to remove/reduce these requests:

  1. Compute the OID once and cache it. We can pass these pre-computed OIDs into the oid and array_oid parameter when calling register_hstore().
  2. Combine this with more efficient use of a psycopg2 connection pool so that fewer new connections are made, resulting in fewer register_hstore() (and register_json()) calls.

I'd tend towards option 2 because it's less hacky and would also reduce the postgres connection overhead (and solve #141 at the same time).

@rmarianski
Copy link
Member

I'd tend towards option 2 because it's less hacky and would also reduce the postgres connection overhead (and solve #141 at the same time).

👍

Complication? Right now we have two different hstore queries related to OpenStreetMap and Who's On First (those probably need to be standardized).

I misspoke earlier when I mentioned this. The way we transport the hstore columns is the same for these tables (via %#tags AS tags), but it's different for the mz_properties that we encode as json. We should probably have the driver handle both of them consistently.

On a related note, we should ensure that these both either return back utf-8 encoded strings or unicode. I think going through the driver decodes to unicode, otherwise we get them back as utf-8. At some point the downstream code assumed utf-8 (iirc some of the formatters?) so the code ensures that they are utf-8 now for mz_properties, but if we make changes here we'll need to be consistent. I'd actually prefer that we decoded to unicode asap in the pipeline, ie right when we got the results from sql, but that might be a bigger change than we'd like to make at the moment in terms of ripple effects downstream.

@iandees
Copy link
Member

iandees commented Jan 13, 2017

Yea, I was surprised to see that the default for register_hstore() on Python 2 is to return str objects instead of unicode in the resulting dicts. I bet converting the hstore to json first results in unicode objects, so if we change that flow we'll need to keep an eye on that.

@nvkelso
Copy link
Member Author

nvkelso commented Jan 13, 2017

Related: tilezen/vector-datasource#1130, which links to tilezen/vector-datasource#1129.

@iandees iandees self-assigned this Jan 17, 2017
@iandees
Copy link
Member

iandees commented Jan 17, 2017

Looking through the connection pool stuff with Rob, it looks like we can call register_json and register_hstore setting the globally argument to True. Then every connection that happens with psycopg2 will have those types registered. Also, we probably don't need to call register_hstore because we only interact with the json type.

@nvkelso
Copy link
Member Author

nvkelso commented Jan 24, 2017

@iandees Has this been completed via #145?

@iandees
Copy link
Member

iandees commented Jan 24, 2017

Sure, we can close this. New Relic isn't giving me the stats that would confirm this, but our changes in 145 will help for sure.

@iandees iandees closed this as completed Jan 24, 2017
@iandees iandees removed the ready label Jan 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants