Optimize homepage queries #1772
Well, never mind. It does not work on Heroku on the production database. It does not use the index when I add it. 😠
Does anyone have an idea what is going on?
Even such a simple query run under EXPLAIN ANALYZE:

```sql
select distinct on (tipper, tippee) *
from tips order by tipper, tippee, mtime desc;
```

has the following plan on Heroku
while it has the following plan on my local import of the gittip database
The only difference I was able to find is that the Heroku version is 9.1.9 and my local version is 9.1.11.
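For reference, an index capable of supporting that `DISTINCT ON`/`ORDER BY` shape would have to cover all three sort keys. A sketch of what such an index might look like (the definition is my guess; only the name `tips_all` appears later in the thread's plan output):

```sql
-- An index whose order matches (tipper, tippee, mtime DESC) can,
-- in principle, satisfy the query above without a separate sort.
CREATE INDEX tips_all ON tips (tipper, tippee, mtime DESC);
```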
Maybe start a ticket w/ Heroku?
What would it take to move off Heroku? Seems like a heck of a lot of pain, much more than running Gittip on a VPS.
Done.
I don't know, but @clone1018 wants to discuss it during the retreat. We might add it to the agenda. I get the feeling that the Heroku abstraction is in a somewhat awkward place: too high-level, so you don't really know what is going on, but too low-level to let you just forget about it.
Heroku Postgres lead here. I was going to respond on the support ticket, but might as well respond right here with some guidance. Here's the query in question:

```sql
select distinct on (tipper, tippee) *
from tips order by tipper, tippee, mtime desc;
```

The way indexes work in Postgres: the executor traverses the index and finds all relevant entries, then grabs the locations of the matching pages in the heap (on disk) and fetches them. Because this query is not very selective (in fact, it must fetch all tips), which would you expect to be faster: a sequential scan over the table on disk, or going to an index, finding that all rows match, and then doing random I/O to fetch them from disk? Sequential scans are much faster than random I/O.

On Heroku Postgres, we tune your database according to the underlying hardware it runs on. Most relevant here, we set appropriate costs for going to disk for sequential scans based on the underlying block devices. The planner uses this information during the planning phase to pick the plan it believes is fastest. For this query, a seq scan is faster than an index scan, and I believe it's the correct tradeoff given that most of the data must be retrieved. If an index is used on your local Postgres, it means you haven't given Postgres enough tuning information to determine that a seq scan is the better choice for this query.

Having a selective predicate would probably change things significantly. If the query retrieved only the tips for a specific tipper or tippee, an index would probably be used, provided the stats are up to date and the given tipper or tippee doesn't account for a large portion of the records in the tips table: it would have to be a tipper/tippee with relatively few tips, to make paying the cost of random I/O worth it to the planner.

Bottom line: don't stress because some query uses a seq scan. Let Postgres determine the best execution path for the query; it often gets it right. Further, if this query is on your home page, it is probably well cached by now, so you should see results in the ms range. Hope that helps.
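The cost settings described above are ordinary Postgres planner GUCs. A minimal sketch of inspecting and adjusting them while experimenting locally (the values shown are illustrative, not Heroku's actual numbers):

```sql
-- Inspect the planner's current I/O cost assumptions.
SHOW seq_page_cost;     -- stock default: 1.0
SHOW random_page_cost;  -- stock default: 4.0

-- On fast storage (or a mostly-cached database), lowering the
-- random I/O penalty makes index scans look more attractive:
SET random_page_cost = 1.1;  -- illustrative value

-- You can also disable a plan type while experimenting
-- (for testing only, never in production):
SET enable_seqscan = off;

EXPLAIN ANALYZE
SELECT DISTINCT ON (tipper, tippee) *
FROM tips ORDER BY tipper, tippee, mtime DESC;
```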
@hgmnz Thanks for the response. Did you take a look at the EXPLAIN output above? My dev notebook is about the same speed as the Heroku instance, and the difference with and without the index is in the range of 30ms to 600ms. That is when everything is cached; cold start is terrible anyway, so optimizing for it does not make sense IMO (the whole db is around 180MB, so eventually everything is in memory and the disk is just a nonvolatile cache).

How can I set up my local Postgres server so that it behaves like the Heroku instance? When designing the db schema, queries, and indexes I need to have the baseline locally. A query run time of 30ms is feasible, but 600ms is not. If Heroku is not going to use the index, I cannot have that type of query and I need to change the schema. I am not particularly worried about this exact query but about the principle that I cannot design the schema locally and have it work in production.

We are running 9.1 right now, but I'll be more interested in 9.3 since we will be upgrading soon (#1158). What do I need to do to have it behave the same as Heroku Postgres? I have a default setup on Ubuntu (haven't tweaked anything so far). Thanks.
I'll re-post what I posted in a support ticket here. A simpler test case is as follows:

```
=> explain (analyze, verbose, costs, buffers) ...

QUERY PLAN
Sort  (cost=1206.29..1227.42 rows=42265 width=42) (actual time=724.940..769.345 rows=42279 loops=1)
  ...
Time: 929.389 ms

QUERY PLAN
Index Scan using tips_all on public.tips  (cost=0.00..1531.09 rows=42265 width=42) (actual time=0.022..57.194 rows=42279 loops=1)
  ...
Time: 214.430 ms
```

So the seq scan hits 430 buffers, to the index scan's 28463 buffers. Even though the planner evidently made the wrong decision, you can certainly see why it did so. And even if this was the wrong plan today, it's hard to imagine that it'll continue to be the wrong plan as the table grows.

I'm inclined to think that the problem here is that the planner underestimates the cost of the sort. It fails to appreciate that it'll frequently have to do a full three-way comparison (i.e. a tie-breaker will often be needed, so comparisons across all three columns during sorting), because of the strong tipper/tippee correlation, which is poorly modeled by the planner's cost model generally.

The fact that a different version does better isn't necessarily significant: this version thinks that the index-scan-based plan is only a bit better than the seq-scan-based one, so the difference that makes the planner make the "right choice" is likely to be essentially insignificant. The seq scan is way cheaper than the index scan here, and the seq scan plan's slowness is entirely down to the sort.

You might try CLUSTERing here, to see if the planner sees the physical/logical correlation and estimates the costs as lower, if indeed you really always will want the rows in this order. Example:
N.B.: This process takes heavy locks, so be careful about when you do this. Hope that helps.
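The CLUSTER example referenced above was lost from this page; a sketch of what it presumably looked like, assuming the `tips_all` index from the plan output (commands are standard Postgres, but the exact invocation here is my reconstruction):

```sql
-- Physically rewrite the tips table in the order of the index.
-- Takes an ACCESS EXCLUSIVE lock on the table for the duration.
CLUSTER tips USING tips_all;

-- Refresh planner statistics so it can see the new
-- physical/logical correlation.
ANALYZE tips;
```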
Thanks. I will try that. Any idea what to do to my local Postgres so that it behaves like Heroku's? What settings should I change?
If you want your local Postgres to behave exactly, deterministically consistent with a Heroku Postgres instance, that's not really possible - it isn't even something I'd fully expect from, say, 2 Crane instances. If you changed the settings to match exactly, you might come close (Postgres doesn't introspect about resource usage to make better choices in determining plans), but there still is a random aspect, because the statistical sampling that drives planning is seeded with a random value.
Ok, I'd like to come close 😄. I believe the planner in Postgres is deterministic, so given the same data and same configuration it should come up with really similar plans (most of the time). The trouble is that I do not know all the knobs that can be turned. So can you help me put together the params I should keep in sync for the planner to work like that? Is this what I should be looking at, or is there something else? http://www.postgresql.org/docs/9.1/static/runtime-config-query.html#RUNTIME-CONFIG-QUERY-CONSTANTS Can I rely on
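The page linked above lists the planner cost constants. A minimal `postgresql.conf` fragment covering them might look like this (the values shown are the stock defaults, for illustration; Heroku's actual values would differ):

```
# Planner cost constants (postgresql.conf) -- stock defaults shown
seq_page_cost = 1.0            # cost of a sequentially fetched disk page
random_page_cost = 4.0         # cost of a non-sequentially fetched page
cpu_tuple_cost = 0.01          # cost of processing one row
cpu_index_tuple_cost = 0.005   # cost of processing one index entry
cpu_operator_cost = 0.0025     # cost of one operator/function call
effective_cache_size = 128MB   # planner hint only; allocates nothing
```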
Thanks @petergeoghegan for the more in-depth analysis of this query and what's going on here!

@zwn as Peter points out, it's not really possible to get that close, but applying the settings from your production database may come close. I want to point out that this is NOT RECOMMENDED EVER - I have brought this point up with many people in the community and want to make sure that's clear here as well. Use development for feature dev and testing; use a production-like environment for tuning.

What I would recommend is that if you must do query tuning and you'd like to try things out outside of your production environment, just create a database fork of your database, do whatever, and then destroy it. Creating and destroying forks is cheap (as in easy to do, and also as in we only charge for the number of seconds it is up), and this is precisely one important use case. Cheers!
Obsolete with #2544. |
While working on #1549 I might have come up with some optimizations. With the following view
and the following index
I can rewrite the query to get the top receivers
which has the advantage of running about 10x faster than the original: the original query takes over 1s on my laptop, while the new one takes 100ms (500ms without the index). Only one seq scan, on the participants table, is left in the query, and the external on-disk merge is gone. I just might be somewhat closer to understanding how Postgres works 😉
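The view and index themselves were lost from this page; a hypothetical reconstruction of the general shape being described (all names and column choices here are my guesses, not the actual Gittip schema objects):

```sql
-- Hypothetical sketch: latest tip per (tipper, tippee) pair.
CREATE VIEW current_tips AS
    SELECT DISTINCT ON (tipper, tippee) *
    FROM tips
    ORDER BY tipper, tippee, mtime DESC;

-- An index matching the DISTINCT ON / ORDER BY shape lets the
-- planner produce the view's rows without a separate sort step.
CREATE INDEX tips_by_pair_mtime ON tips (tipper, tippee, mtime DESC);

-- Top receivers, computed from the view.
SELECT tippee, sum(amount) AS receiving
FROM current_tips
GROUP BY tippee
ORDER BY receiving DESC
LIMIT 10;
```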
If I get too held up with #1549 I might do this instead/before.