Optimize homepage queries #1772

Closed

zbynekwinkler opened this issue Dec 15, 2013 · 14 comments

@zbynekwinkler
Contributor

While working on #1549 I might have come up with some optimizations. With the following view

    CREATE VIEW current_tips AS
        SELECT * FROM (
                  SELECT DISTINCT ON (tipper, tippee) *
                    FROM tips
                ORDER BY tipper, tippee, mtime DESC
            ) AS foo
        WHERE amount > 0;

and the following index

    create index tips_all on tips (tipper, tippee, mtime DESC);
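
The DISTINCT ON in the view keeps, for each (tipper, tippee) pair, only the first row in the ORDER BY order, i.e. the tip with the newest mtime; pairs whose latest tip is 0 are then dropped by the WHERE. A toy illustration with made-up inline data:

    SELECT DISTINCT ON (tipper, tippee) *
      FROM (VALUES ('alice', 'bob', '2013-12-01'::timestamptz, 1.00),
                   ('alice', 'bob', '2013-12-08'::timestamptz, 2.00)
           ) AS t (tipper, tippee, mtime, amount)
     ORDER BY tipper, tippee, mtime DESC;
    -- returns only the ('alice', 'bob', 2013-12-08, 2.00) row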

With these in place, I can rewrite the query that gets the top receivers:

    SELECT tippee AS username, sum(amount) AS amount
      FROM current_tips
      JOIN participants p ON p.username = tipper
     WHERE p.last_bill_result = ''
       AND p.is_suspicious IS NOT true
       AND p.claimed_time IS NOT null
       AND EXISTS (SELECT * FROM elsewhere WHERE participant = tippee AND is_locked = false)
     GROUP BY tippee
     ORDER BY amount DESC;

This has the advantage of running about 10x faster than the original: the original query takes over 1s on my laptop while the new one takes about 100ms (500ms without the index). There is only one seq scan on the participants table left in the plan, and the external on-disk merge is gone. I just might be getting somewhat closer to understanding how postgres works 😉

If I get too held up with #1549 I might do this instead/before.

@zbynekwinkler
Contributor Author

Well, never mind. It does not work on Heroku against the production database: it does not use the index when I add it. 😠

@zbynekwinkler
Contributor Author

Has anyone got an idea what is going on? \d tips gives me

"tips_all" btree (tipper, tippee, mtime DESC)

and such a simple query

    explain analyze
    select distinct on (tipper, tippee) *
    from tips order by tipper, tippee, mtime desc;

has the following plan on heroku

                                                      QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
 Unique  (cost=1203.69..1266.89 rows=4512 width=42) (actual time=523.286..641.211 rows=13576 loops=1)
   ->  Sort  (cost=1203.69..1224.76 rows=42134 width=42) (actual time=523.281..580.208 rows=42134 loops=1)
         Sort Key: tipper, tippee, mtime
         Sort Method: external merge  Disk: 2368kB
         ->  Seq Scan on tips  (cost=0.00..556.40 rows=42134 width=42) (actual time=0.010..41.916 rows=42134 loops=1)
 Total runtime: 654.354 ms
(6 rows)

while it has the following plan on my local import of the gittip database

                                                             QUERY PLAN                                                             
------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=0.00..3476.73 rows=4544 width=42) (actual time=0.024..31.049 rows=13539 loops=1)
   ->  Index Scan using tips_all on tips  (cost=0.00..3266.59 rows=42028 width=42) (actual time=0.022..19.467 rows=42028 loops=1)
 Total runtime: 31.798 ms
(3 rows)

@zbynekwinkler
Contributor Author

The only difference I was able to find is that the Heroku version is 9.1.9 while my local version is 9.1.11.

@chadwhitacre
Contributor

Maybe start a ticket w/ Heroku?

@tshepang
Contributor

What would it take to move off Heroku? Seems like a heck of a lot of pain, much more than running Gittip on a VPS.

@zbynekwinkler
Contributor Author

> Maybe start a ticket w/ Heroku?

Done.

> What would it take to move off Heroku? Seems like a heck of a lot of pain, much more than running Gittip on a VPS.

I don't know, but @clone1018 wants to discuss it during the retreat. We might add it to the agenda. I get the feeling that the Heroku abstraction sits at a somewhat awkward level: too high to really know what is going on, but too low to be able to just forget about it.

@hgmnz

hgmnz commented Dec 18, 2013

Heroku Postgres lead here. I was going to respond on the support ticket, but might as well respond right here with some guidance.

Here's the query in question:

    select distinct on (tipper, tippee) *
    from tips order by tipper, tippee, mtime desc;

The way indexes work in Postgres: the executor traverses the index to find the relevant entries; from there it grabs the locations of the matching pages in the heap (on disk) and fetches them.

Because this query is not very selective (in fact, it must fetch all tips), which would you expect to be faster: doing a sequential scan on disk to retrieve the rows, or going to an index, finding that all rows match, and then doing random I/O to fetch them from disk? Seq scans are much faster than random I/O.

On Heroku Postgres, we tune your database according to the underlying hardware it runs on. Most relevant here, we set appropriate values for the relative cost of sequential versus random disk access based on the underlying block devices. The planner uses this information during the planning phase to decide which plan it believes is faster. For this query a seq scan is faster than an index scan, and I believe that's the correct tradeoff given that most of the data must be retrieved.

If an index is used on your local Postgres, it means that you haven't tuned Postgres with enough information for it to determine that a seq scan is the better choice for this query.
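
The main knobs involved are the standard planner cost constants; a quick way to inspect them (shown here with the stock Postgres defaults, not Heroku's tuned values):

    SHOW seq_page_cost;         -- planner cost of a sequentially fetched page (default 1.0)
    SHOW random_page_cost;      -- planner cost of a randomly fetched page (default 4.0)
    SHOW effective_cache_size;  -- planner's assumption about the cache available to a query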

Having a selective predicate would probably change things significantly. If the query retrieved only the tips for a specific tipper or tippee, an index would probably be used, provided the stats are up to date and the given tipper or tippee doesn't account for a large portion of the tips table: it would have to be a tipper/tippee with relatively few tips to make paying the cost of random I/O worth it to the planner.
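
For instance, a sketch of such a selective variant (the tipper value is a made-up placeholder) that the planner could satisfy with the tips_all index:

    explain analyze
    select distinct on (tippee) *
      from tips
     where tipper = 'some_tipper'   -- hypothetical username
     order by tippee, mtime desc;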

Bottom line: don't stress because some query is using a seq scan. Let Postgres determine the best execution path for the query; it often gets it right. Further, if this query is on your home page, it is probably well cached by now, so you should see results in the ms range.

Hope that helps.

@zbynekwinkler
Contributor Author

@hgmnz Thanks for the response. Did you take a look at the EXPLAINs above? My dev notebook is about the same speed as the Heroku instance, and the difference with and without the index is in the range of 30ms to 600ms. That is when everything is cached. Cold start is terrible anyway, so optimizing for it does not make sense IMO (the whole db is around 180MB, so eventually everything ends up in memory and the disk is just a nonvolatile cache).

How can I set up my local postgres server so that it behaves like the Heroku instance? When designing the db schema, queries and indexes I need to have the baseline locally. A query run time of 30ms is feasible but 600ms is not. If Heroku is not going to use the index, I cannot have that type of query and I need to change the schema. I am not particularly worried about this exact query but about the principle that I cannot design the schema locally and have it work in production.

We are running 9.1 right now, but I'll be more interested in 9.3 since we will be upgrading soon (#1158). What do I need to do to make it behave the same as Heroku Postgres? I have a default setup on Ubuntu (I haven't tweaked anything so far). Thanks.

@petergeoghegan

I'll re-post what I posted in a support ticket here:

A simpler test-case is as follows:

    => explain (analyze, verbose, costs, buffers)
       select * from tips order by tipper, tippee, mtime desc;

                                                        QUERY PLAN
    ----------------------------------------------------------------------------------------------------------------------
     Sort  (cost=1206.29..1227.42 rows=42265 width=42) (actual time=724.940..769.345 rows=42279 loops=1)
       Output: id, ctime, mtime, tipper, tippee, amount
       Sort Key: tips.tipper, tips.tippee, tips.mtime
       Sort Method: quicksort  Memory: 5898kB
       Buffers: shared hit=430
       ->  Seq Scan on public.tips  (cost=0.00..556.79 rows=42265 width=42) (actual time=0.010..68.256 rows=42279 loops=1)
             Output: id, ctime, mtime, tipper, tippee, amount
             Buffers: shared hit=430
     Total runtime: 810.654 ms
    (9 rows)

    Time: 929.389 ms
    => set enable_seqscan = 'off';
    SET
    Time: 117.917 ms
    => explain (analyze, verbose, costs, buffers)
       select * from tips order by tipper, tippee, mtime desc;

                                                        QUERY PLAN
    ----------------------------------------------------------------------------------------------------------------------
     Index Scan using tips_all on public.tips  (cost=0.00..1531.09 rows=42265 width=42) (actual time=0.022..57.194 rows=42279 loops=1)
       Output: id, ctime, mtime, tipper, tippee, amount
       Buffers: shared hit=28463
     Total runtime: 95.364 ms
    (4 rows)

    Time: 214.430 ms

So the seqscan hits 430 buffers, to the index scan's 28463 buffers. Even though the planner evidently made the wrong decision, you can certainly see why it did so. Even if this was the wrong plan today, it's hard to imagine that it'll continue to be the wrong plan as the table grows.
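
For scale (assuming the default 8kB Postgres page size, which the thread doesn't state): 430 buffers is roughly 3.4MB, a single pass over the table, while 28463 buffers is roughly 222MB of page accesses against that same small table. In other words, the index scan revisits heap pages many times because the index order doesn't match the physical row order, which is also why the CLUSTER suggestion below is relevant.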

I'm inclined to think that the problem here is that the planner underestimates the cost of the sort. It fails to appreciate that the sort will frequently have to do a full three-way comparison (i.e. a tie-breaker on the second and third columns is often needed) because of the strong tipper:tippee correlation, which is poorly modeled by the planner's cost model generally. The fact that a different version does better isn't necessarily significant: this version thinks that the index-scan-based plan is only a bit better than the seqscan-based one, so the difference that makes the planner make the "right choice" is likely to be essentially insignificant. The seqscan is way cheaper than the index scan here; the seqscan plan's slowness is entirely down to the sort.

You might try CLUSTERing here, to see if the planner picks up on the physical/logical correlation and estimates the index scan's cost as lower, if indeed you really will always want the tips in this order. Example:

    CLUSTER VERBOSE tips USING tips_all;
    ANALYZE tips;

N.B.: This process takes heavy locks, so be careful about when you do this.

Hope that helps.

@zbynekwinkler
Contributor Author

Thanks, I will try that. Any idea what to do to my local postgres so that it behaves like Heroku's? Which settings should I change?

@petergeoghegan

If you want your local Postgres to behave exactly, deterministically consistently with a Heroku Postgres instance, that's not really possible; it isn't even something I'd fully expect from, say, two Crane instances. If you changed the settings to match exactly you might come close (Postgres doesn't introspect resource usage to make better planning choices), but there is still a random aspect, because the statistical sampling that drives planning is seeded with a random value.

@zbynekwinkler
Contributor Author

Ok, I'd like to come close 😄. I believe the planner in postgres is deterministic, so given the same data and the same configuration it should come up with very similar plans (most of the time). The trouble is that I do not know all the knobs that can be turned. So can you help me put together the params I should keep in sync for the planner to work like that? Is this what I should be looking at, or is there something else?

http://www.postgresql.org/docs/9.1/static/runtime-config-query.html#RUNTIME-CONFIG-QUERY-CONSTANTS

Can I rely on show all on the Heroku Postgres instance to tell me what I need, or is there some other catch?

@hgmnz

hgmnz commented Dec 19, 2013

Thanks @petergeoghegan for the more in-depth analysis of this query and what's going on here!

@zwn as Peter points out, it's not really possible to get that close, but applying the settings from pg_settings (or show all) will give you the parameters used by the planner, which you can replicate locally. However, you'd also need the same data as in production, and furthermore the statistics gathered by Postgres won't be the same across instances, so keep that in mind.
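
A sketch of pulling just the planner-related parameters from pg_settings (the list of names is illustrative, not exhaustive):

    SELECT name, setting, unit
      FROM pg_settings
     WHERE name IN ('seq_page_cost', 'random_page_cost',
                    'cpu_tuple_cost', 'cpu_index_tuple_cost',
                    'cpu_operator_cost', 'effective_cache_size',
                    'work_mem', 'default_statistics_target');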

I want to point out that this is NOT RECOMMENDED EVER; I have brought this point up with many people in the community and want to make sure that's clear here as well. Use development for feature dev and testing; use a production-like environment for tuning.

What I would recommend, if you must do query tuning and want to try things out outside of your production environment, is to just create a database fork of your database, do whatever you need, and then destroy it. Creating and destroying forks is cheap (as in easy to do, and also as in we only charge for the number of seconds it is up), and this is precisely one important use case.

Cheers!

@chadwhitacre
Contributor

Obsolete with #2544.
