-
-
Notifications
You must be signed in to change notification settings - Fork 702
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Research using CTEs for faster facet counts #1259
Comments
https://sqlite.org/lang_with.html
It looks like this optimization is completely unavailable on SQLite prior to 3.35.0 (released 12th March 2021). But I could still rewrite the faceting to work in this way, using the exact same SQL - it would just be significantly faster on 3.35.0+ (assuming it's actually faster in practice - would need to benchmark). |
I wonder if I could optimize facet suggestion in the same way? One challenge: the query time limit will apply to the full CTE query, not to the individual columns. |
If all of the facets were being calculated in a single query, I'd be willing to bump the facet time limit up to something a lot higher, maybe even a full second. There's a chance that could work amazingly well with a materialized CTE. |
CTEs were added in 2014-02-03 SQLite 3.8.3 - so I think it's OK to depend on them for Datasette. |
https://www.sqlite.org/changes.html#version_3_35_0
If a CTE creates a table that is used multiple time in that query, SQLite will now default to creating a materialized table for the duration of that query.
This could be a big performance boost when applying faceting multiple times against the same query. Consider this example query:
https://global-power-plants.datasettes.com/global-power-plants?sql=WITH+data+as+%28%0D%0A++select%0D%0A++++*%0D%0A++from%0D%0A++++%5Bglobal-power-plants%5D%0D%0A%29%2C%0D%0Acountry_long+as+%28select+%0D%0A++%27country_long%27+as+col%2C+country_long+as+value%2C+count%28*%29+as+c+from+data+group+by+country_long%0D%0A++order+by+c+desc+limit+10%0D%0A%29%2C%0D%0Aprimary_fuel+as+%28%0D%0Aselect%0D%0A++%27primary_fuel%27+as+col%2C+primary_fuel+as+value%2C+count%28*%29+as+c+from+data+group+by+primary_fuel%0D%0A++order+by+c+desc+limit+10%0D%0A%29%0D%0Aselect+*+from+primary_fuel+union+select+*+from+country_long+order+by+col%2C+c+desc
Outputs:
The text was updated successfully, but these errors were encountered: