Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent sqlalchemy Transparent SQL Compilation Caching from filling up during purge #71015

Merged
merged 17 commits into from
Apr 29, 2022

Conversation

bdraco
Copy link
Member

@bdraco bdraco commented Apr 28, 2022

Proposed change

Since each purge cycle and can result in a many different sized
unioned queries to find unused attribute ids, sqlalchemys
Transparent SQL Compilation Caching would be a bit too helpful
and cache each one which results in an extra 500MB of memory used.

Schermafbeelding 2022-04-28 om 07 52 03

Fixes the memory jump reported here and analyzed here

We now generate a single query using sqlalchemy's lambda_stmt
and fill unused values with NULLs to ensure that only one query is
cached and we only have to compile the query once.

Because of how lambda_stmt works, I had to write out the entire
statement since it relies on getting the bind values being passed
to the function as python arguments:
https://docs.sqlalchemy.org/en/14/core/connections.html#quick-guidelines-for-lambdas

If there is any way to improve this without having to write it out,
I haven't been able to find it after beating my head against the wall
for hours ( I have 12+ hours into this PR ). If anyone has a better way, please help!

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • The code has been formatted using Black (black --fast homeassistant tests)
  • Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all.
  • For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description.
  • Untested files have been added to .coveragerc.

The integration reached or maintains the following Integration Quality Scale:

  • No score or internal
  • 🥈 Silver
  • 🥇 Gold
  • 🏆 Platinum

To help with the load of incoming pull requests:

…p during purge

Since each purge cycle and can result in a different amount
unioned queries to find unused attribute ids, sqlalchemys
Transparent SQL Compilation Caching would cache each one.

We now generate a single query and fill unused values with NULLs
to ensure that only one query is cached and we only have to
compile the query once.
@probot-home-assistant
Copy link

Hey there @home-assistant/core, mind taking a look at this pull request as it has been labeled with an integration (recorder) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)

@bdraco bdraco added this to the 2022.5.0 milestone Apr 28, 2022
@bdraco
Copy link
Member Author

bdraco commented Apr 28, 2022

Well it doesn't fill up the cache but now we don't cache that statement at all which is a cpu problem

@bdraco
Copy link
Member Author

bdraco commented Apr 28, 2022

this is taking forever to test since I have to generate a database that takes long enough to purge and watch the memory, cpu, and profile during the purge

@bdraco
Copy link
Member Author

bdraco commented Apr 28, 2022

Latest push is performing well
Screen Shot 2022-04-28 at 14 28 17

@bdraco
Copy link
Member Author

bdraco commented Apr 28, 2022

Memory is now stable during purge

@@ -112,6 +114,118 @@ def _select_event_state_and_attributes_ids_to_purge(
return event_ids, state_ids, attributes_ids


def _generate_find_attr_lambda(attribute_ids: list[int]) -> StatementLambdaElement:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works and its fast, but I need to figure out how to avoid writing it all out

@balloob balloob changed the title Prevent sqlalchemy Transparent SQL Compilation Caching from filling up during pruge Prevent sqlalchemy Transparent SQL Compilation Caching from filling up during purge Apr 28, 2022
@bdraco
Copy link
Member Author

bdraco commented Apr 28, 2022

purge_cache30

This works really well and doesn't have the memory problem

@bdraco
Copy link
Member Author

bdraco commented Apr 28, 2022

I haven't been able to come up with a solution that doesn't involve writing out the whole query.

I tried generating it an using params but it was at last an order of magnitude slower

@bdraco
Copy link
Member Author

bdraco commented Apr 28, 2022

This might be one of those cases where is better to bypass the ORM and generate the query ourselves but I guess it better to have a written out query than try to deal with the complexity of all the supported DBMS ourselves


https://docs.sqlalchemy.org/en/14/core/connections.html#quick-guidelines-for-lambdas
"""
return lambda_stmt(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lambda_stmt is actually looking at the code inside the lambda to generate the cache key so it seems like we are stuck writing this out if we want a single cache key with bindparams

@bdraco bdraco marked this pull request as ready for review April 29, 2022 01:08
@bdraco bdraco requested a review from a team as a code owner April 29, 2022 01:08
return select(func.min(States.attributes_id)).where(States.attributes_id == attr)


def _generate_find_attr_lambda(
Copy link
Member Author

@bdraco bdraco Apr 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs for sqlalchemy say

Avoid referring to non-SQL constructs inside of lambdas as they are not cacheable by default
....
The best way to resolve the above situation is to not refer to foo inside of the lambda, and refer to it outside instead:

>>> def my_stmt(foo):
...     x_param, y_param = foo.x, foo.y
...     stmt = lambda_stmt(lambda: select(func.max(x_param, y_param)))
...     return stmt

So even if I pass this as a list I'm stuck unpacking them as attr1, attr2, attr3, attr4, .... = attrs outside the lambda

@bdraco
Copy link
Member Author

bdraco commented Apr 29, 2022

Memory testing looks solid. Actually saw memory going down from gc during the purge

@bdraco
Copy link
Member Author

bdraco commented Apr 29, 2022

Tested with timescale db / postgres addon and purge is still fast + memory looks good

@bdraco
Copy link
Member Author

bdraco commented Apr 29, 2022

cpu during purge also looks good

@bdraco
Copy link
Member Author

bdraco commented Apr 29, 2022

I don't love the syntax here, but I can't find a better way that doesn't involve using sqlalchemy internals which seems risky. I just don't like how the code looks even if though it functions well.

to_remove = attributes_ids - {state[0] for state in id_query.all()}
# We used to generate a query based on how many attribute_ids to find but
# that meant sqlalchemy Transparent SQL Compilation Caching was working against
# us by cached up to MAX_ROWS_TO_PURGE different statements which could be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not familiar with SQL Alchemy, but would it be possible to execute a query without it being cached ? We don't purge that often, so having slightly more expensive query generation is fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you wrote that above already, dealing with the dialects. That's fair, we can keep it as this. People will have questions when they see it 😅

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My first thought was to disable the cache.

Unfortunately the query generates so many ORM objects (also the same reason the cache gets so big), generating it every time was more expensive run-time wise than all the other python code run in a purge cycle combined.

Sqlalchemy clearly isn't optimized for this use case.

I also tried caching a query with many bindparms but the code that generates the actual query from the cache clones the cached version and visits every part to fill in the bind values so that turned out to be even more expensive.

The only way I found that resulted in a single cache key (the lambda) was how it's currently implemented.

@balloob balloob merged commit b9c7a89 into home-assistant:dev Apr 29, 2022
balloob pushed a commit that referenced this pull request Apr 29, 2022
@github-actions github-actions bot locked and limited conversation to collaborators Apr 30, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Auto_purge from database takes much longer then before 2022.4.x
3 participants