Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dealing with DB disconnections #6

Open
rcoup opened this issue Jan 12, 2018 · 2 comments
Open

Dealing with DB disconnections #6

rcoup opened this issue Jan 12, 2018 · 2 comments

Comments

@rcoup
Copy link

rcoup commented Jan 12, 2018

We use this backed onto a Multi-AZ PostgreSQL on RDS. When the DB fails over during updates/etc, errbot dies.

(psycopg2.OperationalError) SSL connection has been closed unexpectedly
[SQL: 'SELECT core.key AS core_key, core.value AS core_value \nFROM core \nWHERE core.key = %(key_1)s'] [parameters: {'key_1': 'bl_plugins'}] (Background on this error at: http://sqlalche.me/e/e3q8)

Which is ok, there's one error when the PG instance is switched. But it never reconnects, and every command produces output like:

(sqlalchemy.exc.InvalidRequestError) Can't reconnect until invalid transaction is rolled back [SQL: 'SELECT x.key AS x_key, x.value AS x_value \nFROM x \nWHERE x.key = %(key_1)s'] [parameters: [{}]]

As far as I can tell, every method in SQLStorage that access the DB should be wrapped in a session transaction (not just the methods that write to the DB), as discussed in the SQLAlchemy docs. No idea why @session_scope isn't part of SQLAlchemy, but implementing that would work I think?

Might also be worth adding pool_size=1, pool_recycle=300 or something to the default engine too as a bit of added resiliency.

@oz-linden
Copy link
Contributor

Has anyone looked into either fixing this or some workaround that prevents it?
It's hitting my bot pretty hard.

oz-linden added a commit to lindenlab/err-storage-sql that referenced this issue May 11, 2018
    * Wrap all database operations (including reads) in an SQLAlchemy transaction
      per http://docs.sqlalchemy.org/en/latest/orm/session_basics.html#when-do-i-construct-a-session-when-do-i-commit-it-and-when-do-i-close-it
    * Add configuration parameters to set the pool_recycle and pool_pre_ping engine parameters
      defaults are to recycle in 30 minutes and to always ping
@oz-linden
Copy link
Contributor

I've got an experimental patch for this at
https://github.com/lindenlab/err-storage-sql/tree/oz_6_mysql_disconn

I've incorporated it into my bot; it runs in AWS and was hitting this about once a week, so it'll be a couple of weeks before I can be sure that the fix is good, but of course comments in the mean time are most welcome.

gbin pushed a commit that referenced this issue May 18, 2018
* Wrap all database operations (including reads) in an SQLAlchemy transaction
      per http://docs.sqlalchemy.org/en/latest/orm/session_basics.html#when-do-i-construct-a-session-when-do-i-commit-it-and-when-do-i-close-it
    * Add configuration parameters to set the pool_recycle and pool_pre_ping engine parameters
      defaults are to recycle in 30 minutes and to always ping
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants