system: improve site responsiveness when PostgreSQL service is down #2119

Open
@tiborsimko

COD3 should behave gracefully when the PostgreSQL service is down, regardless of our aggressive caching.

Here are some findings:

web_1            |   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1465, in _handle_dbapi_exception_noconnection
web_1            |     exc_info
web_1            |   File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
web_1            |     reraise(type(exception), exception, tb=exc_tb, cause=cause)
web_1            |   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 2147, in _wrap_pool_connect
web_1            |     return fn()
web_1            |   File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 387, in connect
web_1            |     return _ConnectionFairy._checkout(self)
web_1            |   File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 766, in _checkout
web_1            |     fairy = _ConnectionRecord.checkout(pool)
web_1            |   File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 516, in checkout
web_1            |     rec = pool._do_get()
web_1            |   File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 1138, in _do_get
web_1            |     self._dec_overflow()
web_1            |   File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__
web_1            |     compat.reraise(exc_type, exc_value, exc_tb)
web_1            |   File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 1135, in _do_get
web_1            |     return self._create_connection()
web_1            |   File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 333, in _create_connection
web_1            |     return _ConnectionRecord(self)
web_1            |   File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 461, in __init__
web_1            |     self.__connect(first_connect_check=True)
web_1            |   File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 651, in __connect
nginx_1          | 2017/12/13 16:47:19 [error] 7#7: *53 upstream prematurely closed connection while reading response header from upstream, client: 172.19.0.1, server: _, request: "GET /record/1 HTTP/1.1", upstream: "http://172.19.0.6:5000/record/1", host: "0.0.0.0"
nginx_1          | 172.19.0.1 - - [13/Dec/2017:16:47:19 +0000] "GET /record/1 HTTP/1.1" 502 166 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:57.0) Gecko/20100101 Firefox/57.0" "-"
web_1            |     connection = pool._invoke_creator(self)
web_1            |   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/strategies.py", line 105, in connect
web_1            |     return dialect.connect(*cargs, **cparams)
web_1            |   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 393, in connect
web_1            |     return self.dbapi.connect(*cargs, **cparams)
web_1            |   File "/usr/lib64/python2.7/site-packages/psycopg2/__init__.py", line 130, in connect
web_1            |     conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
web_1            | sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) could not translate host name "postgresql" to address: Name or service not known

The behaviour is good overall, but we could do better by returning a user-friendly message explaining that records cannot currently be viewed and that we are working on it.
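One possible direction, as a rough sketch only: catch the database connection error at the WSGI boundary and answer with a 503 page instead of letting the request die with a 502. Everything named below (`DatabaseUnavailable`, `FriendlyDatabaseErrors`, `FRIENDLY_BODY`, the 60-second `Retry-After`) is hypothetical and not part of the COD3 code base. In the real application the caught class would presumably be `sqlalchemy.exc.OperationalError`, and registering a Flask error handler for it may well be the more natural place for this.

```python
import sys


class DatabaseUnavailable(Exception):
    """Hypothetical stand-in for sqlalchemy.exc.OperationalError."""


# Hypothetical wording; the real message would come from a site template.
FRIENDLY_BODY = (
    b"<h1>Records are temporarily unavailable</h1>"
    b"<p>We are working on it. Please try again shortly.</p>"
)


class FriendlyDatabaseErrors(object):
    """WSGI middleware: turn database connection failures into a 503 page."""

    def __init__(self, app, db_errors=(DatabaseUnavailable,)):
        self.app = app
        self.db_errors = db_errors  # tuple of exception classes to intercept

    def __call__(self, environ, start_response):
        try:
            # Consume the response here so that errors raised while the
            # wrapped app generates its body are caught as well.
            return list(self.app(environ, start_response))
        except self.db_errors:
            headers = [
                ("Content-Type", "text/html; charset=utf-8"),
                ("Content-Length", str(len(FRIENDLY_BODY))),
                ("Retry-After", "60"),  # assumed value, tune as needed
            ]
            # Passing exc_info lets the server replace headers that were
            # set but not yet sent, as described in PEP 3333.
            start_response("503 Service Unavailable", headers, sys.exc_info())
            return [FRIENDLY_BODY]
```

Wrapping would look like `application = FriendlyDatabaseErrors(application, db_errors=(OperationalError,))`. This only helps if the exception actually propagates out of the wrapped application; if the framework already converts it into a generic 500 response, the handler has to live inside the framework instead.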

How to test:

$ firefox http://0.0.0.0/
$ docker-compose stop postgresql
$ docker exec opendatacernch_nginx_1 find /var/cache/nginx/cache_cod_global -type f -delete
$ firefox http://0.0.0.0/
