events.* tables think they're stored after resetting the cache, but Query.get_stored() disagrees #832

@jc-harrison

Describe the bug
After a call to flowmachine.core.cache.reset_cache(), Query.get_stored() will return no queries. However, the is_stored() method of the events.{calls,mds,sms} table objects will return True. As a result, these tables will appear in the set returned by queries' _get_stored_dependencies() method, and calling query.store().result() will raise an error:

psycopg2.errors.ForeignKeyViolation: insert or update on table "dependencies" violates foreign key constraint "cache_dependency_id"
DETAIL:  Key (depends_on)=(70a2d392f73aedcc251a6a504f17008a) is not present in table "cached".
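
The mismatch described above can be checked for directly by comparing identifiers. The following is a minimal sketch, not part of flowmachine: `find_orphaned_dependencies` is a hypothetical helper, and the only thing it assumes about query objects is the `md5` attribute that appears in the traceback in this report. The stand-in objects are built with `SimpleNamespace` so the example runs without a database.

```python
from types import SimpleNamespace


def find_orphaned_dependencies(stored_dependencies, stored_queries):
    """Return the dependencies whose md5 is not among the stored queries.

    Hypothetical helper: both arguments are iterables of objects exposing
    an `md5` attribute.
    """
    known_ids = {query.md5 for query in stored_queries}
    orphans = [dep for dep in stored_dependencies if dep.md5 not in known_ids]
    return sorted(orphans, key=lambda dep: dep.md5)


# Stand-ins for the three table objects reported in this issue.
deps = [
    SimpleNamespace(md5="70a2d392f73aedcc251a6a504f17008a"),  # events.calls
    SimpleNamespace(md5="a7a51b1e9c9cabb84a525ac6510ea612"),  # events.mds
    SimpleNamespace(md5="9de507e882f1fb6b0cfcedd324e27839"),  # events.sms
]

# After the reset the stored set is empty, so every dependency is orphaned.
orphans = find_orphaned_dependencies(deps, [])
print([dep.md5 for dep in orphans])
```

Against a live connection, the same helper could presumably be fed `dl_query._get_stored_dependencies()` and `flowmachine.core.Query.get_stored()`; a non-empty result would indicate the inconsistent state reported here.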

To Reproduce

>>> import flowmachine
>>> from flowmachine.features import daily_location
>>> from flowmachine.core.cache import reset_cache
>>> flowmachine.connect()
FlowMachine version: 0+unknown
Flowdb running on: localhost:9000/flowdb (connecting user: flowmachine)
<flowmachine.core.connection.Connection object at 0x11b14b080>

>>> list(flowmachine.core.Query.get_stored())
[]

>>> dl_query = daily_location(date="2016-01-03", level="admin3", method="last")

>>> list(flowmachine.core.Query.get_stored())
[<Table: 'events.calls', query_id: '057addedac04dbeb1dcbbb6b524b43f0'>,
 <Table: 'events.calls', query_id: '70a2d392f73aedcc251a6a504f17008a'>,
 <Table: 'events.calls_20160101', query_id: '41702c7c062a29932c738de26117a12f'>,
 <Table: 'events.calls_20160102', query_id: 'ec4a35b5b695aa67ec3e074949074b7c'>,
 <Table: 'events.calls_20160103', query_id: '425848a6a1113dfd9015eca6fc30f16a'>,
 <Table: 'events.calls_20160104', query_id: 'fd85443cad5fd529536dd36a88ac6a55'>,
 <Table: 'events.calls_20160105', query_id: '50b67ea6b52b9d1e58d40c0ceb2b2d59'>,
 <Table: 'events.calls_20160106', query_id: '7b70d1ade9970da2c4929e643d3ee736'>,
 <Table: 'events.calls_20160107', query_id: 'c18720a0e2bb9db77393ad928071936a'>,
 <Table: 'events.calls_20160108', query_id: '1f449d564be3716cbffedfc04a1593ef'>,
 <Table: 'events.calls_20160109', query_id: 'b3b3188e554f0e0319035f920f9400e4'>,
 <Table: 'events.calls_20160110', query_id: '80747ef52ac47669d0b6ae7011379be8'>,
 <Table: 'events.sms', query_id: '7a7f27978925c385bc44a5ec5667d7b3'>,
 <Table: 'events.sms', query_id: '9de507e882f1fb6b0cfcedd324e27839'>,
 <Table: 'events.sms_20160101', query_id: '01136f2c505733415afa233f26092403'>,
 <Table: 'events.sms_20160102', query_id: '5a52bf5e64fa3ab2c6e876eea645fdf7'>,
 <Table: 'events.sms_20160103', query_id: 'b2691ce1659275ea127bc8ebdc8207f0'>,
 <Table: 'events.sms_20160104', query_id: '27604f440068e8734f92c051e21cc740'>,
 <Table: 'events.sms_20160105', query_id: '7701809cf8e83c82eef5deb0a94cf5f5'>,
 <Table: 'events.sms_20160106', query_id: '5f1741baac4544a30052ed8c8ca3ae66'>,
 <Table: 'events.sms_20160107', query_id: '7f42676c72c12ae61995be405b1d4bed'>,
 <Table: 'events.mds', query_id: '64ba935fee023d48dad2ec1d41ccc2e0'>,
 <Table: 'events.mds', query_id: 'a7a51b1e9c9cabb84a525ac6510ea612'>,
 <Table: 'events.mds_20160101', query_id: 'c4a046663c753982886e60e213e4e986'>,
 <Table: 'events.mds_20160102', query_id: '226c07e80786031459c72df4d067658c'>,
 <Table: 'events.mds_20160103', query_id: '5ba4dbdcd00c35e121a2e18bdc0a7230'>,
 <Table: 'events.mds_20160104', query_id: 'b9477c988b924edad8da7520b55b9522'>,
 <Table: 'events.mds_20160105', query_id: '4e316cccebfad0b2fc8e5fc1081352eb'>,
 <Table: 'events.mds_20160106', query_id: '3c987b555205ff2bf994722788baaddc'>,
 <Table: 'events.mds_20160107', query_id: '88d36ee986a16b14f5f9fd8c87554ece'>]

>>> dl_query._get_stored_dependencies()
{<Table: 'events.calls', query_id: '70a2d392f73aedcc251a6a504f17008a'>,
 <Table: 'events.mds', query_id: 'a7a51b1e9c9cabb84a525ac6510ea612'>,
 <Table: 'events.sms', query_id: '9de507e882f1fb6b0cfcedd324e27839'>}

>>> reset_cache(flowmachine.core.Query.connection, flowmachine.core.Query.redis)
>>> list(flowmachine.core.Query.get_stored())
[]
>>> dl_query._get_stored_dependencies()
{<Table: 'events.calls', query_id: '70a2d392f73aedcc251a6a504f17008a'>,
 <Table: 'events.mds', query_id: 'a7a51b1e9c9cabb84a525ac6510ea612'>,
 <Table: 'events.sms', query_id: '9de507e882f1fb6b0cfcedd324e27839'>}

So Query.get_stored() doesn't include the events tables any more, but dl_query._get_stored_dependencies() still does. If we now try to store the daily location query, we get an error:

>>> dl_query.store().result()
Traceback (most recent call last):
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1244, in _execute_context
    cursor, statement, parameters, context
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 552, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.ForeignKeyViolation: insert or update on table "dependencies" violates foreign key constraint "cache_dependency_id"
DETAIL:  Key (depends_on)=(70a2d392f73aedcc251a6a504f17008a) is not present in table "cached".


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jamesharrison/.pyenv/versions/3.7.0/lib/python3.7/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/Users/jamesharrison/.pyenv/versions/3.7.0/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/Users/jamesharrison/.pyenv/versions/3.7.0/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/flowmachine/core/cache.py", line 113, in write_query_to_cache
    write_cache_metadata(connection, query, compute_time=plan_time)
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/flowmachine/core/cache.py", line 193, in write_cache_metadata
    (query.md5, dep.md5),
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 2166, in execute
    return connection.execute(statement, *multiparams, **params)
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 982, in execute
    return self._execute_text(object_, multiparams, params)
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1155, in _execute_text
    parameters,
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
    e, statement, parameters, cursor, context
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1466, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 383, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 128, in reraise
    raise value.with_traceback(tb)
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1244, in _execute_context
    cursor, statement, parameters, context
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 552, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (psycopg2.errors.ForeignKeyViolation) insert or update on table "dependencies" violates foreign key constraint "cache_dependency_id"
DETAIL:  Key (depends_on)=(70a2d392f73aedcc251a6a504f17008a) is not present in table "cached".

[SQL: INSERT INTO cache.dependencies values (%s, %s) ON CONFLICT DO NOTHING]
[parameters: ('ea5f29df58ec0411a640cec967840913', '70a2d392f73aedcc251a6a504f17008a')]
(Background on this error at: http://sqlalche.me/e/gkpj)
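
The mechanism behind this error can be reproduced in isolation. The sketch below is a toy model using the standard-library `sqlite3` module, not flowdb's actual schema: the table names `cached` and `dependencies` and the column `depends_on` come from the error message, while the `query_id` column name and the `record_dependency` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE cached (query_id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE dependencies ("
    "query_id TEXT, "
    "depends_on TEXT REFERENCES cached (query_id))"
)

table_id = "70a2d392f73aedcc251a6a504f17008a"  # events.calls in this report
conn.execute("INSERT INTO cached VALUES (?)", (table_id,))

# Toy equivalent of resetting the cache: the metadata row disappears,
# but an in-memory object could still believe it is stored.
conn.execute("DELETE FROM cached")


def record_dependency(conn, query_id, depends_on):
    """Try to record a dependency; return False on a constraint violation."""
    try:
        conn.execute(
            "INSERT INTO dependencies VALUES (?, ?)", (query_id, depends_on)
        )
        return True
    except sqlite3.IntegrityError:
        return False


ok = record_dependency(conn, "ea5f29df58ec0411a640cec967840913", table_id)
print(ok)
```

The insert is rejected because `depends_on` points at a row that no longer exists in `cached`, which mirrors the `ForeignKeyViolation` above.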

Expected behavior
Presumably we don't want to remove the events.{calls,mds,sms} tables when resetting the cache, so cache.cached should still know about these tables.
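
One way to get this behaviour would be for the reset to skip the rows that describe table objects. The following is a sketch under stated assumptions, again using an in-memory `sqlite3` toy model rather than flowdb: the `class_name` column and the `reset_cache_keeping_tables` function are hypothetical and are not taken from flowmachine's implementation.

```python
import sqlite3


def reset_cache_keeping_tables(conn):
    """Delete every cache record except those describing Table objects.

    Hypothetical: assumes each row records the class of the cached object.
    """
    conn.execute("DELETE FROM cached WHERE class_name != 'Table'")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cached (query_id TEXT PRIMARY KEY, class_name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO cached VALUES (?, ?)",
    [
        ("70a2d392f73aedcc251a6a504f17008a", "Table"),  # events.calls
        ("ea5f29df58ec0411a640cec967840913", "daily_location"),
    ],
)

reset_cache_keeping_tables(conn)
remaining = [row[0] for row in conn.execute("SELECT query_id FROM cached")]
print(remaining)
```

With the table rows preserved, a later insert into `dependencies` that references `events.calls` would still find its target in `cached`.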

Labels: FlowMachine, bug
