Pysmurf monitor Operational Error #712

@jlashner

Pysmurf monitor run method failed with the following error:

2024-08-03T20:17:24+0000 run:0 CRASH: [Failure instance: Traceback: <class 'sqlalchemy.exc.OperationalError'>: (sqlite3.OperationalError) database is locked
[SQL: INSERT INTO supersync_v0 (local_path, local_md5sum, archive_name, remote_path, timestamp, remote_md5sum, copied, removed, failed_copy_attempts, deletable, "ignore") VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[parameters: ('/data/smurf_data/20240803/ufm_mv19/1722715985/plots/1722716193_asd_summary.png', 'ea6adb9d579f6b56dc995ffe6865f4a8', 'smurf', '17227/ufm_mv19/1722715985_uxm_relock/plots/1722716193_asd_summary.png', 1722716239.4683363, None, None, None, 0, 1, 0)]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
/usr/lib/python3.8/threading.py:932:_bootstrap_inner
/usr/lib/python3.8/threading.py:870:run
/usr/local/lib/python3.8/dist-packages/twisted/_threads/_threadworker.py:49:work
/usr/local/lib/python3.8/dist-packages/twisted/_threads/_team.py:192:doWork
--- <exception caught here> ---
/usr/local/lib/python3.8/dist-packages/twisted/python/threadpool.py:269:inContext
/usr/local/lib/python3.8/dist-packages/twisted/python/threadpool.py:285:<lambda>
/usr/local/lib/python3.8/dist-packages/twisted/python/context.py:117:callWithContext
/usr/local/lib/python3.8/dist-packages/twisted/python/context.py:82:callWithContext
/usr/local/lib/python3.8/dist-packages/ocs/ocs_agent.py:984:_running_wrapper
/usr/local/lib/python3.8/dist-packages/socs/agents/pysmurf_monitor/agent.py:210:run
/usr/lib/python3.8/contextlib.py:120:__exit__
/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py:1173:_maker_context_manager
/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/util.py:237:__exit__
/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/langhelpers.py:70:__exit__
/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/compat.py:207:raise_
/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/util.py:233:__exit__
/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py:829:commit
/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py:808:_prepare_impl
/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py:3367:flush
/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py:3507:_flush
/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/langhelpers.py:70:__exit__
/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/compat.py:207:raise_
/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py:3467:_flush
/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/unitofwork.py:456:execute
/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/unitofwork.py:630:execute
/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/persistence.py:245:save_obj
/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/persistence.py:1238:_emit_insert_statements
/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py:1631:_execute_20
/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/elements.py:325:_execute_on_connection
/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py:1498:_execute_clauseelement
/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py:1862:_execute_context
/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py:2043:_handle_dbapi_exception
/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/compat.py:207:raise_
/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py:1819:_execute_context
/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/default.py:732:do_execute
]
2024-08-03T20:17:24+0000 run:0 Status is now "done".

We should make the run method robust to this error: on failure it should retry with a new db connection, and it should not lose the information for the file whose insert failed.
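
A minimal sketch of that retry behaviour, not the agent's actual code. It uses the standard-library `sqlite3` module rather than the SQLAlchemy session seen in the traceback, and the function name `insert_with_retry`, the table `files`, and its two columns are all hypothetical. The idea is to open a fresh connection on every attempt and to hand any unwritten rows back to the caller instead of dropping them.

```python
import sqlite3
import time


def insert_with_retry(db_path, rows, max_attempts=5, delay=0.5,
                      connect=sqlite3.connect):
    """Insert rows, retrying on a fresh connection if the database is locked.

    Returns the rows that could NOT be written (an empty list on success),
    so the caller can keep them and try again later.
    The table name and columns here are placeholders for illustration.
    """
    pending = list(rows)
    for attempt in range(1, max_attempts + 1):
        conn = None
        try:
            # New connection per attempt; `timeout` makes SQLite wait
            # for a lock to clear before raising.
            conn = connect(db_path, timeout=10)
            with conn:  # commits on success, rolls back on error
                conn.executemany(
                    "INSERT INTO files (local_path, md5sum) VALUES (?, ?)",
                    pending,
                )
            return []
        except sqlite3.OperationalError as exc:
            # Only retry the lock error; anything else is a real failure.
            if "database is locked" not in str(exc):
                raise
            if attempt < max_attempts:
                time.sleep(delay * attempt)  # simple linear backoff
        finally:
            if conn is not None:
                conn.close()
    # Every attempt hit a lock: return the rows so nothing is lost.
    return pending
```

In the agent itself the equivalent would presumably be catching `sqlalchemy.exc.OperationalError` around the session block in `run`, keeping the failed file's record queued, and opening a new session on the next pass.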


Labels: bug
