Handling of missing and stray log files in Bluefors Agent #175

Open
BrianJKoopman opened this issue Jun 3, 2021 · 0 comments
Labels
agent: bluefors bug Something isn't working
Milestone

Comments

@BrianJKoopman
Member

While this somewhat relates to ongoing development in #174, the underlying issues should still be fixed, either in that PR or a subsequent one.

I'll start with this log segment:

2021-06-03T05:20:00+0000 /logs/21-06-03/maxigauge 21-06-03.sync-conflict-20210603-011948-I2BQPMU.log not yet open, opening...
2021-06-03T05:20:00+0000 /logs/21-06-03/Flowmeter 21-06-03.sync-conflict-20210603-011958-I2BQPMU.log not yet open, opening...
2021-06-03T05:20:00+0000 Not publishing stale data. Make sure your log file sync is done at a rate faster than once ever 2 minutes.
2021-06-03T05:20:00+0000 Not publishing stale data. Make sure your log file sync is done at a rate faster than once ever 2 minutes.
2021-06-03T05:20:00+0000 Not publishing stale data. Make sure your log file sync is done at a rate faster than once ever 2 minutes.
...
...many more of these...
...
2021-06-03T05:20:03+0000 Not publishing stale data. Make sure your log file sync is done at a rate faster than once ever 2 minutes.
2021-06-03T05:20:03+0000 Not publishing stale data. Make sure your log file sync is done at a rate faster than once ever 2 minutes.
2021-06-03T09:00:59+0000 acq:0 Crash in thread: [Failure instance: Traceback: <class 'FileNotFoundError'>: [Errno 2] No such file or directory: '/logs/21-06-03/Flowmeter 21-06-03.log'
/usr/lib/python3.8/threading.py:932:_bootstrap_inner
/usr/lib/python3.8/threading.py:870:run
/usr/local/lib/python3.8/dist-packages/twisted/_threads/_threadworker.py:47:work
/usr/local/lib/python3.8/dist-packages/twisted/_threads/_team.py:181:doWork
--- <exception caught here> ---
/usr/local/lib/python3.8/dist-packages/twisted/python/threadpool.py:238:inContext
/usr/local/lib/python3.8/dist-packages/twisted/python/threadpool.py:254:<lambda>
/usr/local/lib/python3.8/dist-packages/twisted/python/context.py:118:callWithContext
/usr/local/lib/python3.8/dist-packages/twisted/python/context.py:83:callWithContext
bluefors_log_tracker.py:488:start_acq
bluefors_log_tracker.py:367:read_and_publish_logs
]
2021-06-03T09:00:59+0000 acq:0 Status is now "done".

There are two problems here:

  1. The Agent matches files that we don't want to open (the sync-conflict files, which are produced by the syncing software used in the Cornell setup).
  2. We believe the network file share briefly disappears at some point, so the file we want to open goes missing. The monitoring process crashes with a FileNotFoundError when this happens.

The solutions would be to restrict the filename matching so that it excludes the sync-conflict files, and to catch and handle this exception properly, probably by logging a warning or error and trying again on the next loop.
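
A rough sketch of both ideas follows. It is not the Agent's actual code: `is_wanted_log`, `read_lines_or_skip`, and the `DATED_LOG` pattern are hypothetical names, and the pattern assumes that every wanted file name ends in a `YY-MM-DD` date followed directly by `.log`, which is only inferred from the file names in the log segment above.

```python
import logging
import re
from pathlib import Path

log = logging.getLogger(__name__)

# Assumption: wanted files end in "<YY-MM-DD>.log", e.g. "Flowmeter 21-06-03.log".
# Sync-conflict copies put extra text between the date and ".log", so they fail.
DATED_LOG = re.compile(r".*\d{2}-\d{2}-\d{2}\.log")


def is_wanted_log(path):
    """Return True only for file names that end in a date followed by .log."""
    return DATED_LOG.fullmatch(Path(path).name) is not None


def read_lines_or_skip(path):
    """Read all lines from path, or warn and return [] if the file is missing.

    Returning an empty list lets the caller carry on and retry on its next
    loop instead of crashing the whole acquisition process.
    """
    try:
        with open(path) as f:
            return f.read().splitlines()
    except FileNotFoundError:
        log.warning("%s is missing, will try again on the next loop", path)
        return []
```

With a filter like this, `maxigauge 21-06-03.sync-conflict-20210603-011948-I2BQPMU.log` would be skipped while `Flowmeter 21-06-03.log` would still be opened. If the network share can fail in other ways, catching the broader `OSError` may be worth considering, though the traceback above only shows `FileNotFoundError`.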

@BrianJKoopman BrianJKoopman added the bug Something isn't working label Jun 3, 2021
@BrianJKoopman BrianJKoopman self-assigned this Jun 3, 2021
@BrianJKoopman BrianJKoopman added this to the v0.2.1 milestone Jun 3, 2021
@BrianJKoopman BrianJKoopman removed their assignment Aug 16, 2024