Intermittent problems with log rotation #426

Open
anjackson opened this issue Aug 6, 2021 · 0 comments

anjackson commented Aug 6, 2021

In production, where log files are written to Gluster, we occasionally see problems with log file rotation that lead to checkpoint failure (as mentioned in #392).

SEVERE: org.archive.crawler.framework.CheckpointService checkpointFailed  Checkpoint failed [Fri May 28 10:28:03 GMT 2021]
java.io.IOException: Unable to move /heritrix/output/frequent-npld/20210519154706/logs/crawl.log to /heritrix/output/frequent-npld/20210519154706/logs/crawl.log.cp00032-20210528102802
        at org.archive.io.GenerationFileHandler.rotate(GenerationFileHandler.java:127)
        at org.archive.crawler.reporting.BufferedCrawlerLoggerModule.rotateLogFiles(BufferedCrawlerLoggerModule.java:331)
        at org.archive.crawler.reporting.BufferedCrawlerLoggerModule.doCheckpoint(BufferedCrawlerLoggerModule.java:393)
        at org.archive.crawler.framework.CheckpointService.requestCrawlCheckpoint(CheckpointService.java:285)
...

Before the checkpoint fails, some of the live log files are already missing, e.g. there is no crawl.log or alerts.log. Usually the checkpoint version of the file, e.g. crawl.log.cp00011-xxxxx, is still being written to instead. In rare cases, the underlying Java logger FileHandler rotation appears to have kicked in, because we found a crawl.log.1 file that was being written to.
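To catch this state before the next checkpoint attempt, something like the following could be run against the logs directory. This is only a rough sketch of the symptoms described above: the class name `LogDirCheck`, the method `check`, and the warning strings are made up for illustration and are not part of Heritrix. It cannot tell which file is actively being written to; it only flags a missing live log file and any numbered file such as crawl.log.1.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Hypothetical helper, not part of Heritrix.
final class LogDirCheck {

    /** Returns warnings for one live log name, e.g. "crawl.log". */
    static List<String> check(Path logsDir, String baseName) throws IOException {
        List<String> warnings = new ArrayList<>();

        // Symptom 1: the live log file itself is gone.
        if (!Files.exists(logsDir.resolve(baseName))) {
            warnings.add("missing: " + baseName);
        }

        // Symptom 2: a numbered file such as crawl.log.1 exists.
        // Checkpoint files (crawl.log.cpNNNNN-...) do not match this pattern.
        Pattern numbered = Pattern.compile(Pattern.quote(baseName) + "\\.\\d+");
        try (Stream<Path> files = Files.list(logsDir)) {
            files.map(p -> p.getFileName().toString())
                 .filter(n -> numbered.matcher(n).matches())
                 .sorted()
                 .forEach(n -> warnings.add("unexpected numbered rotation: " + n));
        }
        return warnings;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java LogDirCheck <logs-dir>
        Path logsDir = Paths.get(args[0]);
        for (String name : new String[] {"crawl.log", "alerts.log"}) {
            check(logsDir, name).forEach(System.out::println);
        }
    }
}
```

Running it periodically (e.g. from cron) against the job's logs directory would at least give a timestamp for when the files went missing, which might help narrow down what triggered it.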

This is presumably some kind of threading issue or race condition in GenerationFileHandler, perhaps brought on by Gluster occasionally blocking when performing file operations. Unfortunately, this is just a guess, and I don't know how to reproduce this error.
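If the failure turns out to be a transient filesystem hiccup rather than a race, one option to experiment with would be retrying the move. The sketch below is an assumption-laden illustration, not the current GenerationFileHandler code: `RetryingMove`, `moveWithRetry`, and the attempt/pause parameters are hypothetical. It uses `java.nio.file.Files.move`, which throws a specific exception (e.g. `NoSuchFileException`) instead of a bare failure, so the log would at least say why the move failed. Note that retrying would not help in the case where crawl.log is genuinely missing.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Heritrix.
final class RetryingMove {

    /**
     * Tries to move a file up to 'attempts' times, pausing between tries.
     * Rethrows the last IOException if every attempt fails.
     */
    static void moveWithRetry(Path from, Path to, int attempts, long pauseMillis)
            throws IOException, InterruptedException {
        if (attempts < 1) {
            throw new IllegalArgumentException("attempts must be >= 1");
        }
        IOException last = null;
        for (int i = 1; i <= attempts; i++) {
            try {
                // Fails with FileAlreadyExistsException if 'to' exists,
                // and NoSuchFileException if 'from' is missing.
                Files.move(from, to);
                return;
            } catch (IOException e) {
                last = e;
                if (i < attempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        throw last;
    }
}
```

A call such as `RetryingMove.moveWithRetry(src, dst, 3, 500L)` would make three attempts half a second apart. Whether that is long enough for a blocked Gluster operation to clear is unknown.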

@anjackson anjackson added the bug label Aug 6, 2021