Docker Exit Code 11 for time and size Limit #549
This is intentional, so that automated tools have a way to know that the crawl was stopped because a limit was reached. Otherwise, it wouldn't be (easily) possible to tell whether the whole crawl stayed under the limit or not.
Okay, but why doesn't --limit (that is, the pageLimit) also produce exit code 11?
It should!
I think the rationale was that the page limit is considered part of the scoping of the crawl, i.e. what the user intended to capture, whereas the size and time limits are constraints that might result in the intended scope of the crawl not being reached. In the Browsertrix Cloud frontend we show a crawl that finished with a page limit as an unqualified success, whereas crawls that finished because they hit the size/time limit are successful but displayed a little differently, so users know a limit was hit before the intended scope was reached.
The reason also has to do with whether or not a crawl can be continued: with --timeLimit and --sizeLimit, the crawl is interrupted, but there are additional page URLs on the queue, so the crawl can be restarted (if save state is enabled, etc.) if desired. With --limit / --pageLimit, the implication is, as @tw4l said, that this is part of the crawl scope: there are no additional pages to crawl and nothing to continue, so exit code 0 makes sense. I realize this is slightly confusing. In the future, we could implement a pageLimit that works more like timeLimit and sizeLimit, still keeping URLs in the queue and allowing continuation. This was not done initially mostly for memory reasons, since there are potentially millions of pages that could be queued, and there is no guarantee that a user wants to continue the crawl beyond the initial limit. Hope this helps explain the rationale! Will close this for now; will make a new issue if we decide to explore a different type of page limit in the future.
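To illustrate how an automated wrapper might use this distinction, here is a minimal sketch. The function name and the printed messages are hypothetical; only the exit codes 0 and 11 come from the discussion above:

```shell
#!/bin/sh
# Hypothetical helper: map the crawler container's exit code to a follow-up action.
# Exit code 0  -> crawl finished its intended scope (including a --limit / --pageLimit).
# Exit code 11 -> crawl was interrupted by --timeLimit / --sizeLimit; URLs may remain
#                 on the queue, so it could be restarted if save state is enabled.
handle_crawl_exit() {
  case "$1" in
    0)  echo "complete" ;;       # nothing left on the queue
    11) echo "limit-reached" ;;  # consider restarting from saved state
    *)  echo "failed" ;;         # any other code: treat as an error
  esac
}

handle_crawl_exit 0   # prints "complete"
handle_crawl_exit 11  # prints "limit-reached"
```

In practice the argument would be `$?` captured right after the `docker run` that launched the crawl.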
Thank you all for your explanations, it's now clear to me.
Browsertrix version 1.0.4

The --limit parameter leads to Docker exit code 0 (that's also my expectation),
but the parameters:
--sizeLimit
--timeLimit
lead to Docker exit code 11.
From my point of view it should also be exit code 0, since Docker ends as expected/configured.