
Docker Exit Code 11 for time and size Limit #549

Closed
gitreich opened this issue Apr 18, 2024 · 6 comments

@gitreich (Contributor)

Browsertrix version 1.0.4

The --limit parameter leads to Docker exit code 0 (that's also my expectation), but the parameters:
--sizeLimit
--timeLimit
lead to Docker exit code 11.

From my point of view it should also be exit code 0, as Docker ends as expected/configured.

@rgaudin (Contributor) commented Apr 18, 2024

This is intentional so that automated tools have a way to know that the crawl was stopped because of a reached limit. Otherwise, it wouldn't be (easily) possible to know whether the whole thing was under the limit or not.
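
For example, an automated tool can branch on the exit code after the container exits. A minimal sketch, assuming Docker and the webrecorder/browsertrix-crawler image are available; the URL and limit value are illustrative:

```python
# Minimal sketch: run a crawl with a time limit and branch on the exit code.
# Assumes Docker and the webrecorder/browsertrix-crawler image; the URL
# and limit value below are illustrative.
import os
import subprocess

result = subprocess.run([
    "docker", "run",
    "-v", f"{os.getcwd()}/crawls:/crawls",
    "webrecorder/browsertrix-crawler", "crawl",
    "--url", "https://example.com/",
    "--timeLimit", "3600",  # stop the crawl after one hour
])

if result.returncode == 0:
    print("crawl completed within its configured scope")
elif result.returncode == 11:
    print("crawl was stopped by a time/size limit and could be resumed")
else:
    print(f"crawl failed with exit code {result.returncode}")
```

The two success-like outcomes (0 and 11) can then drive different downstream actions, e.g. publishing the archive vs. scheduling a resume.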

@gitreich (Contributor, Author)

Okay, but why doesn't --limit (that is, the page limit) also lead to exit code 11?

@rgaudin (Contributor) commented Apr 18, 2024

> Okay, but why doesn't --limit (that is, the page limit) also lead to exit code 11?

It should!

@tw4l (Member) commented Apr 18, 2024

I think the rationale was that the page limit is considered part of the scoping of the crawl, i.e. what the user intended to capture, whereas the size and time limits are constraints that might result in the intended scope of the crawl not being reached.

In the Browsertrix Cloud frontend we show a crawl that finished with a page limit as an unqualified success, whereas crawls that finished because they hit the size/time limit are successful but displayed a little differently so users know a limit was hit before the intended scope was reached.

@ikreymer (Member)

The reason also has to do with whether or not a crawl can be continued: with --timeLimit and --sizeLimit, the crawl is interrupted while there are still additional page URLs on the queue, so the crawl can be restarted (if save state is enabled, etc.) if desired.
The exit code 11 allows Kubernetes (and other tools) to restart the crawl, if desired (see the sketch below).
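
A minimal sketch of that restart behavior, assuming the same kind of Docker invocation as above and that crawl state persists in the mounted volume between runs; the exact resume mechanics (save state, --config) are left out:

```python
# Sketch of a supervisor that restarts a crawl stopped by a limit.
# Assumes crawl state persists in the mounted volume between runs; the
# exact resume mechanics (save state, --config) are elided here, and
# whether a restarted run should keep the same limit is a policy choice.
import os
import subprocess

LIMIT_REACHED = 11   # crawl stopped by --timeLimit / --sizeLimit
MAX_RESTARTS = 3     # illustrative cap on automatic restarts

docker_cmd = [
    "docker", "run",
    "-v", f"{os.getcwd()}/crawls:/crawls",
    "webrecorder/browsertrix-crawler", "crawl",
    "--url", "https://example.com/",
    "--timeLimit", "3600",
]

for attempt in range(MAX_RESTARTS + 1):
    result = subprocess.run(docker_cmd)
    if result.returncode != LIMIT_REACHED:
        break  # finished (0) or failed for some other reason
```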

With --limit / --pageLimit, the implication is, as @tw4l said, that this is part of the crawl scope: there are no additional pages to crawl and nothing to continue, so exit code 0 makes sense.

I realize this is slightly confusing - in the future, we could implement a pageLimit that works more like timeLimit and sizeLimit, still keeping URLs in the queue and allowing continuation. This was not done initially mostly for memory reasons, since there are potentially millions of pages that could be queued, and there is no guarantee that a user wants to continue the crawl beyond the initial limit. Hope this helps explain the rationale!

Will close this for now, will make a new issue if we decide to explore a different type of page limit in the future.

@github-project-automation github-project-automation bot moved this from Triage to Done! in Webrecorder Projects Apr 18, 2024
@gitreich (Contributor, Author)

Thank you all for your explanations, it's now clear to me.
I added it to the docs via a9803b2.
