Docker Exit Code 11 for time and size Limit #549
This is intentional, so that automated tools have a way to know that the crawl was stopped because a limit was reached. Otherwise, it wouldn't be (easily) possible to tell whether the whole crawl stayed under the limit or not.
Okay, but why doesn't --limit (that is, the pageLimit) also produce exit code 11?
It should!
I think the rationale was that the page limit is considered part of the scoping of the crawl, i.e. what the user intended to capture, whereas the size and time limits are constraints that might result in the intended scope of the crawl not being reached. In the Browsertrix Cloud frontend we show a crawl that finished with a page limit as an unqualified success, whereas crawls that finished because they hit the size/time limit are successful but displayed a little differently, so users know a limit was hit before the intended scope was reached.
The reason also has to do with whether or not a crawl can be continued: with --timeLimit and --sizeLimit, the crawl is interrupted, but there are additional page URLs on the queue, so the crawl can be restarted (if save state is enabled, etc.) if desired. With --limit / --pageLimit, the implication is, as @tw4l said, that this is part of the crawl scope: there are no additional pages to crawl and nothing to continue, so exit code 0 makes sense. I realize this is slightly confusing. In the future, we could implement a pageLimit that works more like timeLimit and sizeLimit, still keeping URLs in the queue and allowing continuation. This was not done initially mostly for memory reasons, since there are potentially millions of pages that could be queued, and there is no guarantee that a user wants to continue the crawl beyond the initial limit. Hope this helps explain the rationale! Will close this for now; will make a new issue if we decide to explore a different type of page limit in the future.
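To illustrate how an automated wrapper might use this distinction, here is a minimal sketch. The function name and the printed messages are hypothetical; only the exit codes 0 and 11 come from the discussion above:

```shell
#!/bin/sh
# Hypothetical helper: map the crawler container's exit code to a follow-up action.
# Exit code 0  -> crawl finished its intended scope (including a --limit / --pageLimit).
# Exit code 11 -> crawl was interrupted by --timeLimit / --sizeLimit; URLs may remain
#                 on the queue, so it could be restarted if save state is enabled.
handle_crawl_exit() {
  case "$1" in
    0)  echo "complete" ;;       # nothing left on the queue
    11) echo "limit-reached" ;;  # consider restarting from saved state
    *)  echo "failed" ;;         # any other code: treat as an error
  esac
}

handle_crawl_exit 0   # prints "complete"
handle_crawl_exit 11  # prints "limit-reached"
```

In practice the argument would be `$?` captured right after the `docker run` that launched the crawl.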
Thank you all for your explanations, it's now clear to me.
Browsertrix version 1.0.4

The --limit parameter leads to Docker exit code 0 (that's also my expectation),
but the parameters:
--sizeLimit
--timeLimit
lead to Docker exit code 11.
From my point of view it should also be exit code 0, since Docker ends as expected/configured.