sidecar: Do not crash when Object Storage is not accessible #7585
Comments
I think it is a valid issue. Help wanted.
…nos-io#7585 The goal is to allow sidecar to start and keep serving the Prometheus read path if objstore is not available at startup. Bucket creation will be attempted again on the next upload. This commit brings a new counter to alert on in case of a bucket initialization failure: `thanos_sidecar_upload_failures_total{reason="bucket_initialization"}` Signed-off-by: Amaury Decrême <amaury.decreme@gmail.com>
…nos-io#7585 This commit allows sidecar to continue to serve prometheus read path if objstore is not available at startup. Bucket creation will be attempted again on next upload. This commit brings a new metric to alert in case of bucket initialization crash: thanos_sidecar_shipper_up Signed-off-by: Amaury Decrême <amaury.decreme@gmail.com>
After a discussion with @MichaHoffmann, we came to realise that sidecar crashing can be useful for some users who rely on it to "detect" when something is wrong (like an uninitialised S3 bucket). While it was suggested to add a metric to alert on the situation, such situations could still go unnoticed. I suggest letting sidecar crash by default and adding an option that allows sidecar to continue serving the Prometheus read path even if the objstore is not working.
…7585 Signed-off-by: Amaury Decrême <amaury.decreme@gmail.com>
This commit allows sidecar to continue to serve prometheus read path if objstore is not available at startup. Bucket creation will be attempted again on next upload. This option is disabled by default and can be enabled by passing argument "--shipper.retry-init". This commit also brings a new metric to alert in case of bucket initialization crash: thanos_sidecar_shipper_up Signed-off-by: Amaury Decrême <amaury.decreme@gmail.com>
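To make the behaviour described in the commit message above easier to picture, here is a minimal Go sketch of the idea: try to create the bucket at startup, either fail hard (the default) or keep running, and retry bucket creation on the next upload. This is not the Thanos code. `LazyShipper`, `BucketFactory`, `memBucket` and `flakyFactory` are hypothetical names invented for illustration, the `retryInit` field only mirrors the intent of `--shipper.retry-init`, and the `Up` field is a plain integer standing in for a gauge such as `thanos_sidecar_shipper_up`.

```go
package main

import (
	"errors"
	"fmt"
)

// Bucket is a hypothetical stand-in for an object storage client.
type Bucket interface {
	Upload(block string) error
}

// BucketFactory creates a Bucket. It returns an error while storage is unreachable.
type BucketFactory func() (Bucket, error)

// LazyShipper is an illustrative type, not the Thanos shipper.
type LazyShipper struct {
	newBucket BucketFactory
	bucket    Bucket // nil until initialization succeeds
	retryInit bool   // assumption: mirrors the intent of --shipper.retry-init
	Up        int    // gauge stand-in: 1 once the bucket is initialized, else 0
}

// NewLazyShipper attempts bucket creation once at startup.
// With retryInit disabled, the error is returned so the caller can exit.
// With retryInit enabled, the shipper is returned anyway with Up == 0.
func NewLazyShipper(f BucketFactory, retryInit bool) (*LazyShipper, error) {
	s := &LazyShipper{newBucket: f, retryInit: retryInit}
	if err := s.init(); err != nil && !retryInit {
		return nil, fmt.Errorf("bucket initialization: %w", err)
	}
	return s, nil
}

// init tries to create the bucket and updates the gauge stand-in.
func (s *LazyShipper) init() error {
	b, err := s.newBucket()
	if err != nil {
		s.Up = 0
		return err
	}
	s.bucket, s.Up = b, 1
	return nil
}

// Sync uploads one block. If the bucket was never created, it retries
// creation first and skips the upload when that still fails.
func (s *LazyShipper) Sync(block string) error {
	if s.bucket == nil {
		if err := s.init(); err != nil {
			return fmt.Errorf("bucket still unavailable: %w", err)
		}
	}
	return s.bucket.Upload(block)
}

// memBucket is an in-memory Bucket used only for this example.
type memBucket struct{ blocks []string }

func (m *memBucket) Upload(block string) error {
	m.blocks = append(m.blocks, block)
	return nil
}

// flakyFactory returns a factory whose first `failures` calls fail,
// simulating an object storage outage at startup.
func flakyFactory(failures int) BucketFactory {
	calls := 0
	return func() (Bucket, error) {
		calls++
		if calls <= failures {
			return nil, errors.New("object storage unreachable (simulated)")
		}
		return &memBucket{}, nil
	}
}

func main() {
	// Storage is down at startup and during the first upload attempt.
	s, err := NewLazyShipper(flakyFactory(2), true)
	fmt.Println("startup error:", err, "up:", s.Up)
	fmt.Println("first sync error:", s.Sync("block-1"), "up:", s.Up)
	fmt.Println("second sync error:", s.Sync("block-1"), "up:", s.Up)
}
```

With the option disabled, `NewLazyShipper` returns the error and the process can exit as it does today, which keeps the "crash to detect misconfiguration" behaviour discussed above. With it enabled, alerting on the gauge being 0 is what replaces the crash as the signal.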
Is your proposal related to a problem?
Also related to objstore project.
We had a network outage accessing our storage endpoint. (DNS failure)
When sidecar restarted, it went into a crash loop with:
Since we use object storage for long-term metrics only, we would like sidecar to continue to serve the Prometheus read path and not crash.
Describe the solution you'd like
Could this error become a warning? We would then alert on a failure metric or similar instead of having sidecar crash.
Additional context
Thanos v0.35.0
ObjStore Azure