apm: Document sampling.tail.discard_on_write_failure config #1453

isaacaflores2 · 2025-05-22T00:08:43Z

⚠️ HOLD FOR 9.1 ⚠️
Document sampling.tail.discard_on_write_failure config.

I sourced the config explanation from here. Please let me know if the description is incorrect or unclear in any way.

Updated pages can be found in the docs preview here:

Checklist

Wait for PR apm: Document sampling.tail.ttl config #1269 to be merged and incorporate changes.

Related issues

Part of elastic/apm-server#15330

isaacaflores2 · 2025-05-22T00:12:59Z

reference/apm/cloud/apm-settings.md

@@ -53,6 +53,10 @@ If a setting is not supported by {{ech}}, you will get an error message when you
 Some settings that could break your cluster if set incorrectly are blocklisted. The following settings are generally safe in cloud environments. For detailed information about APM settings, check the [APM documentation](/solutions/observability/apm/configure-apm-server.md).
 ::::

+### Version 9.1+ [ec_version_9_1]


This config also applies to 8.19+, but I left it out based on @carsonip's comment in another PR. Let me know if I should add 8.19+.

@florent-leborgne @colleenmcginnis My initial plan was to backport this PR to 8.X branch for the 8.19 release (and change the versions from 9.1 to 8.19). But I just realized 8.19 is being released before 9.1.

Should I create a separate PR for 8.X? Or do you have any other suggestions? Thanks

Hey @isaacaflores2. Thanks for this PR. You would need a different PR anyway for the 8.19 docs because:

- the content is likely in a different repository (https://github.com/elastic/observability-docs)
- 8.x docs, including 8.19, are still powered by the asciidoc-based system, while docs for 9.0 and above, like this PR, are markdown-based.

I am happy to help if you need it.

Got it, thanks for sharing. I will start a PR for the 8.19 docs in the other repo. I'll reach out on Slack if I need any help.

carsonip

lgtm, a nit on config description. Please hold off from merging until 9.1 release

solutions/observability/apm/tail-based-sampling.md

solutions/observability/apm/transaction-sampling.md

florent-leborgne

Thanks for the addition! I left some minor-ish styling suggestions to align the wording with our writing guidelines.

florent-leborgne · 2025-05-26T07:20:23Z

reference/apm/cloud/apm-settings.md

@@ -53,6 +53,10 @@ If a setting is not supported by {{ech}}, you will get an error message when you
 Some settings that could break your cluster if set incorrectly are blocklisted. The following settings are generally safe in cloud environments. For detailed information about APM settings, check the [APM documentation](/solutions/observability/apm/configure-apm-server.md).
 ::::

+### Version 9.1+ [ec_version_9_1]


For other version sections, we specify that "These are all of the supported settings for this version:". If providing the full list is out of scope for this PR, is it possible to at least outline the changes? I assume `apm-server.sampling.tail.discard_on_write_failure` is a newly supported setting, but are there more changes, if you know?

Suggested change

-### Version 9.1+ [ec_version_9_1]
+### Version 9.1+ [ec_version_9_1]
+
+This {{stack}} version adds support for the following settings:

9.1 will have 2 more configs than 9.0: the one mentioned here, and another in #1269. I agree that explicitly mentioning that these are new configs on top of 9.0 would be useful.

On a side note, as a heads-up before we spend too much time polishing this doc: I'm also thinking of removing this doc altogether, since it isn't providing much value after being moved from cloud to apm: elastic/apm-server#13602

Thanks for the review, @florent-leborgne. I updated the section to specify that this stack version adds new configs.

solutions/observability/apm/configure-apm-server.md

florent-leborgne · 2025-05-26T07:26:57Z

reference/apm/cloud/apm-settings.md

+### Version 9.1+ [ec_version_9_1]
+
+`apm-server.sampling.tail.discard_on_write_failure`
+:   Defines the indexing behavior when trace events fail to be written to storage (e.g. when the storage limit is reached). When set to `false`, traces will bypass sampling and always be indexed, significantly increasing the indexing load. When set to `true`, traces will be discarded, there will be data loss potentially resulting in broken traces. The default is `false`. 


Suggested change

-: Defines the indexing behavior when trace events fail to be written to storage (e.g. when the storage limit is reached). When set to `false`, traces will bypass sampling and always be indexed, significantly increasing the indexing load. When set to `true`, traces will be discarded, there will be data loss potentially resulting in broken traces. The default is `false`.
+: Defines the indexing behavior when trace events fail to be written to storage (for example, when the storage limit is reached). When set to `false`, traces bypass sampling and are always indexed, which significantly increases the indexing load. When set to `true`, traces are discarded, causing data loss which can result in broken traces. The default is `false`.

Re-styling to present tense as per writing guidelines

Thanks, I updated all descriptions to use present tense.
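
For readers following the thread, here is a minimal sketch of how the setting described above might look in `apm-server.yml` for the APM Server binary. It assumes tail-based sampling is already turned on through the existing `sampling.tail.enabled` and `sampling.tail.policies` settings; the 10% sample rate is an arbitrary illustrative value, not something proposed in this PR.

```yaml
# Sketch only: assumes an APM Server binary on a version that supports
# the setting (9.1+ per this PR).
apm-server:
  sampling:
    tail:
      enabled: true
      policies:
        # Illustrative policy: keep roughly 10% of traces.
        - sample_rate: 0.1
      # Behavior when trace events fail to be written to storage
      # (for example, when the storage limit is reached):
      #   false (default): traces bypass sampling and are always indexed,
      #                    which significantly increases the indexing load.
      #   true:            traces are discarded, causing data loss that
      #                    can result in broken traces.
      discard_on_write_failure: false
```

Setting the value to `true` trades completeness for a bounded indexing load, so the choice depends on whether a spike in indexed traces or gaps in traces is the more acceptable failure mode for a given deployment.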

solutions/observability/apm/configure-apm-server.md

solutions/observability/apm/tail-based-sampling.md

solutions/observability/apm/transaction-sampling.md

github-actions · 2025-06-23T17:39:37Z

🔍 Preview links for changed docs:

🔔 The preview site may take up to 3 minutes to finish building. These links will become live once it completes.

florent-leborgne

LGTM.
Just a note that we might refactor this type of content into metadata-enriched tabs in the future to lay out these "per version" differences.

carsonip · 2025-07-04T17:45:53Z

solutions/observability/apm/configure-apm-server.md

+### Version 9.1+ [ec_version_9_1]
+This {{stack}} version adds support for the following settings:
+
+`apm-server.sampling.tail.discard_on_write_failure`
+:   Defines the indexing behavior when trace events fail to be written to storage (for example, when the storage limit is reached). When set to `false`, traces bypass sampling and are always indexed, which significantly increases the indexing load. When set to `true`, traces are discarded, causing data loss which can result in broken traces. The default is `false`.
+


Suggested change

### Version 9.1+ [ec_version_9_1]

This {{stack}} version adds support for the following settings:

`apm-server.sampling.tail.discard_on_write_failure`

: Defines the indexing behavior when trace events fail to be written to storage (for example, when the storage limit is reached). When set to `false`, traces bypass sampling and are always indexed, which significantly increases the indexing load. When set to `true`, traces are discarded, causing data loss which can result in broken traces. The default is `false`.

Can be removed thanks to elastic/apm-server#13602

carsonip · 2025-07-04T17:46:19Z

reference/apm/cloud/apm-settings.md

+### Version 9.1+ [ec_version_9_1]
+This {{stack}} version adds support for the following settings:
+
+`apm-server.sampling.tail.discard_on_write_failure`
+:   Defines the indexing behavior when trace events fail to be written to storage (for example, when the storage limit is reached). When set to `false`, traces bypass sampling and are always indexed, which significantly increases the indexing load. When set to `true`, traces are discarded, causing data loss which can result in broken traces. The default is `false`.


Suggested change

### Version 9.1+ [ec_version_9_1]

This {{stack}} version adds support for the following settings:

`apm-server.sampling.tail.discard_on_write_failure`

: Defines the indexing behavior when trace events fail to be written to storage (for example, when the storage limit is reached). When set to `false`, traces bypass sampling and are always indexed, which significantly increases the indexing load. When set to `true`, traces are discarded, causing data loss which can result in broken traces. The default is `false`.

Can be removed thanks to elastic/apm-server#13602

carsonip · 2025-07-04T17:46:41Z

solutions/observability/apm/tail-based-sampling.md

+
+|                              |                                          |
+|------------------------------|------------------------------------------|
+| APM Server binary            | `sampling.tail.discard_on_write_failure` |


apm: Document sampling.tail.discard_on_write_failure config

2ebbf56

isaacaflores2 requested review from a team as code owners May 22, 2025 00:08

github-actions bot deployed to docs-preview May 22, 2025 00:09

isaacaflores2 commented May 22, 2025

carsonip approved these changes May 22, 2025

solutions/observability/apm/tail-based-sampling.md Outdated

carsonip requested a review from colleenmcginnis May 22, 2025 09:38

isaacaflores2 added 2 commits May 22, 2025 13:42

apm: specify sampling bypass when discard_on_write_failure is false

76fadaa

apm: add discard_on_write_failure note to transaction-sampling.md

ad5d6c0

github-actions bot deployed to docs-preview May 22, 2025 20:44

isaacaflores2 commented May 22, 2025

solutions/observability/apm/transaction-sampling.md Outdated

carsonip approved these changes May 23, 2025

This was referenced May 23, 2025

[APM] Support sampling discard_on_write_failure configuration in apm integration policy elastic/kibana#221441

Closed

TBS: Document discard_on_write_failure + expose it to the APM Integration elastic/apm-server#15330

Open

florent-leborgne reviewed May 26, 2025

apm: update discard_on_write_failure descriptions to match writing guidelines

5f874ab

github-actions bot deployed to docs-preview June 23, 2025 17:39

isaacaflores2 mentioned this pull request Jun 27, 2025

[apm]: Document sampling.tail.discard_on_write_failure config elastic/observability-docs#4908

Open

10 tasks

Merge branch 'main' into tbs-config-discard-on-write

7041c81

github-actions bot deployed to docs-preview June 27, 2025 22:02

florent-leborgne approved these changes Jun 30, 2025

carsonip reviewed Jul 4, 2025

-| APM Server binary | `sampling.tail.discard_on_write_failure` |
+| APM Server binary | `apm-server.sampling.tail.discard_on_write_failure` |
