Revert "[8.19] (backport #17154) tbs: Update storage metrics to be reported synchronously in the existing runDiskUsageLoop method" #17644

Merged
merged 1 commit into from
Jul 15, 2025

@isaacaflores2 isaacaflores2 commented Jul 14, 2025

These changes attempted to fix a reporting issue with the lsm_size metrics, which we believed was caused by the use of Observable OTEL metrics. The changes introduced a regression in which the lsm_size metrics were observed less frequently than before.

After updating metrics collection to remove conflicts (#17525), we determined that those conflicts, not the Observable OTEL metrics, were the root issue behind the infrequent lsm_size metrics. The changes in this PR fixed the regression mentioned above.

The prior implementation with Observable metrics is preferred, so we will revert #17275.

Draft PR:

  • Validate stats endpoint locally
  • Add manual test steps

How to test these changes

  1. Run apm-server with TBS enabled.
  2. Send data to the server: ./sendotlp -insecure -endpoint=http://localhost:8200 -secret-token=<token>
  3. View metrics via the stats endpoint (http://localhost:5066/stats?pretty). storage should be visible along with the other metrics under apm-server.sampling.tail.
  • Call the stats endpoint multiple times. All sets of metrics should be consistently visible.
"sampling": {
	"tail": {
		"dynamic_service_groups": 0,
		"events": {
			"processed": 3,
			"sampled": 6,
			"stored": 3
		},
		"storage": {
			"lsm_size": 7431,
			"value_log_size": 0
		}
	}
}
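
The repeated manual check in step 3 could be scripted roughly as follows. This is only a sketch, not part of the PR: the helper names `missing_storage_metrics` and `poll_stats` are hypothetical, and it assumes the stats payload nests the metrics under `apm-server` → `sampling` → `tail` → `storage`, as the description and sample output suggest.

```python
import json
import time
import urllib.request

# Metric names taken from the sample output above.
EXPECTED_STORAGE_KEYS = ("lsm_size", "value_log_size")


def missing_storage_metrics(stats: dict) -> list:
    """Return the expected storage metric names that are absent from a stats payload."""
    node = stats
    # Assumed nesting: apm-server -> sampling -> tail -> storage.
    for key in ("apm-server", "sampling", "tail", "storage"):
        if not isinstance(node, dict) or key not in node:
            return list(EXPECTED_STORAGE_KEYS)
        node = node[key]
    if not isinstance(node, dict):
        return list(EXPECTED_STORAGE_KEYS)
    return [name for name in EXPECTED_STORAGE_KEYS if name not in node]


def poll_stats(url: str, attempts: int = 10, delay_s: float = 1.0) -> bool:
    """Fetch the stats endpoint repeatedly; return True only if every response has the storage metrics."""
    all_present = True
    for attempt in range(1, attempts + 1):
        with urllib.request.urlopen(url) as resp:
            stats = json.load(resp)
        missing = missing_storage_metrics(stats)
        if missing:
            all_present = False
            print(f"attempt {attempt}: missing {missing}")
        time.sleep(delay_s)
    return all_present
```

With the config below, calling `poll_stats("http://localhost:5066/stats")` after sending data would be expected to return `True` if the storage metrics are reported consistently.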

apm server config

apm-server:
  host: "127.0.0.1:8200"

  sampling.tail:
    enabled: true
    policies:
      - sample_rate: 1.0

output.elasticsearch:
  hosts: ["https://cloud-url:443"]
  username: "elastic"
  password: "<REDACTED>"

http:
  enabled: true
  host: localhost
  port: 5066
  
monitoring.enabled: true
monitoring.elasticsearch:
  hosts: [ "https://cloud-url:443" ]
  username: "elastic"
  password: "<REDACTED>"  

@isaacaflores2 isaacaflores2 marked this pull request as ready for review July 14, 2025 23:51
@isaacaflores2 isaacaflores2 requested a review from a team as a code owner July 14, 2025 23:51
@isaacaflores2 isaacaflores2 merged commit d100231 into 8.19 Jul 15, 2025
16 checks passed
@isaacaflores2 isaacaflores2 deleted the revert-17275-mergify/bp/8.19/pr-17154 branch July 15, 2025 17:43