[8.19] (backport #17512) monitoring: update apm-server metrics collection to avoid conflicts #17525

Merged
merged 3 commits into 8.19 from mergify/bp/8.19/pr-17512
Jul 14, 2025

@mergify mergify bot commented Jul 8, 2025

Motivation/summary

Currently apm-server.sampling.tail.storage.lsm_size and apm-server.sampling.tail.dynamic_service_groups are never reported together.

Local testing shows the cause: both sets of metrics share the same namespace (metric name prefix), apm-server.sampling, but they are created using different instances of a Meter.

The related monitoring func calls addAPMServerMetrics multiple times, once per metric scope. Metric names that share a prefix across scopes appear to overwrite each other. This change instead collects all "apm-server" metrics and adds them to the snapshot once. An alternative would be to update elastic-agent-libs to prevent metrics from overwriting each other.
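To make the difference between the two collection strategies concrete, here is a minimal, self-contained sketch. It is not the apm-server or elastic-agent-libs code: the `metric` type and the `buildTree`, `perScope`, `collectOnce`, and `lookup` helpers are hypothetical, and the overwrite mechanism modeled here (each scope replacing a shared top-level key) is only an assumption about how such a conflict could arise, since the description above does not pin down the exact cause.

```go
package main

import (
	"fmt"
	"strings"
)

// metric is a hypothetical flattened metric: a dotted name plus a value.
type metric struct {
	name  string
	value int64
}

// buildTree expands dotted metric names into a nested map.
func buildTree(metrics []metric) map[string]any {
	root := map[string]any{}
	for _, m := range metrics {
		parts := strings.Split(m.name, ".")
		node := root
		for _, p := range parts[:len(parts)-1] {
			child, ok := node[p].(map[string]any)
			if !ok {
				child = map[string]any{}
				node[p] = child
			}
			node = child
		}
		node[parts[len(parts)-1]] = m.value
	}
	return root
}

// perScope models the assumed failure mode: every scope builds its own
// tree and replaces the shared top-level key in the snapshot.
func perScope(scopes [][]metric) map[string]any {
	snapshot := map[string]any{}
	for _, scope := range scopes {
		for key, subtree := range buildTree(scope) {
			snapshot[key] = subtree // a later scope replaces an earlier one
		}
	}
	return snapshot
}

// collectOnce gathers the metrics from all scopes first, then builds the
// snapshot a single time, so a shared prefix is merged instead of replaced.
func collectOnce(scopes [][]metric) map[string]any {
	var all []metric
	for _, scope := range scopes {
		all = append(all, scope...)
	}
	return buildTree(all)
}

// lookup reports whether a dotted metric name is present in the tree.
func lookup(tree map[string]any, name string) (int64, bool) {
	parts := strings.Split(name, ".")
	node := tree
	for _, p := range parts[:len(parts)-1] {
		child, ok := node[p].(map[string]any)
		if !ok {
			return 0, false
		}
		node = child
	}
	v, ok := node[parts[len(parts)-1]].(int64)
	return v, ok
}

func main() {
	scopes := [][]metric{
		{{name: "apm-server.sampling.tail.dynamic_service_groups", value: 0}},
		{{name: "apm-server.sampling.tail.storage.lsm_size", value: 7431}},
	}
	for label, tree := range map[string]map[string]any{
		"per scope":    perScope(scopes),
		"collect once": collectOnce(scopes),
	} {
		_, groups := lookup(tree, "apm-server.sampling.tail.dynamic_service_groups")
		_, lsm := lookup(tree, "apm-server.sampling.tail.storage.lsm_size")
		fmt.Printf("%s: dynamic_service_groups=%v lsm_size=%v\n", label, groups, lsm)
	}
}
```

In this model the per-scope strategy keeps only the metrics of the last scope written, while collecting once keeps both, which matches the symptom of the two metrics never being reported together.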

Checklist

  • View individual metrics documents to confirm reported metrics are correct

For functional changes, consider:

  • Is it observable through the addition of either logging or metrics?
  • Is its use being published in telemetry to enable product improvement?
  • Have system tests been added to avoid regression?

How to test these changes

  1. Unit tests cover the expected behavior

Manual test

  1. Run apm-server with TBS enabled
  2. Send data to the server: ./sendotlp -insecure -endpoint=http://localhost:8200 -secret-token=<token>
  3. View metrics via the stats endpoint (http://localhost:5066/stats?pretty). storage should be visible along with the other metrics under apm-server.sampling.tail:
"sampling": {
  "tail": {
    "dynamic_service_groups": 0,
    "events": {
      "processed": 3,
      "sampled": 6,
      "stored": 3
    },
    "storage": {
      "lsm_size": 7431,
      "value_log_size": 0
    }
  }
}
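
The manual check in step 3 could also be scripted. The sketch below fetches the stats endpoint named above and verifies that both metric groups are present. The `hasPath`, `checkTailMetrics`, and `run` helpers are hypothetical, and the code assumes that the "sampling" object sits under a top-level "apm-server" key in the response, as implied by the metric names in this description.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

// hasPath reports whether the nested JSON object contains the given key path.
func hasPath(doc map[string]any, path ...string) bool {
	var cur any = doc
	for _, key := range path {
		obj, ok := cur.(map[string]any)
		if !ok {
			return false
		}
		cur, ok = obj[key]
		if !ok {
			return false
		}
	}
	return true
}

// checkTailMetrics returns an error if either tail-sampling metric is missing.
// Assumption: the metrics live under "apm-server" -> "sampling" -> "tail".
func checkTailMetrics(body []byte) error {
	var doc map[string]any
	if err := json.Unmarshal(body, &doc); err != nil {
		return fmt.Errorf("decode stats: %w", err)
	}
	tail := []string{"apm-server", "sampling", "tail"}
	leaves := [][]string{{"dynamic_service_groups"}, {"storage", "lsm_size"}}
	for _, leaf := range leaves {
		path := append(append([]string{}, tail...), leaf...)
		if !hasPath(doc, path...) {
			return fmt.Errorf("missing metric %v", path)
		}
	}
	return nil
}

// run fetches the stats document and checks it.
func run(url string) error {
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return err
	}
	return checkTailMetrics(body)
}

func main() {
	// The URL is the stats endpoint from the manual test steps.
	if err := run("http://localhost:5066/stats"); err != nil {
		fmt.Fprintln(os.Stderr, "check failed:", err)
		os.Exit(1)
	}
	fmt.Println("storage and dynamic_service_groups are both reported")
}
```

Running it requires apm-server to be up with TBS enabled, as in step 1. Without a running server the program exits with a connection error.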

Related issues

Closes #17342
Alternate approach to #17427


This is an automatic backport of pull request #17512 done by [Mergify](https://mergify.com).

…17512)

* monitoring: update apm-server metrics collection to avoid conflicts

* refactor beats monitoring since there are no more global registries

* removed redundant temp slice from apm-server monitoring func

(cherry picked from commit f1c279b)

# Conflicts:
#	internal/beatcmd/beat_test.go
@mergify mergify bot added the backport and conflicts (There is a conflict in the backported pull request) labels Jul 8, 2025
@mergify mergify bot requested a review from a team as a code owner July 8, 2025 23:02

mergify bot commented Jul 8, 2025

Cherry-pick of f1c279b has failed:

On branch mergify/bp/8.19/pr-17512
Your branch is up to date with 'origin/8.19'.

You are currently cherry-picking commit f1c279b6.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   internal/beatcmd/beat.go

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   internal/beatcmd/beat_test.go

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

github-actions bot commented Jul 8, 2025

🤖 GitHub comments

Expand to view the GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@isaacaflores2 isaacaflores2 enabled auto-merge (squash) July 9, 2025 00:22

mergify bot commented Jul 9, 2025

This pull request has been removed from the queue for the following reason: checks failed.

The merge conditions cannot be satisfied due to failing checks:

You may have to fix your CI before adding the pull request to the queue again.
If you update this pull request, to fix the CI, it will automatically be requeued once the queue conditions match again.
If you think this was a flaky issue instead, you can requeue the pull request, without updating it, by posting a @mergifyio requeue comment.

@kruskall

@Mergifyio queue

mergify bot commented Jul 12, 2025

queue

🛑 The pull request has been removed from the queue default

The merge conditions cannot be satisfied due to failing checks.

You can take a look at Queue: Embarked in merge queue check runs for more details about the failure.

mergify bot commented Jul 14, 2025

This pull request has not been merged yet. Could you please review and merge it @isaacaflores2? 🙏

@kruskall

@Mergifyio update

mergify bot commented Jul 14, 2025

update

✅ Branch has been successfully updated

@kruskall kruskall removed the conflicts (There is a conflict in the backported pull request) label Jul 14, 2025
mergify bot added a commit that referenced this pull request Jul 14, 2025
@mergify mergify bot merged commit 6c690bd into 8.19 Jul 14, 2025
16 checks passed
@mergify mergify bot deleted the mergify/bp/8.19/pr-17512 branch July 14, 2025 15:09